portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2025-12-13 09:14:40 +00:00

Author	SHA1	Message	Date
Guido Cossu	ffedeb1c58	Minor modifications	2016-07-06 11:41:27 +01:00
paboyle	680645f849	Merge branch 'release/v0.5.0'	2016-06-30 15:15:03 -07:00
paboyle	712b9a3489	Asm only for avx512	2016-06-30 14:35:02 -07:00
paboyle	bdaa5b1767	Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.	2016-06-30 14:35:02 -07:00
paboyle	8fcefc021a	Improved the prefetching when using cache blocking codes	2016-06-30 14:35:02 -07:00
paboyle	05c884a62a	Prefetch change	2016-06-30 14:35:01 -07:00
paboyle	2d8bb4c594	Tweaks	2016-06-30 14:35:01 -07:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
Guido Cossu	5e02392f9c	Fixed compilation error for benchmark_dwf Some parts were assuming floating point precision	2016-06-20 12:30:51 +01:00
paboyle	87418e7df1	Slightly faster prefetching perf.	2016-06-13 02:32:52 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
Azusa Yamaguchi	d9408893b3	Prefetching in the normal kernel implementation.	2016-06-08 05:43:48 -07:00
paboyle	8ac021de73	Added a test and fixed it for red black precon Ls innermost vectorised DWF	2016-06-07 13:16:56 -07:00
paboyle	e503ef5590	Cleaned up	2016-06-07 00:11:36 +01:00
paboyle	a7682b0060	Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS	2016-06-06 23:48:21 +01:00
paboyle	53d06046b0	Compiling updates for KNL	2016-06-03 03:47:54 -07:00
paboyle	139cc5f1ae	Large change with KNL preparation	2016-06-03 03:24:26 -07:00
paboyle	5341977948	IMCI fixes. Thought I had committed these. The "real" disambiguation between std::real and Grid::real shouldn't have been necessary and I don't know why only the icpc v16.0 on babbage hits it. May need a longer term rename of Grid::real or some careful EnableIf work.	2016-04-30 03:34:16 -07:00
paboyle	1e554350ac	The threaded coms didn't agree with GCC. Surprised, and looks like GCC bug.	2016-04-29 16:49:18 -07:00
paboyle	c79ea0dcef	Fixing IMCI	2016-04-22 21:52:54 -07:00
paboyle	9b6ab6db16	simd in 5th dimension support	2016-04-19 15:38:01 -07:00
paboyle	806a83d38b	simd in fifth dim support for dwf	2016-04-19 15:36:19 -07:00
paboyle	b1192a8908	Benchmark_zmm added	2016-04-06 03:00:07 -07:00
paboyle	e8dddb1596	Adding extra benchmark	2016-04-06 10:32:54 +01:00
paboyle	e67fc2be18	Adding a trial for openmp overhead minimisation	2016-03-31 16:00:37 +01:00
paboyle	8052556275	Cleaning up the single/double kernel implementation switch	2016-03-31 14:51:32 +01:00
paboyle	60d965f79e	AVX512 improvements; sigfpe trapping too	2016-03-30 08:42:34 +01:00
paboyle	1ecbf9794d	Merge branch 'master' of https://github.com/paboyle/Grid	2016-03-30 08:37:55 +01:00
paboyle	c77b7ee897	AddSub based alternate SU3 routine	2016-03-28 17:55:22 -06:00
paboyle	1e355a51e1	Interface change	2016-03-27 23:46:55 -07:00
paboyle	21abaf7e91	Gamma sign change	2016-03-28 00:35:45 -06:00
paboyle	165bffc2e7	Avx512 changes for assembler kernels	2016-03-26 22:25:45 -06:00
paboyle	644fd6d32e	Build avx512 clean	2016-03-25 09:35:33 -07:00
paboyle	090e7aa930	Merge remote-tracking branch 'origin/chulwoo-dec12-2015' Merge Chulwoo's Lanczos related improvements. Merge Nd!=4 fixes for pure gauge HMC from Evan.	2016-03-08 09:55:14 +00:00
paboyle	325e745daa	Merge branch 'master' of https://github.com/paboyle/Grid	2016-03-02 07:04:03 -08:00
paboyle	61413565d0	Back off the inlined spin proj as not working	2016-03-02 07:03:09 -08:00
Antonin Portelli	497e7e4c53	BG/Q compatibility fix	2016-02-23 15:57:38 +00:00
Peter Boyle	6aeaf6f568	Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then turned up problems on the BlueWaters Cray. Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel, and also to look at aggregating bigger writes for the parallel write. Not sure what the home filesystem is.	2016-02-21 08:03:21 -06:00
paboyle	3425751cb8	Missing return value	2016-02-19 01:06:03 +00:00
Peter Boyle	22422a84d9	Small problem in compressor fix	2016-02-17 19:03:09 -06:00
Peter Boyle	c9fadf97a5	Simplify the compressor interface again.	2016-02-17 18:16:45 -06:00
Peter Boyle	81395e85d1	Regressing to not overlap comms and compute because bluewaters, edison, and cori are so rubbish at it.	2016-02-16 13:56:44 -06:00
Peter Boyle	a0fc47c6f9	Cheaper implementation	2016-02-15 16:02:36 -06:00
paboyle	e2f73e3ead	Updates for shmem	2016-02-10 16:50:32 -08:00
neo	6371676a75	Correcting some compilation errors for clang-sse	2016-02-10 11:37:03 +09:00
Jung	bd84c23298	definitions reconciled.	2016-01-25 16:30:59 -05:00
Jung	7aa8d5e8af	Failing to compile, comparing with master	2016-01-25 16:03:02 -05:00
Jung	6012b0ec23	Checking in changes before changing to chulwoo-dec12-2015	2016-01-25 09:40:58 -05:00
Jung	411ac49dd7	GparityWilsonTM typedef added. Not yet tested Conflicts: configure lib/qcd/action/fermion/WilsonKernels.h	2016-01-25 01:36:28 -05:00
Jung	5c57d4f403	Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2 Conflicts: lib/qcd/action/fermion/WilsonKernels.h	2016-01-11 11:36:45 -05:00

155 Commits