portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-06-18 09:53:43 +01:00

Author	SHA1	Message	Date
paboyle	87418e7df1	Slightly faster prefetching perf.	2016-06-13 02:32:52 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
Azusa Yamaguchi	d9408893b3	Prefetching in the normal kernel implementation.	2016-06-08 05:43:48 -07:00
paboyle	8ac021de73	Added a test and fixed it for red black precon Ls innermost vectorised DWF	2016-06-07 13:16:56 -07:00
paboyle	e503ef5590	Cleaned up	2016-06-07 00:11:36 +01:00
paboyle	a7682b0060	Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS	2016-06-06 23:48:21 +01:00
paboyle	53d06046b0	Compiling updates for KNL	2016-06-03 03:47:54 -07:00
paboyle	139cc5f1ae	Large change with KNL preparation	2016-06-03 03:24:26 -07:00
paboyle	5341977948	IMCI fixes. Thought I had committed these. The "real" disambiguation between std::real and Grid::real shouldn't have been necessary and I don't know why only the icpc v16.0 on babbage hits it. May need a longer term rename of Grid::real or some careful EnableIf work.	2016-04-30 03:34:16 -07:00
paboyle	1e554350ac	The threaded coms didn't agree with GCC. Surprised, and looks like a GCC bug.	2016-04-29 16:49:18 -07:00
paboyle	c79ea0dcef	Fixing IMCI	2016-04-22 21:52:54 -07:00
paboyle	ba427abde9	simd 5d	2016-04-19 15:38:39 -07:00
paboyle	9b6ab6db16	simd in 5th dimension support	2016-04-19 15:38:01 -07:00
paboyle	806a83d38b	simd in fifth dim support for dwf	2016-04-19 15:36:19 -07:00
paboyle	b1192a8908	Benchmark_zmm added	2016-04-06 03:00:07 -07:00
paboyle	e8dddb1596	Adding extra benchmark	2016-04-06 10:32:54 +01:00
paboyle	e67fc2be18	Adding a trial for openmp overhead minimisation	2016-03-31 16:00:37 +01:00
paboyle	8052556275	Cleaning up the single/double kernel implementation switch	2016-03-31 14:51:32 +01:00
paboyle	60d965f79e	AVX512 improvements; sigfpe trapping too	2016-03-30 08:42:34 +01:00
paboyle	1ecbf9794d	Merge branch 'master' of https://github.com/paboyle/Grid	2016-03-30 08:37:55 +01:00
paboyle	c77b7ee897	AddSub based alternate SU3 routine	2016-03-28 17:55:22 -06:00
paboyle	1e355a51e1	Interface change	2016-03-27 23:46:55 -07:00
paboyle	21abaf7e91	Gamma sign change	2016-03-28 00:35:45 -06:00
paboyle	165bffc2e7	Avx512 changes for assembler kernels	2016-03-26 22:25:45 -06:00
paboyle	644fd6d32e	Build avx512 clean	2016-03-25 09:35:33 -07:00
paboyle	090e7aa930	Merge remote-tracking branch 'origin/chulwoo-dec12-2015' Merge Chulwoo's Lanczos related improvements. Merge Nd!=4 fixes for pure gauge HMC from Evan.	2016-03-08 09:55:14 +00:00
paboyle	325e745daa	Merge branch 'master' of https://github.com/paboyle/Grid	2016-03-02 07:04:03 -08:00
paboyle	61413565d0	Back off the inlined spin proj as not working	2016-03-02 07:03:09 -08:00
Antonin Portelli	497e7e4c53	BG/Q compatibility fix	2016-02-23 15:57:38 +00:00
Peter Boyle	6aeaf6f568	Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then turned up problems on the BlueWaters Cray. Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel, and also to look at aggregating bigger writes for the parallel write. Not sure what the home filesystem is.	2016-02-21 08:03:21 -06:00
Jung	9f0d9ade68	Added configure flag for LAPACK. Tested ImplicitlyRestartedLanczos::calc() Checking in before cleaning up	2016-02-20 02:50:32 -05:00
paboyle	3425751cb8	Missing return value	2016-02-19 01:06:03 +00:00
Peter Boyle	22422a84d9	Small problem in compressor fix	2016-02-17 19:03:09 -06:00
Peter Boyle	c9fadf97a5	Simplify the compressor interface again.	2016-02-17 18:16:45 -06:00
Peter Boyle	81395e85d1	Regressing to not overlap comms and compute because bluewaters, edison, and cori are so rubbish at it.	2016-02-16 13:56:44 -06:00
Peter Boyle	a0fc47c6f9	Cheaper implementation	2016-02-15 16:02:36 -06:00
paboyle	e2f73e3ead	Updates for shmem	2016-02-10 16:50:32 -08:00
neo	6371676a75	Correcting some compilation errors for clang-sse	2016-02-10 11:37:03 +09:00
Jung	bd84c23298	definitions reconciled.	2016-01-25 16:30:59 -05:00
Jung	7aa8d5e8af	Failing to compile, comparing with master	2016-01-25 16:03:02 -05:00
Jung	6012b0ec23	Checking in changes before changing to chulwoo-dec12-2015	2016-01-25 09:40:58 -05:00
Jung	411ac49dd7	GparityWilsonTM typedef added. Not yet tested Conflicts: configure lib/qcd/action/fermion/WilsonKernels.h	2016-01-25 01:36:28 -05:00
Jung	5c57d4f403	Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2 Conflicts: lib/qcd/action/fermion/WilsonKernels.h	2016-01-11 11:36:45 -05:00
paboyle	fc6ad65751	Pushed the overlap comms tweaks	2016-01-11 06:34:22 -08:00
paboyle	dafc74020c	Overlap comms compute improvements in hand op kernels, and better timing from Edison and Cori	2016-01-10 16:54:27 -08:00
paboyle	d19321dfde	Overlap comms compute changes	2016-01-10 19:20:16 +00:00
Jung	5924e5a562	Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2 Conflicts: configure lib/qcd/action/Actions.h lib/qcd/action/fermion/WilsonKernels.h	2016-01-06 03:44:57 -05:00
paboyle	c99d748da6	Timing reports in benchmarks now reflect the asynch comms thread statistics	2016-01-04 14:42:16 +00:00
paboyle	02452afd36	Optional overlap of comms with compute	2016-01-04 14:18:40 +00:00
paboyle	331768dcff	Added overlap comms compute mode	2016-01-03 01:38:11 +00:00

253 Commits