portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2024-11-13 01:05:36 +00:00

Author	SHA1	Message	Date
paboyle	f68b5de9c8	No compile fix on Clang	2017-08-25 19:35:21 +01:00
Peter Boyle	c3b1263e75	Benchmark prep	2017-08-25 09:25:54 +01:00
paboyle	a446d95c33	Trying to pass TeamCity and Travis	2017-08-20 01:10:50 +01:00
Peter Boyle	14d53e1c9e	Threaded MPI calls patches	2017-07-29 13:08:10 -04:00
paboyle	54e94360ad	Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit	2017-06-24 23:10:24 +01:00
Guido Cossu	3344788fa1	Merge branch 'develop' into feature/hmc_generalise	2017-05-01 12:13:56 +01:00
Peter Boyle	99220f6531	Fixes and better timing	2017-04-26 17:24:11 -04:00
Peter Boyle	fd1eb7de13	Clean implementation of the exterior faces listing only those points on the boundary	2017-04-26 02:34:52 -04:00
paboyle	ab66bac4e6	Think I'm getting on top of the reduced cost exterior precomputed list of links	2017-04-25 08:50:26 +01:00
paboyle	56277a11c8	Build a list of what's on the surface	2017-04-24 17:06:15 +01:00
Peter Boyle	e3d0e31525	Debugged assembly split phase with interior suppression	2017-04-23 19:29:27 -04:00
paboyle	b722889234	Try a better load balancing loop	2017-04-22 19:27:41 +01:00
paboyle	736bf3c866	Major rework of stencil. Half precision and MPI3 now working.	2017-04-22 11:33:50 +01:00
paboyle	fc4ab9ccd5	Working half precision comms	2017-04-20 11:20:26 +01:00
paboyle	4a340aa5ca	Massive compressor rework to support reduced precision comms	2017-04-20 09:28:27 +01:00
Guido Cossu	8c540333d5	Merge branch 'develop' into feature/hmc_generalise	2017-04-05 14:41:04 +01:00
paboyle	4b17e8eba8	Merge branch 'develop' into feature/bgq-asm Conflicts: lib/qcd/action/fermion/Fermion.h lib/qcd/action/fermion/WilsonFermion.cc lib/util/Init.cc tests/Test_cayley_even_odd_vec.cc	2017-03-28 04:49:30 -04:00
paboyle	18bde08d1b	Merge branch 'feature/staggering' into develop	2017-03-28 15:25:55 +09:00
paboyle	af230a1fb8	Average the time across the whole machine for outliers	2017-02-28 17:05:22 -05:00
paboyle	e099dcdae7	Merge branch 'develop' into feature/bgq-asm	2017-02-23 00:25:29 +00:00
paboyle	4e7ab3166f	Refactoring header layout	2017-02-22 18:09:33 +00:00
paboyle	3ae92fa2e6	Global changes to parallel_for structure. Move the comms flags to more sensible names	2017-02-21 05:24:27 -05:00
Guido Cossu	e0571c872b	Merge branch 'develop' into feature/hmc_generalise	2017-02-09 16:12:00 +00:00
paboyle	2c246551d0	Overlap comms and compute options in wilson kernels	2017-02-07 01:37:10 -05:00
Antonin Portelli	a0cfbb6e88	Merge branch 'feature/gammas' into feature/hadrons # Conflicts: # .gitignore # lib/qcd/spin/Dirac.cc # scripts/filelist	2017-01-30 09:10:49 -08:00
Antonin Portelli	fad743fbb1	Build system sanity check: corrected several headers not in the <Grid/*> format	2017-01-26 17:00:41 -08:00
Guido Cossu	17629b8d9e	Merge branch 'develop' into feature/hmc_generalise	2017-01-25 11:33:53 +00:00
Antonin Portelli	a37e71f362	New automatic implementation of gamma matrices, Meson and SeqGamma are broken	2017-01-23 19:13:43 -08:00
Peter Boyle	03c81bd902	Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm	2016-12-27 11:25:35 +00:00
Peter Boyle	a869addef1	Stats switch off	2016-12-27 11:25:22 +00:00
Peter Boyle	3d21297bbb	Call the fast path compressor for wilson kernels to avoid if else on projector	2016-12-27 11:23:13 +00:00
Peter Boyle	25efefc5b4	Back to original thread policy post test	2016-12-23 09:49:04 +00:00
Peter Boyle	b8cdb3e90a	Debug hack; raises from 62GF/s to 72 GF/s per node on BG/Q	2016-12-22 17:50:14 +00:00
azusayamaguchi	eabc577940	Assembler possibly working	2016-12-16 16:55:36 +00:00
Peter Boyle	fb8d4b2357	Lots of debug on performance Mobius	2016-12-08 17:28:28 +00:00
Guido Cossu	143c70e29f	Debugged the threaded version. Cleaning up	2016-12-07 04:40:25 +00:00
Guido Cossu	b812d5e39c	Added single threaded version of the derivative for the Ls vectorised DWF	2016-12-06 16:31:13 +00:00
Guido Cossu	ae9688e343	Reporting also the total mflops	2016-11-28 11:37:02 +00:00
Antonin Portelli	ca21003f01	Merge branch 'feature/fft-opt' into feature/feynman-rules # Conflicts: # lib/FFT.h # lib/qcd/action/fermion/WilsonFermion5D.h # tests/core/Test_fft.cc	2016-10-26 18:44:47 +01:00
azusayamaguchi	c190221fd3	Internal SHM comms in non-simd directions working Need to fix simd directions	2016-10-22 18:14:27 +01:00
azusayamaguchi	6a9eae6b6b	Reporting improvements	2016-10-21 13:36:18 +01:00
Antonin Portelli	997fd882ff	Merge branch 'develop' into feature/feynman-rules # Conflicts: # lib/Threads.h # lib/qcd/action/fermion/WilsonFermion.cc # lib/qcd/action/fermion/WilsonFermion.h # lib/qcd/utils/SUn.h # lib/simd/Grid_avx.h # lib/simd/Intel512common.h	2016-10-19 18:35:18 +01:00
azusayamaguchi	81f2aeaece	KNL streaming stores, and KNL performance counters	2016-10-12 11:45:22 +01:00
paboyle	96f1d1b828	Debugged Domain wall and Overlap feynman rules (infinite Ls, finite mass).	2016-10-10 23:46:45 +01:00
Guido Cossu	2e453dfbf5	Added some instrumentation to benchmark the force computation	2016-10-06 17:52:45 +01:00
paboyle	4089984431	Timing hooks	2016-10-06 09:25:12 +01:00
paboyle	b6713ecb60	Momentum space rules for Overlap, DWF untested to date	2016-09-26 09:39:09 +01:00
paboyle	48fb1cdc11	Update domain 5d vectorised impl type, move the type over to 4d redblack with the dense OO inverse	2016-07-14 23:54:35 +01:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00


100 Commits