mirror of https://github.com/paboyle/Grid.git synced 2024-11-15 10:15:36 +00:00
Commit Graph

430 Commits

Author SHA1 Message Date
Lanny91
d2003f24f4 Corrected incorrect usage of ExtractSlice for conserved current code. 2017-04-26 17:25:28 +01:00
Peter Boyle
f8797e1e3e bug fix. works now and great face performance 2017-04-26 03:14:02 -04:00
Peter Boyle
fd1eb7de13 Clean implementation of the exterior faces listing only those points on the boundary 2017-04-26 02:34:52 -04:00
Peter Boyle
2ce898efa3 Pretty code 2017-04-26 02:34:25 -04:00
Lanny91
44260643f6 First conserved current implementation for Wilson fermions only. Not implemented for Gparity or 5D-vectorised Wilson fermions. 2017-04-25 18:00:24 +01:00
paboyle
ab66bac4e6 Think I'm getting on top of the reduced cost exterior precomputed list of links 2017-04-25 08:50:26 +01:00
paboyle
56277a11c8 Build a list of what's on the surface 2017-04-24 17:06:15 +01:00
Guido Cossu
752048f410 Merge branch 'develop' into feature/clover 2017-04-24 14:41:20 +01:00
Peter Boyle
5b55867a7a Slightly cheaper Ext assembly 2017-04-24 05:36:11 -04:00
Peter Boyle
3accb1ef89 Debugged assembly split phase with interior suppression 2017-04-23 19:30:19 -04:00
Peter Boyle
e3d0e31525 Debugged assembly split phase with interior suppression 2017-04-23 19:29:27 -04:00
Peter Boyle
5812eb8a8c Partially fixed. But the comms-overlap does not work yet. 2017-04-22 18:50:25 -04:00
paboyle
ac58565d0a Dangerous rewrite of the assembly. If I make a mistake the debug will be painful. 2017-04-22 19:31:04 +01:00
paboyle
b722889234 Try a better load balancing loop 2017-04-22 19:27:41 +01:00
paboyle
abba44a837 Hand unrolled for overlapped comms 2017-04-22 17:45:17 +01:00
paboyle
f301be94ce Fixed 2017-04-22 17:42:31 +01:00
Peter Boyle
1d1b225497 Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node). 2017-04-22 09:05:28 -04:00
Peter Boyle
53a785a3dd Fixing the KNL compile 2017-04-22 08:11:51 -04:00
paboyle
736bf3c866 Major rework of stencil. Half precision and MPI3 now working. 2017-04-22 11:33:50 +01:00
paboyle
fc4ab9ccd5 Working half precision comms 2017-04-20 11:20:26 +01:00
paboyle
4a340aa5ca Massive compressor rework to support reduced precision comms 2017-04-20 09:28:27 +01:00
Guido Cossu
b694996302 adding comments 2017-04-14 13:30:14 +01:00
a6a0da873f Merge branch 'feature/hadrons' into feature/qed-fvol 2017-04-13 15:31:06 +01:00
paboyle
42fb49d3fd Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-04-13 14:12:47 +01:00
8ef4300412 spurious .dirstamp files removed 2017-04-10 17:00:22 +01:00
paboyle
db5f6d3ae3 Verbose fix 2017-04-09 23:41:30 +09:00
paboyle
86aaa35294 Christoph needs SchurDiagTwoKappa which is mobius specific. 2017-04-07 11:07:40 +09:00
Guido Cossu
8c540333d5 Merge branch 'develop' into feature/hmc_generalise 2017-04-05 14:41:04 +01:00
Guido Cossu
6fd82228bf Working on the derivative 2017-04-05 10:51:44 +01:00
Guido Cossu
ca6efc685e Merge branch 'develop' into feature/clover 2017-04-04 10:19:02 +01:00
paboyle
1c4bc7ed38 Debugged staggered conventions 2017-03-31 14:41:48 +09:00
Guido Cossu
b8ae787b5e Correcting a simple typo 2017-03-30 11:33:15 +01:00
Guido Cossu
fbe2c3b5f9 Merge branch 'develop' into feature/clover 2017-03-30 11:18:31 +01:00
Guido Cossu
1ed69816b9 First steps for the force term 2017-03-30 11:14:27 +01:00
paboyle
9fd23faadf Pretty layout 2017-03-30 13:44:45 +09:00
paboyle
10e4fa0dc8 Template instantiation improvements 2017-03-30 13:44:25 +09:00
paboyle
c4aca1dde4 Conjugate coefficients on adjoint 2017-03-30 13:44:05 +09:00
paboyle
b9e8ea3aaa conjugate coefficient on the dagger 2017-03-30 13:43:13 +09:00
paboyle
077aa728b9 Fix the ZMobius (I think) 2017-03-30 13:42:09 +09:00
paboyle
a8d83d886e Macro controls 2017-03-30 13:31:34 +09:00
paboyle
7fd46eeec4 Trailing whitespace removal 2017-03-30 13:31:10 +09:00
paboyle
2b115929dc Small AVX512 asm ifdef patch 2017-03-29 18:51:23 +09:00
paboyle
d805867e02 Better init 2017-03-28 13:25:05 -04:00
paboyle
98f9318279 Build on AVX2 and MPI passing with clang++ 2017-03-28 23:16:04 +09:00
paboyle
4b17e8eba8 Merge branch 'develop' into feature/bgq-asm
Conflicts:
	lib/qcd/action/fermion/Fermion.h
	lib/qcd/action/fermion/WilsonFermion.cc
	lib/util/Init.cc
	tests/Test_cayley_even_odd_vec.cc
2017-03-28 04:49:30 -04:00
paboyle
18bde08d1b Merge branch 'feature/staggering' into develop 2017-03-28 15:25:55 +09:00
Guido Cossu
5e549ebd8b Adding force terms 2017-03-27 16:43:15 +09:00
Guido Cossu
fff484eca5 Populating Clover fermions methods 2017-03-27 15:12:57 +09:00
Guido Cossu
5fdc05782b More in the clover fermion class 2017-03-27 10:54:16 +09:00
Guido Cossu
a04eb7df5d Starting Clover term 2017-03-24 12:43:28 +09:00
paboyle
e7c36771ed ZMobius prep for asm 2017-03-15 14:23:33 -04:00
paboyle
8dc57a1e25 Layout change 2017-03-13 11:11:46 +00:00
paboyle
f57bd770b0 Merge branch 'bugfix/dminus' into feature/bgq-asm 2017-03-13 11:11:03 +00:00
Chulwoo Jung
33edde245d Changing Dminus(Dag) to use full vectors to work correctly 2017-03-12 23:02:42 -04:00
paboyle
447c5e6cd7 Z mobius hermiticity correction 2017-03-13 01:30:43 +00:00
paboyle
8b99d80d8c Merge branch 'bgq-asm-shmemfixes' into feature/bgq-asm 2017-03-12 23:30:09 +00:00
paboyle
af230a1fb8 Average the time across the whole machine for outliers 2017-02-28 17:05:22 -05:00
Christopher Kelly
06a132e3f9 Fixes to SHMEM comms 2017-02-28 13:31:54 -08:00
paboyle
e099dcdae7 Merge branch 'develop' into feature/bgq-asm 2017-02-23 00:25:29 +00:00
paboyle
4e7ab3166f Refactoring header layout 2017-02-22 18:09:33 +00:00
azusayamaguchi
1c30e9a961 Verified 2017-02-21 23:01:25 +00:00
azusayamaguchi
bf7e3f20d4 Staggered fermion optimised version 2017-02-21 14:35:42 +00:00
paboyle
3ae92fa2e6 Global changes to parallel_for structure.
Move the comms flags to more sensible names
2017-02-21 05:24:27 -05:00
Guido Cossu
e0571c872b Merge branch 'develop' into feature/hmc_generalise 2017-02-09 16:12:00 +00:00
paboyle
2c246551d0 Overlap comms and compute options in wilson kernels 2017-02-07 01:37:10 -05:00
a0cfbb6e88 Merge branch 'feature/gammas' into feature/hadrons
# Conflicts:
#	.gitignore
#	lib/qcd/spin/Dirac.cc
#	scripts/filelist
2017-01-30 09:10:49 -08:00
fad743fbb1 Build system sanity check: corrected several headers not in the <Grid/*> format 2017-01-26 17:00:41 -08:00
Guido Cossu
17629b8d9e Merge branch 'develop' into feature/hmc_generalise 2017-01-25 11:33:53 +00:00
a37e71f362 New automatic implementation of gamma matrices, Meson and SeqGamma are broken 2017-01-23 19:13:43 -08:00
Guido Cossu
27dfe816fa Added TwoFlavorsEO
Had to remove a conformability check in the Derivative of SchurDiff,
see the comments in the file
2017-01-20 16:59:31 +00:00
Peter Boyle
03c81bd902 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2016-12-27 11:25:35 +00:00
Peter Boyle
a869addef1 Stats switch off 2016-12-27 11:25:22 +00:00
Peter Boyle
3d21297bbb Call the fast path compressor for wilson kernels to avoid if else on projector 2016-12-27 11:23:13 +00:00
Peter Boyle
25efefc5b4 Back to original thread policy post test 2016-12-23 09:49:04 +00:00
Peter Boyle
eabf316ed9 BGQ performance ASM 2016-12-22 21:56:08 +00:00
Peter Boyle
04ae7929a3 BGQ or KNL assembler now 2016-12-22 17:53:22 +00:00
Peter Boyle
caba0d42a5 L1p controls 2016-12-22 17:52:55 +00:00
Peter Boyle
9ae81c06d2 L1p controls for BG/Q 2016-12-22 17:52:21 +00:00
Peter Boyle
b8cdb3e90a Debug hack; raises from 62GF/s to 72 GF/s per node on BG/Q 2016-12-22 17:50:14 +00:00
paboyle
3e6945cd65 Fixing AVX Z-mobius 2016-12-18 02:05:11 +00:00
paboyle
87be03006a AVX 512 code broke other compiles; fixing 2016-12-18 01:45:09 +00:00
Peter Boyle
4d8b01b7ed Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2016-12-18 00:56:57 +00:00
Peter Boyle
fa6acccf55 Zmobius asm 2016-12-18 00:56:19 +00:00
azusayamaguchi
df9108154d Debugged 2 versions of assembler; ls vectorised, xyzt vectorised 2016-12-17 23:47:51 +00:00
azusayamaguchi
b3e7f600da Partial implementation of 4d vectorisation assembler 2016-12-16 23:50:30 +00:00
azusayamaguchi
d4071daf2a Template specialise 2016-12-16 22:28:29 +00:00
azusayamaguchi
a2a6329094 AVX512 only for ASM compilation 2016-12-16 22:03:29 +00:00
azusayamaguchi
eabc577940 Assembler possibly working 2016-12-16 16:55:36 +00:00
91e98b1dd5 Merge branch 'feature/hadrons' into develop 2016-12-15 18:15:56 +00:00
Guido Cossu
2fb92dbc6e Cleaning up previous debug lines 2016-12-13 07:53:43 +00:00
Guido Cossu
5c74b6028b Commit for debugging, lot of IO 2016-12-13 06:35:30 +00:00
Azusa Yamaguchi
426197e446 Nc=3 2016-12-12 09:10:54 +00:00
Azusa Yamaguchi
99e2c1e666 Kernels options 2016-12-12 09:08:53 +00:00
Azusa Yamaguchi
1440565a10 Decrease verbosity 2016-12-12 09:08:04 +00:00
Peter Boyle
fe187e9ed3 Compiles and passes under ZMobius with assembler 2016-12-10 00:47:48 +00:00
Peter Boyle
0091b50f49 Zmobius working -- not asm yet 2016-12-09 22:51:32 +00:00
Peter Boyle
fb8d4b2357 Lots of debug on performance Mobius 2016-12-08 17:28:28 +00:00
Guido Cossu
2bd4233919 Completed testing of the HMC for Ls vectorised version (on AVX2) 2016-12-07 04:56:37 +00:00
Guido Cossu
143c70e29f Debugged the threaded version. Cleaning up 2016-12-07 04:40:25 +00:00
Guido Cossu
b812d5e39c Added single threaded version of the derivative for the Ls vectorised DWF 2016-12-06 16:31:13 +00:00
Peter Boyle
e27c6b217c Updating 2016-12-01 12:42:53 +00:00
paboyle
6adf35da54 Faster Mobius 2016-12-01 11:39:04 +00:00
paboyle
bd0430b34f Serialisation in malloc fixed 2016-11-29 22:27:55 +00:00
Azusa Yamaguchi
c097fd041a Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-11-29 13:44:17 +00:00
Azusa Yamaguchi
77fb25fb29 Push 5d tests 2016-11-29 13:43:56 +00:00
Azusa Yamaguchi
389e0a77bd Staggered Fermion 5D 2016-11-29 13:13:56 +00:00
Guido Cossu
ae9688e343 Reporting also the total mflops 2016-11-28 11:37:02 +00:00
fabcd4179d Hadrons: propagator type coming from the fermion implementation 2016-11-28 14:02:10 +09:00
Azusa Yamaguchi
668ca57702 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-11-22 13:49:11 +00:00
azusayamaguchi
f7b60004f3 Merge branch 'develop' into release/v0.6.0 2016-11-04 16:08:07 +00:00
azusayamaguchi
b7d55f7dfb Fix a typo in reorg of the --dslash-asm 2016-11-04 11:35:08 +00:00
Azusa Yamaguchi
ee686a7d85 Compiles now 2016-11-03 16:58:23 +00:00
Azusa Yamaguchi
1c5b7a6be5 Staggered phases first cut, c1, c2, u0 2016-11-03 16:26:56 +00:00
75bbf6a0af Merge branch 'develop' into feature/feynman-rules 2016-11-03 13:52:11 +00:00
paboyle
c067051d5f Merge branch 'develop' into release/v0.6.0 2016-11-02 13:59:18 +00:00
Guido Cossu
ae8561892e Eliminating useless defines 2016-11-02 10:21:06 +00:00
paboyle
bb94ddd0eb Tidy up of mpi3; also some cleaning of the dslash controls. 2016-11-02 08:07:09 +00:00
Azusa Yamaguchi
164d3691db Staggered 2016-11-01 14:24:22 +00:00
Guido Cossu
e8c3174ae2 Small change in the defines 2016-10-30 12:23:11 +00:00
Guido Cossu
9b066e94d0 Compilation with both single and double precision 2016-10-30 12:04:06 +00:00
Guido Cossu
e1042aef77 First version of the double prec for testing purposes
It does not compile single and double version at the same time
2016-10-28 17:20:04 +01:00
ca21003f01 Merge branch 'feature/fft-opt' into feature/feynman-rules
# Conflicts:
#	lib/FFT.h
#	lib/qcd/action/fermion/WilsonFermion5D.h
#	tests/core/Test_fft.cc
2016-10-26 18:44:47 +01:00
azusayamaguchi
c190221fd3 Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
azusayamaguchi
6a9eae6b6b Reporting improvements 2016-10-21 13:36:18 +01:00
bd6a228af6 Merge commit '20a091c3eddfdb67a82ece6413740a93650a2f98' into feature/feynman-rules 2016-10-21 13:10:30 +01:00
paboyle
b58adc6a4b commVector 2016-10-20 17:00:15 +01:00
997fd882ff Merge branch 'develop' into feature/feynman-rules
# Conflicts:
#	lib/Threads.h
#	lib/qcd/action/fermion/WilsonFermion.cc
#	lib/qcd/action/fermion/WilsonFermion.h
#	lib/qcd/utils/SUn.h
#	lib/simd/Grid_avx.h
#	lib/simd/Intel512common.h
2016-10-19 18:35:18 +01:00
azusayamaguchi
81f2aeaece KNL streaming stores, and KNL performance counters 2016-10-12 11:45:22 +01:00
paboyle
3619167d62 Mass parameter 2016-10-10 23:47:33 +01:00
paboyle
96f1d1b828 Debugged Domain wall and Overlap feynman rules (infinite Ls, finite mass). 2016-10-10 23:46:45 +01:00
paboyle
657e0a8f4d Mass parameter 2016-10-10 23:46:10 +01:00
paboyle
616e7cd83e Mass parameter 2016-10-10 23:45:48 +01:00
paboyle
6f26d2e8d4 Overlap tree level feynman rule 2016-10-10 23:45:18 +01:00
paboyle
c014574504 A "please implement me" feynman rule. If this were abstract virtual it would
require/force implementation
2016-10-10 23:44:00 +01:00
paboyle
d7ce164e6e Feynman rule for DWF 2016-10-10 23:43:36 +01:00
paboyle
c0d5b99016 Dminus 2016-10-10 23:43:19 +01:00
paboyle
09ca32d678 Dminus added for Cayley 2016-10-10 23:42:55 +01:00
Guido Cossu
b56c9ffa52 Fix for AVXFMA 2016-10-10 14:43:37 +01:00
Guido Cossu
2e453dfbf5 Added some instrumentation to benchmark the force computation 2016-10-06 17:52:45 +01:00
paboyle
4089984431 Timing hooks 2016-10-06 09:25:12 +01:00
Guido Cossu
c78bbd0f8c Fix ASM compilation 2016-10-04 15:37:32 +01:00
paboyle
b6713ecb60 Momentum space rules for Overlap, DWF untested to date 2016-09-26 09:39:09 +01:00
Guido Cossu
b6597b74e7 Added support for the Two index Symmetric and Antisymmetric representations
Tested for HMC convergence: OK
Added also a test file showing an example for mixed representations
2016-09-22 14:17:37 +01:00
Guido Cossu
b9c80318a2 Merge branch 'develop' into feature/hirep 2016-09-13 10:01:51 +01:00
Guido Cossu
f76f281e58 Cleaning files after fix 2016-09-09 11:34:25 +01:00
Guido Cossu
aa20cc8b52 Fixing compilation error with AVX512 flag 2016-09-09 02:58:52 -07:00
Guido Cossu
0fd179fb33 Merge branch 'develop' into feature/hirep 2016-09-01 12:59:53 +01:00
paboyle
b573d1f35a Wilson tree level added 2016-08-31 00:27:04 +01:00
paboyle
0c1d7e4daf Mom space prop for Wilson action 2016-08-31 00:26:36 +01:00
paboyle
02e983a0cd Momentum space prop and free prop convolution 2016-08-31 00:26:02 +01:00