mirror of https://github.com/paboyle/Grid.git synced 2024-09-20 17:25:37 +01:00
Commit Graph

1606 Commits

Author SHA1 Message Date
a8843c9af6 Code cleaning, the fermion implementation can be specified using the macro FIMPL 2016-11-27 16:47:22 +09:00
7a1a7a685e Merge branch 'feature/fft-opt' into feature/hadrons 2016-11-27 15:32:03 +09:00
Lanny91
b18950f776 Added simd real divide test with QPX divide fixes 2016-11-25 13:21:33 +00:00
Lanny91
0acbf77bc6 Add QPX Div structure 2016-11-24 13:24:12 +00:00
5833f247fa more FFT optimisations 2016-11-24 09:09:48 +09:00
a2cffb0304 AVXFMA target fixed 2016-11-21 17:47:18 +01:00
97cddda49e Merge branch 'feature/gen-simd' into feature/doxygen
# Conflicts:
#	Makefile.am
#	configure.ac
2016-11-19 13:11:13 +01:00
b873504b90 fully generic SIMD 2016-11-19 01:32:39 +01:00
Guido Cossu
62749d05a6 Naming the scalar action 2016-11-17 12:26:20 +00:00
Guido Cossu
3834feb4b7 Adding action names 2016-11-16 16:46:49 +00:00
042ae5b87c generic 256bits SIMD 2016-11-15 12:16:15 +00:00
Guido Cossu
a783282b8b Merge branch 'develop' into feature/hmc_generalise 2016-11-10 18:13:07 +00:00
paboyle
604f0ea2f6 Merge branch 'develop' into release/v0.6.0 2016-11-09 04:13:01 -08:00
paboyle
33dc1f51b5 Final sign off commits from Cori-1 2016-11-09 04:11:03 -08:00
13a8997789 Merge branch 'release/v0.6.0' into feature/hadrons
# Conflicts:
#	Makefile.am
2016-11-08 20:43:39 +00:00
9576f0903d namespace fix 2016-11-08 19:07:47 +00:00
8a5e3a917c Merge branch 'develop' into release/v0.6.0
# Conflicts:
#	tests/core/Test_fft_gfix.cc
2016-11-08 16:53:42 +00:00
3d2a22a14d include fix for MKL 2016-11-08 15:31:47 +00:00
azusayamaguchi
f85b35314d Fix a routine for single node processor coor from rank 2016-11-08 11:49:13 +00:00
azusayamaguchi
0cff8754d1 Usecs 2016-11-08 11:35:41 +00:00
azusayamaguchi
692b44dac1 Merge branch 'develop' into release/v0.6.0 2016-11-04 22:48:11 +00:00
azusayamaguchi
96ba42a297 omm buf 2016-11-04 22:47:25 +00:00
azusayamaguchi
f7b60004f3 Merge branch 'develop' into release/v0.6.0 2016-11-04 16:08:07 +00:00
ad971ca07b fftw3.h is now expected to be an external header 2016-11-04 13:12:35 +00:00
f2f16eb972 fftw3.h removed, please don't commit this file back 2016-11-04 13:11:05 +00:00
azusayamaguchi
b7d55f7dfb Fix a typo in reorg of the --dslash-asm 2016-11-04 11:35:08 +00:00
azusayamaguchi
6e548a8ad5 Linux compile needed 2016-11-04 11:34:16 +00:00
a5dd4a9bab Merge branch 'feature/fft-opt' into develop 2016-11-03 14:34:46 +00:00
ec232af851 Photon.h references removed 2016-11-03 14:34:16 +00:00
17e30281e9 Merge branch 'develop' into feature/fft-opt
# Conflicts:
#	lib/FFT.h
2016-11-03 14:14:03 +00:00
aee44dc694 Photon.h removed from develop branch 2016-11-03 13:54:15 +00:00
75bbf6a0af Merge branch 'develop' into feature/feynman-rules 2016-11-03 13:52:11 +00:00
paboyle
111bfbc6bc notimestamp by default 2016-11-03 11:40:26 +00:00
paboyle
f41a230b32 Decrease mpi3l verbose 2016-11-02 19:54:03 +00:00
paboyle
c067051d5f Merge branch 'develop' into release/v0.6.0 2016-11-02 13:59:18 +00:00
paboyle
9e2ec2719b Merge branch 'develop' into feature/mpi3-master-slave 2016-11-02 13:02:56 +00:00
paboyle
757a928f9a Improvement to use own SHM_OPEN call to avoid openmpi bug. 2016-11-02 12:37:46 +00:00
Guido Cossu
bc248b6948 Merge branch 'release/v0.6.0' into feature/KNL_double_prec
Conflicts:
	lib/simd/Grid_avx512.h
2016-11-02 10:40:49 +00:00
Guido Cossu
ae8561892e Eliminating useless defines 2016-11-02 10:21:06 +00:00
paboyle
32375aca65 Semaphore sleep/wake up on remote processes. 2016-11-02 09:27:20 +00:00
paboyle
bb94ddd0eb Tidy up of mpi3; also some cleaning of the dslash controls. 2016-11-02 08:07:09 +00:00
James Harrison
7f0fc0eff5 Remove explicit use of double-precision types in photon.h 2016-11-01 16:02:35 +00:00
paboyle
791cb050c8 Comms improvements 2016-11-01 11:35:43 +00:00
d5e95bc350 Merge branch 'release/v0.6.0' into feature/feynman-rules 2016-10-31 18:36:21 +00:00
7a84906b5f Merge branch 'release/v0.6.0' into feature/fft-opt 2016-10-31 18:31:49 +00:00
66d832c733 FFTW header fix 2016-10-31 16:39:29 +00:00
e74417ca12 big build system polish 2016-10-31 16:31:27 +00:00
Guido Cossu
e8c3174ae2 Small change in the defines 2016-10-30 12:23:11 +00:00
Guido Cossu
9b066e94d0 Compilation with both single and double precision 2016-10-30 12:04:06 +00:00
James Harrison
618abdf302 Add missing volume factor in stochastic QED field 2016-10-29 11:04:02 +01:00
Guido Cossu
e1042aef77 First version of the double prec for testing purposes
It does not compile single and double version at the same time
2016-10-28 17:20:04 +01:00
paboyle
aa6a839c60 avx512 build fix; detect clang/gcc intrinsics vs. ICPC 2016-10-28 09:13:09 +01:00
b4d2af8c89 threaded FFT 2016-10-26 19:46:36 +01:00
434af6aeaa Merge branch 'develop' into feature/fft-opt 2016-10-26 18:50:38 +01:00
e90f8ac841 Merge branch 'develop' into feature/feynman-rules 2016-10-26 18:50:21 +01:00
a1705a8d53 debug message removed 2016-10-26 18:50:07 +01:00
ca21003f01 Merge branch 'feature/fft-opt' into feature/feynman-rules
# Conflicts:
#	lib/FFT.h
#	lib/qcd/action/fermion/WilsonFermion5D.h
#	tests/core/Test_fft.cc
2016-10-26 18:44:47 +01:00
14ddf2c234 more FFT optimisations 2016-10-26 17:36:26 +01:00
Guido Cossu
1d666771f9 Debugging the RNG, eliminate the barrier after broadcast 2016-10-26 16:08:23 +01:00
Guido Cossu
d50055cd96 Making the ILDG support optional 2016-10-26 09:48:01 +01:00
Azusa Yamaguchi
bca861e112 Note: FFT should be GridFFT (not changed yet).
Gauge fix with FFT is added (tests/core)
2016-10-25 14:21:48 +01:00
33d199a0ad temporary thread safety in FFT 2016-10-25 12:56:40 +01:00
paboyle
b820076b91 Merge branch 'develop' into feature/mpi3 2016-10-25 06:02:33 +01:00
paboyle
09f66100d3 MPI 3 compile on non-linux 2016-10-25 06:01:12 +01:00
azusayamaguchi
d7d92af09d Travis fail fix attempt 2016-10-25 01:45:53 +01:00
azusayamaguchi
460d0753a1 Merge branch 'develop' into feature/mpi3
Conflicts:
	lib/simd/Grid_avx512.h
2016-10-25 01:08:51 +01:00
azusayamaguchi
8f8058f8a5 More random bits on parallel seeding 2016-10-25 01:05:52 +01:00
azusayamaguchi
d97a27f483 Verbose 2016-10-25 01:05:31 +01:00
azusayamaguchi
7c3363b91e Compiles all comms targets 2016-10-25 00:04:17 +01:00
azusayamaguchi
b94478fa51 mpi, mpi3, shmem all compile.
mpi, mpi3 pass single node multi-rank
2016-10-24 23:45:31 +01:00
Guido Cossu
47c7159177 ILDG reader/writer works
Fill the xml header with the required information, todo.
2016-10-24 21:57:54 +01:00
13bf0482e3 FFT optimisation 2016-10-24 19:25:40 +01:00
a795b5705e memory optimisation 2016-10-24 19:25:15 +01:00
392e064513 fast local peek-poke 2016-10-24 19:24:21 +01:00
azusayamaguchi
b6a65059a2 Update to use shared memory to contain the stencil comms buffers
Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions
2016-10-24 17:30:43 +01:00
Guido Cossu
f415db583a Adding ILDG format 2016-10-24 15:48:22 +01:00
Guido Cossu
f55c16f984 Adding a barrier in the RNG save 2016-10-24 11:02:14 +01:00
azusayamaguchi
ea25a4d9ac Works 2016-10-23 06:10:05 +01:00
azusayamaguchi
c190221fd3 Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
Guido Cossu
df67e013ca More debug output for the RNG 2016-10-22 13:34:17 +01:00
Guido Cossu
3e990c9d0a Reverting the broadcast change 2016-10-22 13:26:43 +01:00
Guido Cossu
4b740fc8fd Debugging the RNG state save 2016-10-22 13:06:00 +01:00
azusayamaguchi
0fcd2e7188 Simplify the comms structure prior to implementing Shared memory direct bounce 2016-10-21 22:44:10 +01:00
azusayamaguchi
910b8dd6a1 use simd type 2016-10-21 22:35:29 +01:00
azusayamaguchi
75ebd3a0d1 Typo fixes and rotate for CLANG 2016-10-21 22:34:29 +01:00
Guido Cossu
cccd14b09e Small cleanup 2016-10-21 17:20:54 +01:00
Guido Cossu
e6acffdfc2 Fixing the plaquette computation 2016-10-21 16:06:34 +01:00
7c8f79b147 more stochastic QED fixes 2016-10-21 15:20:12 +01:00
azusayamaguchi
09fd5c43a7 Reasonably fast version 2016-10-21 15:17:39 +01:00
462921e549 QED: fix stochastic field 2016-10-21 14:41:08 +01:00
Guido Cossu
392130a537 Working on the 5d 2016-10-21 14:22:25 +01:00
azusayamaguchi
f22317748f Merge branch 'feature/mpi3' of https://github.com/paboyle/Grid into feature/mpi3 2016-10-21 13:36:35 +01:00
azusayamaguchi
6a9eae6b6b Reporting improvements 2016-10-21 13:36:18 +01:00
azusayamaguchi
fad96cf250 StencilBufs 2016-10-21 13:36:00 +01:00
azusayamaguchi
f331809c27 Use variable type for loop 2016-10-21 13:35:37 +01:00
bd6a228af6 Merge commit '20a091c3eddfdb67a82ece6413740a93650a2f98' into feature/feynman-rules 2016-10-21 13:10:30 +01:00
63d219498b first (dirty) implementation of Feynman stochastic EM field 2016-10-21 13:10:13 +01:00
paboyle
2c54a53d0a Compile verbose reduce 2016-10-21 12:12:14 +01:00
paboyle
306160ad9a bcopy threaded 2016-10-21 12:07:28 +01:00
azusayamaguchi
20a091c3ed Intel vs. Clang intrinsics differences absorbed 2016-10-21 09:08:36 +01:00
azusayamaguchi
202078eb1b Cray / OpenSHMEM ordering differs 2016-10-21 09:07:20 +01:00
paboyle
a762b1fb71 MPI3 working with a bounce through shared memory on my laptop.
Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the
send between ranks on same node.
2016-10-21 09:03:26 +01:00
Guido Cossu
deef2673b2 Separating the Lattice theories stub from the QCD.h file 2016-10-20 17:24:08 +01:00
paboyle
5b5925b8e5 Forgot to add 2016-10-20 17:09:40 +01:00
Guido Cossu
977b0a6dd9 Merge branch 'develop' into feature/hmc_generalise 2016-10-20 17:04:41 +01:00
Guido Cossu
977d844394 Few modifications on stdout messages 2016-10-20 17:01:59 +01:00
paboyle
b58adc6a4b commVector 2016-10-20 17:00:15 +01:00
paboyle
f9d5e95d72 allocator template typedefs moved to AlignedAllocator 2016-10-20 16:59:39 +01:00
paboyle
4f8e636a43 commVector 2016-10-20 16:59:16 +01:00
paboyle
9b39f35ae6 commVector different for SHMEM compat 2016-10-20 16:58:53 +01:00
paboyle
5fe2b85cbd MPI3 and shared memory support 2016-10-20 16:58:01 +01:00
paboyle
c7cccaaa69 Comm vector for shmem 2016-10-20 16:57:31 +01:00
paboyle
cbcfea466f MPI3 2016-10-20 16:57:14 +01:00
paboyle
4955672fc3 MPI3 2016-10-20 16:57:00 +01:00
paboyle
8c043da5b7 SHMEM and comms allocator made different 2016-10-20 16:56:05 +01:00
paboyle
3cbe974eb4 Layout 2016-10-20 16:55:21 +01:00
997fd882ff Merge branch 'develop' into feature/feynman-rules
# Conflicts:
#	lib/Threads.h
#	lib/qcd/action/fermion/WilsonFermion.cc
#	lib/qcd/action/fermion/WilsonFermion.h
#	lib/qcd/utils/SUn.h
#	lib/simd/Grid_avx.h
#	lib/simd/Intel512common.h
2016-10-19 18:35:18 +01:00
Guido Cossu
590675e2ca Csum in hex format 2016-10-19 17:26:25 +01:00
Guido Cossu
8c65bdf6d3 Printing checksum for the RNG file 2016-10-19 16:56:11 +01:00
Guido Cossu
74f1ed3bc5 Adding some documentation for HMC 2016-10-19 10:51:13 +01:00
paboyle
7af9b87318 Cache face tables to improve performance.
Extract merge now looking poor.
2016-10-18 09:51:37 +01:00
paboyle
811ca45473 GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support 2016-10-17 16:23:21 +01:00
paboyle
bc1a4d40ba Faster integer handling avoid push_back 2016-10-17 16:16:44 +01:00
Guido Cossu
e250e6b7bb Moving parameters outside of the HMCrunner 2016-10-14 17:22:32 +01:00
paboyle
c8079e6621 Time the face gather in x-dir more carefully 2016-10-13 22:28:50 +01:00
azusayamaguchi
8b0d171c9a 32bit issue on the KNL code variant where byte offsets were stored 2016-10-12 17:49:32 +01:00
azusayamaguchi
8bbd9ebc27 Reversing changes to Stencil class 2016-10-12 13:47:20 +01:00
azusayamaguchi
6472b431f0 __rdpmc needed for gcc, clang++ 2016-10-12 12:29:08 +01:00
azusayamaguchi
bd205a3293 Fixing for non x86 and non KNL 2016-10-12 12:09:15 +01:00
azusayamaguchi
496beffa88 Fix non-KNL build 2016-10-12 12:06:08 +01:00
azusayamaguchi
9b63e97108 align not absolutely required and confuses clang++ 2016-10-12 11:51:21 +01:00
azusayamaguchi
81f2aeaece KNL streaming stores, and KNL performance counters 2016-10-12 11:45:22 +01:00
paboyle
2d4a45c758 Typecast pointer 2016-10-12 09:14:15 +01:00
paboyle
a123dcd7e9 Static required for shmem. Reading same object twice requires csum reset 2016-10-12 00:29:57 +01:00
paboyle
6b27c42dfe Cosmetic 2016-10-12 00:29:39 +01:00
paboyle
f7c2aa3ba5 runtime by default 2016-10-12 00:29:13 +01:00
paboyle
7240d73184 Parallelise the x faces; fix the segv on KNL with comms 2016-10-11 22:21:07 +01:00
paboyle
42cd148f5e Base pointer for comms buffer under AVX512 assembly 2016-10-11 16:06:06 +01:00
Guido Cossu
eda4dd622e Some more edit 2016-10-11 15:45:20 +01:00
paboyle
6e01264bb7 don't use static by default 2016-10-11 10:03:39 +01:00
paboyle
6f408256bc FMA4 option moved on the align 2016-10-11 10:03:01 +01:00
paboyle
8d11681aac verbose remove 2016-10-10 23:50:42 +01:00
paboyle
3d5c9a1ee9 No compile fix on clang++ 3.9 2016-10-10 23:50:13 +01:00
paboyle
dc389e467c axpy_ssp for any coeff type via template 2016-10-10 23:48:05 +01:00
paboyle
3619167d62 Mass parameter 2016-10-10 23:47:33 +01:00
paboyle
96f1d1b828 Debugged Domain wall and Overlap feynman rules (infinite Ls, finite mass). 2016-10-10 23:46:45 +01:00
paboyle
657e0a8f4d Mass parameter 2016-10-10 23:46:10 +01:00
paboyle
616e7cd83e Mass parameter 2016-10-10 23:45:48 +01:00
paboyle
6f26d2e8d4 Overlap tree level feynman rule 2016-10-10 23:45:18 +01:00
paboyle
c014574504 A "please implement me" feynman rule. If this were abstract virtual it would
require/force implementation
2016-10-10 23:44:00 +01:00