mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 09:45:36 +00:00
Commit Graph

1476 Commits

Author SHA1 Message Date
Guido Cossu
e1042aef77 First version of the double prec for testing purposes
It does not compile single and double version at the same time
2016-10-28 17:20:04 +01:00
paboyle
aa6a839c60 avx512 build fix; detect clang/gcc intrinsics vs. ICPC 2016-10-28 09:13:09 +01:00
b4d2af8c89 threaded FFT 2016-10-26 19:46:36 +01:00
434af6aeaa Merge branch 'develop' into feature/fft-opt 2016-10-26 18:50:38 +01:00
e90f8ac841 Merge branch 'develop' into feature/feynman-rules 2016-10-26 18:50:21 +01:00
a1705a8d53 debug message removed 2016-10-26 18:50:07 +01:00
ca21003f01 Merge branch 'feature/fft-opt' into feature/feynman-rules
# Conflicts:
#	lib/FFT.h
#	lib/qcd/action/fermion/WilsonFermion5D.h
#	tests/core/Test_fft.cc
2016-10-26 18:44:47 +01:00
14ddf2c234 more FFT optimisations 2016-10-26 17:36:26 +01:00
Azusa Yamaguchi
bca861e112 Note: FFT should be GridFFT (not changed yet).
Gauge fix with FFT is added (tests/core)
2016-10-25 14:21:48 +01:00
33d199a0ad temporary thread safety in FFT 2016-10-25 12:56:40 +01:00
paboyle
b820076b91 Merge branch 'develop' into feature/mpi3 2016-10-25 06:02:33 +01:00
paboyle
09f66100d3 MPI 3 compile on non-linux 2016-10-25 06:01:12 +01:00
azusayamaguchi
d7d92af09d Travis fail fix attempt 2016-10-25 01:45:53 +01:00
azusayamaguchi
460d0753a1 Merge branch 'develop' into feature/mpi3
Conflicts:
	lib/simd/Grid_avx512.h
2016-10-25 01:08:51 +01:00
azusayamaguchi
8f8058f8a5 More random bits on parallel seeding 2016-10-25 01:05:52 +01:00
azusayamaguchi
d97a27f483 Verbose 2016-10-25 01:05:31 +01:00
azusayamaguchi
7c3363b91e Compiles all comms targets 2016-10-25 00:04:17 +01:00
azusayamaguchi
b94478fa51 mpi, mpi3, shmem all compile.
mpi, mpi3 pass single node multi-rank
2016-10-24 23:45:31 +01:00
13bf0482e3 FFT optimisation 2016-10-24 19:25:40 +01:00
a795b5705e memory optimisation 2016-10-24 19:25:15 +01:00
392e064513 fast local peek-poke 2016-10-24 19:24:21 +01:00
azusayamaguchi
b6a65059a2 Update to use shared memory to contain the stencil comms buffers
Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions
2016-10-24 17:30:43 +01:00
azusayamaguchi
ea25a4d9ac Works 2016-10-23 06:10:05 +01:00
azusayamaguchi
c190221fd3 Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
azusayamaguchi
0fcd2e7188 Simplify the comms structure prior to implementing Shared memory direct bounces 2016-10-21 22:44:10 +01:00
azusayamaguchi
910b8dd6a1 use simd type 2016-10-21 22:35:29 +01:00
azusayamaguchi
75ebd3a0d1 Typo fixes and rotate for CLANG 2016-10-21 22:34:29 +01:00
7c8f79b147 more stochastic QED fixes 2016-10-21 15:20:12 +01:00
azusayamaguchi
09fd5c43a7 Reasonably fast version 2016-10-21 15:17:39 +01:00
462921e549 QED: fix stochastic field 2016-10-21 14:41:08 +01:00
azusayamaguchi
f22317748f Merge branch 'feature/mpi3' of https://github.com/paboyle/Grid into feature/mpi3 2016-10-21 13:36:35 +01:00
azusayamaguchi
6a9eae6b6b Reporting improvements 2016-10-21 13:36:18 +01:00
azusayamaguchi
fad96cf250 StencilBufs 2016-10-21 13:36:00 +01:00
azusayamaguchi
f331809c27 Use variable type for loop 2016-10-21 13:35:37 +01:00
bd6a228af6 Merge commit '20a091c3eddfdb67a82ece6413740a93650a2f98' into feature/feynman-rules 2016-10-21 13:10:30 +01:00
63d219498b first (dirty) implementation of Feynman stochastic EM field 2016-10-21 13:10:13 +01:00
paboyle
2c54a53d0a Compile verbose reduce 2016-10-21 12:12:14 +01:00
paboyle
306160ad9a bcopy threaded 2016-10-21 12:07:28 +01:00
azusayamaguchi
20a091c3ed Intel vs. Clang intrinsics differences absorbed 2016-10-21 09:08:36 +01:00
azusayamaguchi
202078eb1b Cray / OpenSHMEM ordering differs 2016-10-21 09:07:20 +01:00
paboyle
a762b1fb71 MPI3 working with a bounce through shared memory on my laptop.
Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the
send between ranks on same node.
2016-10-21 09:03:26 +01:00
paboyle
5b5925b8e5 Forgot to add 2016-10-20 17:09:40 +01:00
paboyle
b58adc6a4b commVector 2016-10-20 17:00:15 +01:00
paboyle
f9d5e95d72 allocator template typedefs moved to AlignedAllocator 2016-10-20 16:59:39 +01:00
paboyle
4f8e636a43 commVector 2016-10-20 16:59:16 +01:00
paboyle
9b39f35ae6 commVector different for SHMEM compat 2016-10-20 16:58:53 +01:00
paboyle
5fe2b85cbd MPI3 and shared memory support 2016-10-20 16:58:01 +01:00
paboyle
c7cccaaa69 Comm vector for shmem 2016-10-20 16:57:31 +01:00
paboyle
cbcfea466f MPI3 2016-10-20 16:57:14 +01:00
paboyle
4955672fc3 MPI3 2016-10-20 16:57:00 +01:00
paboyle
8c043da5b7 SHMEM and comms allocator made different 2016-10-20 16:56:05 +01:00
paboyle
3cbe974eb4 Layout 2016-10-20 16:55:21 +01:00
997fd882ff Merge branch 'develop' into feature/feynman-rules
# Conflicts:
#	lib/Threads.h
#	lib/qcd/action/fermion/WilsonFermion.cc
#	lib/qcd/action/fermion/WilsonFermion.h
#	lib/qcd/utils/SUn.h
#	lib/simd/Grid_avx.h
#	lib/simd/Intel512common.h
2016-10-19 18:35:18 +01:00
paboyle
7af9b87318 Cache face tables to improve performance.
Extract merge now looking poor.
2016-10-18 09:51:37 +01:00
paboyle
811ca45473 GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support 2016-10-17 16:23:21 +01:00
paboyle
bc1a4d40ba Faster integer handling avoid push_back 2016-10-17 16:16:44 +01:00
paboyle
c8079e6621 Time the face gather in x-dir more carefully 2016-10-13 22:28:50 +01:00
azusayamaguchi
8b0d171c9a 32bit issue on the KNL code variant where byte offsets were stored 2016-10-12 17:49:32 +01:00
azusayamaguchi
8bbd9ebc27 Reversing changes to Stencil class 2016-10-12 13:47:20 +01:00
azusayamaguchi
6472b431f0 __rdpmc needed for gcc, clang++ 2016-10-12 12:29:08 +01:00
azusayamaguchi
bd205a3293 Fixing for non x86 and non KNL 2016-10-12 12:09:15 +01:00
azusayamaguchi
496beffa88 Fix non-KNL build 2016-10-12 12:06:08 +01:00
azusayamaguchi
9b63e97108 align not absolutely required and confuses clang++ 2016-10-12 11:51:21 +01:00
azusayamaguchi
81f2aeaece KNL streaming stores, and KNL performance counters 2016-10-12 11:45:22 +01:00
paboyle
2d4a45c758 Typecast pointer 2016-10-12 09:14:15 +01:00
paboyle
a123dcd7e9 Static required for shmem. Reading same object twice requires csum reset 2016-10-12 00:29:57 +01:00
paboyle
6b27c42dfe Cosmetic 2016-10-12 00:29:39 +01:00
paboyle
f7c2aa3ba5 runtime by default 2016-10-12 00:29:13 +01:00
paboyle
7240d73184 Parallelise the x faces; fix the segv on KNL with comms 2016-10-11 22:21:07 +01:00
paboyle
42cd148f5e Base pointer for comms buffer under AVX512 assembly 2016-10-11 16:06:06 +01:00
paboyle
6e01264bb7 don't use static by default 2016-10-11 10:03:39 +01:00
paboyle
6f408256bc FMA4 option moved on the align 2016-10-11 10:03:01 +01:00
paboyle
8d11681aac verbose remove 2016-10-10 23:50:42 +01:00
paboyle
3d5c9a1ee9 No compile fix on clang++ 3.9 2016-10-10 23:50:13 +01:00
paboyle
dc389e467c axpy_ssp for any coeff type via template 2016-10-10 23:48:05 +01:00
paboyle
3619167d62 Mass parameter 2016-10-10 23:47:33 +01:00
paboyle
96f1d1b828 Debugged Domain wall and Overlap feynman rules (infinite Ls, finite mass). 2016-10-10 23:46:45 +01:00
paboyle
657e0a8f4d Mass parameter 2016-10-10 23:46:10 +01:00
paboyle
616e7cd83e Mass parameter 2016-10-10 23:45:48 +01:00
paboyle
6f26d2e8d4 Overlap tree level feynman rule 2016-10-10 23:45:18 +01:00
paboyle
c014574504 A "please implement me" feynman rule. If this were abstract virtual it would
require/force implementation
2016-10-10 23:44:00 +01:00
paboyle
d7ce164e6e Feynman rule for DWF 2016-10-10 23:43:36 +01:00
paboyle
c0d5b99016 Dminus 2016-10-10 23:43:19 +01:00
paboyle
09ca32d678 Dminus added for Cayley 2016-10-10 23:42:55 +01:00
paboyle
082ae350c6 static schedule by default 2016-10-10 23:42:30 +01:00
Guido Cossu
611b5d74ba Fix for AVX+FMA3 compilation 2016-10-10 15:26:17 +01:00
Guido Cossu
b56c9ffa52 Fix for AVXFMA 2016-10-10 14:43:37 +01:00
cb02b7088f Merge branch 'develop' into feature/doxygen
# Conflicts:
#	configure.ac
2016-10-09 13:35:44 +01:00
Guido Cossu
2e453dfbf5 Added some instrumentation to benchmark the force computation 2016-10-06 17:52:45 +01:00
paboyle
4089984431 Timing hooks 2016-10-06 09:25:12 +01:00
Guido Cossu
c78bbd0f8c Fix ASM compilation 2016-10-04 15:37:32 +01:00
536e2ff073 *.inc removed: please don't commit these files either! 2016-09-27 11:54:03 +01:00
paboyle
87acd06990 Use streaming stores 2016-09-26 10:11:34 +01:00
paboyle
9353b6edfe Fenv out of grid namespace 2016-09-26 10:09:13 +01:00
paboyle
167cc2650e GNU SOURCE problem on travis 2016-09-26 09:58:09 +01:00
paboyle
7089b6d5a5 Setting up but not implemented some QED rules 2016-09-26 09:43:40 +01:00
paboyle
2ba7d43ddd Divide handling 2016-09-26 09:43:14 +01:00
paboyle
836e929565 Divide handling improved 2016-09-26 09:42:22 +01:00
paboyle
b6713ecb60 Momentum space rules for Overlap, DWF untested to date 2016-09-26 09:39:09 +01:00
paboyle
52a39f0fcd Divide in ET 2016-09-26 09:38:38 +01:00