mirror of https://github.com/paboyle/Grid.git synced 2025-10-26 09:39:34 +00:00
Commit Graph

1536 Commits

Author SHA1 Message Date
paboyle
a762b1fb71 MPI3 working with a bounce through shared memory on my laptop.
Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the
send between ranks on same node.
2016-10-21 09:03:26 +01:00
paboyle
5b5925b8e5 Forgot to add 2016-10-20 17:09:40 +01:00
paboyle
b58adc6a4b commVector 2016-10-20 17:00:15 +01:00
paboyle
f9d5e95d72 allocator template typedefs moved to AlignedAllocator 2016-10-20 16:59:39 +01:00
paboyle
4f8e636a43 commVector 2016-10-20 16:59:16 +01:00
paboyle
9b39f35ae6 commVector different for SHMEM compat 2016-10-20 16:58:53 +01:00
paboyle
5fe2b85cbd MPI3 and shared memory support 2016-10-20 16:58:01 +01:00
paboyle
c7cccaaa69 Comm vector for shmem 2016-10-20 16:57:31 +01:00
paboyle
cbcfea466f MPI3 2016-10-20 16:57:14 +01:00
paboyle
4955672fc3 MPI3 2016-10-20 16:57:00 +01:00
paboyle
8c043da5b7 SHMEM and comms allocator made different 2016-10-20 16:56:05 +01:00
paboyle
3cbe974eb4 Layout 2016-10-20 16:55:21 +01:00
997fd882ff Merge branch 'develop' into feature/feynman-rules
# Conflicts:
#	lib/Threads.h
#	lib/qcd/action/fermion/WilsonFermion.cc
#	lib/qcd/action/fermion/WilsonFermion.h
#	lib/qcd/utils/SUn.h
#	lib/simd/Grid_avx.h
#	lib/simd/Intel512common.h
2016-10-19 18:35:18 +01:00
paboyle
7af9b87318 Cache face tables to improve performance.
Extract merge now looking poor.
2016-10-18 09:51:37 +01:00
paboyle
811ca45473 GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support 2016-10-17 16:23:21 +01:00
paboyle
bc1a4d40ba Faster integer handling; avoid push_back 2016-10-17 16:16:44 +01:00
paboyle
c8079e6621 Time the face gather in x-dir more carefully 2016-10-13 22:28:50 +01:00
azusayamaguchi
8b0d171c9a 32bit issue on the KNL code variant where byte offsets were stored 2016-10-12 17:49:32 +01:00
azusayamaguchi
8bbd9ebc27 Reversing changes to Stencil class 2016-10-12 13:47:20 +01:00
azusayamaguchi
6472b431f0 __rdpmc needed for gcc, clang++ 2016-10-12 12:29:08 +01:00
azusayamaguchi
bd205a3293 Fixing for non x86 and non KNL 2016-10-12 12:09:15 +01:00
azusayamaguchi
496beffa88 Fix non-KNL build 2016-10-12 12:06:08 +01:00
azusayamaguchi
9b63e97108 align not absolutely required and confuses clang++ 2016-10-12 11:51:21 +01:00
azusayamaguchi
81f2aeaece KNL streaming stores, and KNL performance counters 2016-10-12 11:45:22 +01:00
paboyle
2d4a45c758 Typecast pointer 2016-10-12 09:14:15 +01:00
paboyle
a123dcd7e9 Static required for shmem. Reading same object twice requires csum reset 2016-10-12 00:29:57 +01:00
paboyle
6b27c42dfe Cosmetic 2016-10-12 00:29:39 +01:00
paboyle
f7c2aa3ba5 runtime by default 2016-10-12 00:29:13 +01:00
paboyle
7240d73184 Parallelise the x faces; fix the segv on KNL with comms 2016-10-11 22:21:07 +01:00
paboyle
42cd148f5e Base pointer for comms buffer under AVX512 assembly 2016-10-11 16:06:06 +01:00
paboyle
6e01264bb7 don't use static by default 2016-10-11 10:03:39 +01:00
paboyle
6f408256bc FMA4 option moved on the align 2016-10-11 10:03:01 +01:00
paboyle
8d11681aac verbose remove 2016-10-10 23:50:42 +01:00
paboyle
3d5c9a1ee9 No compile fix on clang++ 3.9 2016-10-10 23:50:13 +01:00
paboyle
dc389e467c axpy_ssp for any coeff type via template 2016-10-10 23:48:05 +01:00
paboyle
3619167d62 Mass parameter 2016-10-10 23:47:33 +01:00
paboyle
96f1d1b828 Debugged Domain wall and Overlap feynman rules (infinite Ls, finite mass). 2016-10-10 23:46:45 +01:00
paboyle
657e0a8f4d Mass parameter 2016-10-10 23:46:10 +01:00
paboyle
616e7cd83e Mass parameter 2016-10-10 23:45:48 +01:00
paboyle
6f26d2e8d4 Overlap tree level feynman rule 2016-10-10 23:45:18 +01:00
paboyle
c014574504 A "please implement me" feynman rule. If this were abstract virtual it would
require/force implementation
2016-10-10 23:44:00 +01:00
paboyle
d7ce164e6e Feynman rule for DWF 2016-10-10 23:43:36 +01:00
paboyle
c0d5b99016 Dminus 2016-10-10 23:43:19 +01:00
paboyle
09ca32d678 Dminus added for Cayley 2016-10-10 23:42:55 +01:00
paboyle
082ae350c6 static schedule by default 2016-10-10 23:42:30 +01:00
Guido Cossu
611b5d74ba Fix for AVX+FMA3 compilation 2016-10-10 15:26:17 +01:00
Guido Cossu
b56c9ffa52 Fix for AVXFMA 2016-10-10 14:43:37 +01:00
cb02b7088f Merge branch 'develop' into feature/doxygen
# Conflicts:
#	configure.ac
2016-10-09 13:35:44 +01:00
Guido Cossu
2e453dfbf5 Added some instrumentation to benchmark the force computation 2016-10-06 17:52:45 +01:00
paboyle
4089984431 Timing hooks 2016-10-06 09:25:12 +01:00