mirror of https://github.com/paboyle/Grid.git synced 2026-04-19 18:21:02 +01:00
Commit Graph

1957 Commits

Author SHA1 Message Date
azusayamaguchi b6a65059a2 Update to use shared memory to contain the stencil comms buffers
Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions
2016-10-24 17:30:43 +01:00
azusayamaguchi ea25a4d9ac Works 2016-10-23 06:10:05 +01:00
azusayamaguchi c190221fd3 Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
azusayamaguchi 0fcd2e7188 Simplify the comms structure prior to implementing Shared memory direct bounces 2016-10-21 22:44:10 +01:00
azusayamaguchi 910b8dd6a1 use simd type 2016-10-21 22:35:29 +01:00
azusayamaguchi 75ebd3a0d1 Typo fixes and rotate for CLANG 2016-10-21 22:34:29 +01:00
azusayamaguchi 09fd5c43a7 Reasonably fast version 2016-10-21 15:17:39 +01:00
azusayamaguchi f22317748f Merge branch 'feature/mpi3' of https://github.com/paboyle/Grid into feature/mpi3 2016-10-21 13:36:35 +01:00
azusayamaguchi 6a9eae6b6b Reporting improvements 2016-10-21 13:36:18 +01:00
azusayamaguchi fad96cf250 StencilBufs 2016-10-21 13:36:00 +01:00
azusayamaguchi f331809c27 Use variable type for loop 2016-10-21 13:35:37 +01:00
paboyle 2c54a53d0a Compile verbose reduce 2016-10-21 12:12:14 +01:00
paboyle 306160ad9a bcopy threaded 2016-10-21 12:07:28 +01:00
paboyle a762b1fb71 MPI3 working with a bounce through shared memory on my laptop.
Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the
send between ranks on same node.
2016-10-21 09:03:26 +01:00
paboyle 5b5925b8e5 Forgot to add 2016-10-20 17:09:40 +01:00
paboyle b58adc6a4b commVector 2016-10-20 17:00:15 +01:00
paboyle f9d5e95d72 allocator template typedefs moved to AlignedAllocator 2016-10-20 16:59:39 +01:00
paboyle 4f8e636a43 commVector 2016-10-20 16:59:16 +01:00
paboyle 9b39f35ae6 commVector different for SHMEM compat 2016-10-20 16:58:53 +01:00
paboyle 5fe2b85cbd MPI3 and shared memory support 2016-10-20 16:58:01 +01:00
paboyle c7cccaaa69 Comm vector for shmem 2016-10-20 16:57:31 +01:00
paboyle cbcfea466f MPI3 2016-10-20 16:57:14 +01:00
paboyle 4955672fc3 MPI3 2016-10-20 16:57:00 +01:00
paboyle 39f1c880b8 mpi3 2016-10-20 16:56:40 +01:00
paboyle 8c043da5b7 SHMEM and comms allocator made different 2016-10-20 16:56:05 +01:00
paboyle 3cbe974eb4 Layout 2016-10-20 16:55:21 +01:00
paboyle 7af9b87318 Cache face tables to improve performance.
Extract merge now looking poor.
2016-10-18 09:51:37 +01:00
paboyle 811ca45473 GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support 2016-10-17 16:23:21 +01:00
paboyle bc1a4d40ba Faster integer handling avoid push_back 2016-10-17 16:16:44 +01:00
paboyle c8079e6621 Time the face gather in x-dir more carefully 2016-10-13 22:28:50 +01:00
azusayamaguchi 8b0d171c9a 32bit issue on the KNL code variant where byte offsets were stored 2016-10-12 17:49:32 +01:00
azusayamaguchi 1f293b76b4 Merge branch 'feature/knl-stats' into develop 2016-10-12 13:47:58 +01:00
azusayamaguchi 8bbd9ebc27 Reversing changes to Stencil class 2016-10-12 13:47:20 +01:00
azusayamaguchi 6472b431f0 __rdpmc needed for gcc, clang++ 2016-10-12 12:29:08 +01:00
azusayamaguchi bd205a3293 Fixing for non x86 and non KNL 2016-10-12 12:09:15 +01:00
azusayamaguchi 496beffa88 Fix non-KNL build 2016-10-12 12:06:08 +01:00
azusayamaguchi 9b63e97108 align not absolutely required and confuses clang++ 2016-10-12 11:51:21 +01:00
azusayamaguchi 81f2aeaece KNL streaming stores, and KNL performance counters 2016-10-12 11:45:22 +01:00
paboyle 2d4a45c758 Typecast pointer 2016-10-12 09:14:15 +01:00
paboyle 0f182f033b Drop macos with gcc 2016-10-11 22:29:06 +01:00
paboyle 7240d73184 Parallelise the x faces; fix the segv on KNL with comms 2016-10-11 22:21:07 +01:00
paboyle 42cd148f5e Base pointer for comms buffer under AVX512 assembly 2016-10-11 16:06:06 +01:00
Guido Cossu 611b5d74ba Fix for AVX+FMA3 compilation 2016-10-10 15:26:17 +01:00
Guido Cossu b56c9ffa52 Fix for AVXFMA 2016-10-10 14:43:37 +01:00
portelli 70c32fa49b Merge branch 'develop' of github.com:paboyle/Grid into develop 2016-10-09 12:55:46 +01:00
portelli 77c8a94dae AVXFMA4 flag fix for Intel Compiler 2016-10-09 12:55:12 +01:00
Guido Cossu 2e453dfbf5 Added some instrumentation to benchmark the force computation 2016-10-06 17:52:45 +01:00
paboyle 4089984431 Timing hooks 2016-10-06 09:25:12 +01:00
portelli 98439847cf configure portability fix 2016-10-05 14:57:20 +01:00
Guido Cossu c78bbd0f8c Fix ASM compilation 2016-10-04 15:37:32 +01:00