1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 01:35:36 +00:00
Commit Graph

43 Commits

Author SHA1 Message Date
Peter Boyle
ff2f559a57 Remove inline on gather optimised path 2016-12-27 17:45:19 +00:00
Peter Boyle
3d21297bbb Call the fast path compressor for wilson kernels to avoid if else on projector 2016-12-27 11:23:13 +00:00
Peter Boyle
83fa038bdf Streaming stores 2016-12-08 16:58:42 +00:00
azusayamaguchi
d7d92af09d Travis fail fix attempt 2016-10-25 01:45:53 +01:00
azusayamaguchi
b94478fa51 mpi, mpi3, shmem all compile.
mpi, mpi3 pass single node multi-rank
2016-10-24 23:45:31 +01:00
azusayamaguchi
b6a65059a2 Update to use shared memory to contain the stencil comms buffers
Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions
2016-10-24 17:30:43 +01:00
azusayamaguchi
ea25a4d9ac Works 2016-10-23 06:10:05 +01:00
azusayamaguchi
c190221fd3 Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
azusayamaguchi
0fcd2e7188 Simplify the comms structure prior to implementing Shared memory direct bouncs 2016-10-21 22:44:10 +01:00
paboyle
c7cccaaa69 Comm vector for shmem 2016-10-20 16:57:31 +01:00
paboyle
7af9b87318 Cache face tables to improve performance.
Extract merge now looking poor.
2016-10-18 09:51:37 +01:00
paboyle
bc1a4d40ba Faster integer handling avoid push_back 2016-10-17 16:16:44 +01:00
paboyle
c8079e6621 Time the face gateher in x-dir more carefully 2016-10-13 22:28:50 +01:00
azusayamaguchi
8b0d171c9a 32bit issue on the KNL code variant where byte offsets were stored 2016-10-12 17:49:32 +01:00
azusayamaguchi
8bbd9ebc27 Reversing changes to Stencil class 2016-10-12 13:47:20 +01:00
paboyle
2d4a45c758 Typecast pointer 2016-10-12 09:14:15 +01:00
paboyle
42cd148f5e Base pointer for comms buffer under AVX512 assembly 2016-10-11 16:06:06 +01:00
Guido Cossu
2e453dfbf5 Added some instrumentation to benchmark the force computation 2016-10-06 17:52:45 +01:00
paboyle
4089984431 Timing hooks 2016-10-06 09:25:12 +01:00
Antonin Portelli
0724f7af75 QPX single precision implementation 2016-09-19 18:09:12 +01:00
paboyle
a0676beeb1 Open up dependency on Eigen and FFTW 2016-07-07 22:31:07 +01:00
paboyle
bdaa5b1767 Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 14:35:02 -07:00
paboyle
8fcefc021a Improved the prefetching when using cache blocking codes 2016-06-30 14:35:02 -07:00
paboyle
139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
paboyle
1e554350ac The threaded coms didn't agree with GCC. Suprised, and looks like GCC bug. 2016-04-29 16:49:18 -07:00
paboyle
b27bac4669 Updates for simd in one dir 2016-04-19 15:34:10 -07:00
Peter Boyle
c9fadf97a5 Simplify the compressor interface again. 2016-02-17 18:16:45 -06:00
Peter Boyle
81395e85d1 Regressing to not overlap comms and compute becasue bluewaters, edison, and cori are so rubbish at it. 2016-02-16 13:56:44 -06:00
Peter Boyle
340a29b735 More careful sequencing of comms 2016-02-15 16:04:59 -06:00
Peter Boyle
42a9ac71d2 BUg fix, wait till complete. 2016-02-14 16:21:21 -06:00
Peter Boyle
41c2b09184 Shmem comms [NO MPI] target added. The dwf test runs and passes.
Not really shaken out to my satisfaction though as I want more tests done, so don't declare as working.
But committing my current while I try a few experimentals.
2016-02-14 14:24:38 -06:00
Peter Boyle
7f927a541c Shmem related fixes for shmem compile 2016-02-11 07:37:39 -06:00
paboyle
fc6ad65751 Pushed the overlap comms tweaks 2016-01-11 06:34:22 -08:00
paboyle
dafc74020c Overlap comms compute improvements in hand op kernels, and better timing from Edison and Cori 2016-01-10 16:54:27 -08:00
paboyle
d19321dfde Overlap comms compute changes 2016-01-10 19:20:16 +00:00
paboyle
c99d748da6 Timing reports in benchmarks now reflect the asynch comms thread statistics 2016-01-04 14:42:16 +00:00
paboyle
331768dcff Added overlap comms compute mode 2016-01-03 01:38:11 +00:00
paboyle
aae8bf31a7 Global edit adding copyright and license info to every source file. 2016-01-02 14:51:32 +00:00
paboyle
145a295231 Bug fix for stencil with large shifts (3+), would be important to naik term for example but did not
impact Wilson based nearest neighbour stencils.
2015-12-30 19:29:48 +00:00
Peter Boyle
880ff88362 Comms optimisation 2015-11-06 05:22:18 -06:00
Peter Boyle
ec5af35166 EO bug fix when spread out in x-direction 2015-11-04 09:56:58 +00:00
Peter Boyle
7d3512ab21 Gparity valence test now working.
Interface in FermionOperator will change a lot in future
2015-08-14 00:01:04 +01:00
Peter Boyle
1d0df449e8 Reorganise of file naming 2015-06-03 12:47:05 +01:00