1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-04-12 23:20:45 +01:00

23 Commits

Author SHA1 Message Date
Christoph Lehner
2bb374daea hip-friendly 2021-03-19 11:33:23 +01:00
Christoph Lehner
f0dc0f3621 fix compile issue on Qpace3 2020-08-22 13:57:33 +02:00
Christoph Lehner
dbaa24ebf6 further GPU memory access fixes (with this GPT passes all single-rank tests on non-summit GPUs) 2020-08-13 16:14:15 +02:00
Peter Boyle
b949cf6b12 PeekLocal needs a view to keep thread safe.
ALLOCATION_CACHEE reenable
2020-06-19 17:13:27 -04:00
Christoph Lehner
b5e87e8d97 summit compile fixes 2020-06-12 18:16:12 -04:00
Peter Boyle
a7ffc61e82 acceleratorSIMTlane() 2020-06-10 19:58:33 -04:00
Peter Boyle
cdf0a04fc5 Merge branch 'develop' into sycl 2020-06-09 04:00:12 -04:00
Peter Boyle
1a4c8c3387 Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. 2020-06-05 18:52:35 -04:00
Peter Boyle
7860a50f70 Make view specify where and drive data motion - first cut.
This is a compile tiime option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Christoph Lehner
e9b295f967 Synchronize blocking infrastructure with GPT 2020-05-06 08:42:28 -04:00
Peter Boyle
6cdb09c884 Faster copy region 2020-04-10 11:10:52 -04:00
Peter Boyle
68b45f6444 Lower left/upper right region cut paste 2020-02-06 15:50:26 -05:00
Peter Boyle
1bd87c35d7 Read coalescing on Nvidia 2020-01-27 12:29:56 -05:00
Peter Boyle
9aafd20468 Simple block project promote runs faster on GPU 2019-12-17 05:01:39 -05:00
Peter Boyle
9e15474999 Accelerator loop attempt at speed up 2019-12-14 05:28:16 -05:00
Peter Boyle
152b525a4d Typo fix 2019-12-13 22:44:42 -05:00
Peter Boyle
d18994eddc offload more of mgrid to GPU 2019-12-13 22:08:11 -05:00
Peter Boyle
6b692aa726 Thread loops 2019-06-15 08:02:26 +01:00
Peter Boyle
715babeac8 GPU reductions first cut; use thrust, non-reproducible. Inclusive scan can fix this if desired.
Local reduction to LatticeComplex and then further reduction.
2019-01-01 13:53:37 +00:00
Peter Boyle
b57a4d32aa Merge branch 'develop' into feature/gpu-port 2018-12-13 05:11:34 +00:00
Peter Boyle
33a0bbb17b Const correctness 2018-11-19 11:27:57 +00:00
Peter Boyle
e9b6f58fdc Allow shrinking machine in orthog direction for extract slice local 2018-11-07 23:39:18 +00:00
fb7d021b9d Hadrons: moving Hadrons to root directory, build system improvements 2018-08-28 15:00:40 +01:00