Peter Boyle
|
3dbfce5223
|
Tests clean build on HIP
|
2022-11-16 20:15:51 -05:00 |
|
Peter Boyle
|
e51eaedc56
|
Making tests compile
|
2022-11-15 22:58:30 -05:00 |
|
Peter Boyle
|
6209120de9
|
Fix to GPU compile attempt
|
2022-11-15 17:25:58 -05:00 |
|
Peter Boyle
|
e2e269e03b
|
Partial dirichlet BCs
|
2022-11-15 16:24:26 -05:00 |
|
Peter Boyle
|
551a5f8dc8
|
RRII gpu option
|
2022-10-11 14:44:55 -04:00 |
|
Peter Boyle
|
5abb19eab0
|
Remove self timing
|
2022-08-31 17:32:49 -04:00 |
|
Peter Boyle
|
75bb6b2b40
|
Move barrier into the StencilSend begin routine
|
2022-08-04 13:35:26 -04:00 |
|
Peter Boyle
|
a93d5459d4
|
Better mpi request completion
|
2022-07-28 12:18:35 -04:00 |
|
Peter Boyle
|
58182fe345
|
Different approach to default dirichlet params
|
2022-07-10 21:32:58 +01:00 |
|
Peter Boyle
|
e762c940c2
|
Reduce the loop over exterior for GPU to indirection table
|
2022-06-01 14:29:25 -07:00 |
|
Peter Boyle
|
34faa39f4f
|
Clean up Dirichlet. Big oops fix
|
2022-05-28 17:18:08 -07:00 |
|
Peter Boyle
|
4f997c5f04
|
Remove extra face kernels in Dirichlet
|
2022-05-25 11:15:25 -07:00 |
|
Peter Boyle
|
e651b9e7ab
|
Clean up stencil with better intranode Dirichlet / DDHMC support.
14TF/s on a Perlmutter node
|
2022-05-24 18:23:39 -07:00 |
|
Peter Boyle
|
f82ce67624
|
Dirichlet improved
|
2022-05-19 19:17:11 -07:00 |
|
Peter Boyle
|
5340e50427
|
HMC running with new formulation
|
2022-03-01 17:10:25 -05:00 |
|
Peter Boyle
|
0f1c5b08a1
|
Dirichlet filters running on AMD and now integrated in Fermion op
|
2022-02-23 19:29:28 -05:00 |
|
Peter Boyle
|
aab3bcb46f
|
Dirichlet first cut - wrong answers on dagger multiply.
Struggling to get a compute node so changing systems
|
2022-02-22 19:58:33 +00:00 |
|
Peter Boyle
|
e8b1251b8c
|
Staggered fix finished
|
2022-02-17 04:51:13 +00:00 |
|
Peter Boyle
|
fad5a74a4b
|
Bug fix to detection case
|
2022-02-15 10:27:39 -05:00 |
|
Azusa Yamaguchi
|
6283d11d50
|
Add a comment line noting the existence of copied data/buffer
|
2022-02-08 15:22:06 +00:00 |
|
Peter Boyle
|
6616d5d090
|
Commit
|
2022-02-02 16:38:24 -05:00 |
|
Peter Boyle
|
ba7e371b90
|
Warning free compile on Tursa.
Hopefully got all reqd virtual dtors
|
2021-10-21 19:56:52 +01:00 |
|
Peter Boyle
|
894654f7ef
|
Simplification, always gather faces
|
2021-09-21 01:02:34 +02:00 |
|
Peter Boyle
|
86e33c8ab2
|
Significant GPU perf speed up finished
|
2021-09-14 16:14:23 +01:00 |
|
u61464
|
679d1d22f7
|
Sycl happier
|
2021-03-03 11:21:43 -08:00 |
|
Peter Boyle
|
d4861a362c
|
Stencil use non-UVM memory for look up table on enable-shared=no
|
2020-11-23 15:38:49 +00:00 |
|
nmeyer-ur
|
8726e94ea7
|
merge upstream develop
|
2020-07-07 20:26:47 +02:00 |
|
nmeyer-ur
|
433766ac62
|
revert Add/SubTimesI and prefetching in stencil
This reverts commit 9b2699226c7a3ca8d45f843f4f8e4658fa082163.
|
2020-06-08 12:02:53 +02:00 |
|
nmeyer-ur
|
93a37c8f68
|
test prefetch to L2 in stencil
|
2020-06-08 09:39:50 +02:00 |
|
Peter Boyle
|
1a4c8c3387
|
Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.
|
2020-06-05 18:52:35 -04:00 |
|
Peter Boyle
|
0c3112cd94
|
Use view mechanism
|
2020-06-03 09:11:51 -04:00 |
|
nmeyer-ur
|
5ee3ea2144
|
round-up after testing of prefetches in stencil close
|
2020-06-03 11:58:20 +02:00 |
|
nmeyer-ur
|
e947b563ea
|
add space in stencil output
|
2020-05-29 17:11:17 +02:00 |
|
Peter Boyle
|
949ac3cd24
|
Must avoid non-trivial copy constructors
|
2020-05-25 08:35:28 -07:00 |
|
Peter Boyle
|
7860a50f70
|
Make view specify where and drive data motion - first cut.
This is a compile-time option --enable-unified=yes/no
|
2020-05-21 16:13:16 -04:00 |
|
Peter Boyle
|
28a1fcaaff
|
First compile against SYCL
|
2020-05-05 11:13:27 -07:00 |
|
Peter Boyle
|
d1a89af8c9
|
Change to reporting
|
2019-11-22 10:49:10 -05:00 |
|
Peter Boyle
|
705a8098b2
|
Merge branch 'feature/gpu-port' of https://github.com/paboyle/Grid into feature/gpu-port
Conflicts:
Grid/stencil/Stencil.h
|
2019-07-12 17:14:11 +01:00 |
|
Peter Boyle
|
a29b43d755
|
Stencil comms cleaner
|
2019-07-12 17:12:25 +01:00 |
|
Peter Boyle
|
3d58daf70f
|
Safety check
|
2019-07-12 17:10:35 +01:00 |
|
Peter Boyle
|
91e2cf9b40
|
All axes can be used for comms now
|
2019-07-12 09:08:26 +01:00 |
|
Peter Boyle
|
6e3c3214a3
|
Offload loops
|
2019-07-02 17:25:40 +01:00 |
|
Peter Boyle
|
7b7c470917
|
Accelerator loop
|
2019-07-01 07:29:51 +01:00 |
|
Peter Boyle
|
1e889c93b8
|
Insert a GPU synchronise
|
2019-06-15 08:23:26 +01:00 |
|
Peter Boyle
|
cefaacbc07
|
Changing accelerator loop. Still have work to do for multi-GPU code
|
2019-06-15 08:10:24 +01:00 |
|
Peter Boyle
|
8eea568426
|
GPU loop; presently differentiated with ifdef, find a way to unify.
|
2019-06-05 00:09:28 +01:00 |
|
Peter Boyle
|
24bff6dbe6
|
Minor improvements
|
2019-06-04 20:51:48 +01:00 |
|
Peter Boyle
|
aca788cf4f
|
Move coalesced read into tensors
|
2019-05-25 12:43:00 +01:00 |
|
Peter Boyle
|
6c4da3bbc7
|
Stencil now runs with coalesced accesses
|
2019-05-18 17:40:35 +01:00 |
|
Peter Boyle
|
f9b8c0cccf
|
Vector changes for UVM
|
2019-04-28 07:38:57 +01:00 |
|