mirror of https://github.com/paboyle/Grid.git synced 2025-04-09 21:50:45 +01:00

53 Commits

Author SHA1 Message Date
Peter Boyle
3dbfce5223 Tests clean build on HIP 2022-11-16 20:15:51 -05:00
Peter Boyle
e51eaedc56 Making tests compile 2022-11-15 22:58:30 -05:00
Peter Boyle
6209120de9 Fix to GPU compile attempt 2022-11-15 17:25:58 -05:00
Peter Boyle
e2e269e03b Partial dirichlet BCs 2022-11-15 16:24:26 -05:00
Peter Boyle
551a5f8dc8 RRII gpu option 2022-10-11 14:44:55 -04:00
Peter Boyle
5abb19eab0 Remove self timing 2022-08-31 17:32:49 -04:00
Peter Boyle
75bb6b2b40 Move barrier into the StencilSend begin routine 2022-08-04 13:35:26 -04:00
Peter Boyle
a93d5459d4 Better mpi request completion 2022-07-28 12:18:35 -04:00
Peter Boyle
58182fe345 Different approach to default dirichlet params 2022-07-10 21:32:58 +01:00
Peter Boyle
e762c940c2 Reduce the loop over exterior for GPU to indirection table 2022-06-01 14:29:25 -07:00
Peter Boyle
34faa39f4f Clean up Dirichlet. Big oops fix 2022-05-28 17:18:08 -07:00
Peter Boyle
4f997c5f04 Remove extra face kernels in Dirichlet 2022-05-25 11:15:25 -07:00
Peter Boyle
e651b9e7ab Clean up stencil with better intranode Dirichlet / DDHMC support.
14TF/s on a Perlmutter node
2022-05-24 18:23:39 -07:00
Peter Boyle
f82ce67624 Dirichlet improved 2022-05-19 19:17:11 -07:00
Peter Boyle
5340e50427 HMC running with new formulation 2022-03-01 17:10:25 -05:00
Peter Boyle
0f1c5b08a1 Dirichlet filters running on AMD and now integrated in Fermion op 2022-02-23 19:29:28 -05:00
Peter Boyle
aab3bcb46f Dirichlet first cut - wrong answers on dagger multiply.
Struggling to get a compute node so changing systems
2022-02-22 19:58:33 +00:00
Peter Boyle
e8b1251b8c Staggered fix finished 2022-02-17 04:51:13 +00:00
Peter Boyle
fad5a74a4b Bug fix to detection case 2022-02-15 10:27:39 -05:00
Azusa Yamaguchi
6283d11d50 Add the comment line to tell the existence of copied data/buffer 2022-02-08 15:22:06 +00:00
Peter Boyle
6616d5d090 Commit 2022-02-02 16:38:24 -05:00
Peter Boyle
ba7e371b90 Warning free compile on Tursa.
Hopefully got all reqd virtual dtors
2021-10-21 19:56:52 +01:00
Peter Boyle
894654f7ef Simplification, always gather faces 2021-09-21 01:02:34 +02:00
Peter Boyle
86e33c8ab2 Significant GPU perf speed up finished 2021-09-14 16:14:23 +01:00
u61464
679d1d22f7 Sycl happier 2021-03-03 11:21:43 -08:00
Peter Boyle
d4861a362c Stencil use non-UVM memory for look up table on enable-shared=no 2020-11-23 15:38:49 +00:00
nmeyer-ur
8726e94ea7 merge upstream develop 2020-07-07 20:26:47 +02:00
nmeyer-ur
433766ac62 revert Add/SubTimesI and prefetching in stencil
This reverts commit 9b2699226c7a3ca8d45f843f4f8e4658fa082163.
2020-06-08 12:02:53 +02:00
nmeyer-ur
93a37c8f68 test prefetch to L2 in stencil 2020-06-08 09:39:50 +02:00
Peter Boyle
1a4c8c3387 Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. 2020-06-05 18:52:35 -04:00
Peter Boyle
0c3112cd94 Use view mechanism 2020-06-03 09:11:51 -04:00
nmeyer-ur
5ee3ea2144 round-up after testing of prefetches in stencil close 2020-06-03 11:58:20 +02:00
nmeyer-ur
e947b563ea add space in stencil output 2020-05-29 17:11:17 +02:00
Peter Boyle
949ac3cd24 Must avoid non-trivial copy constructors 2020-05-25 08:35:28 -07:00
Peter Boyle
7860a50f70 Make view specify where and drive data motion - first cut.
This is a compile time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Peter Boyle
28a1fcaaff First compile against SYCL 2020-05-05 11:13:27 -07:00
Peter Boyle
d1a89af8c9 Change to reporting 2019-11-22 10:49:10 -05:00
Peter Boyle
705a8098b2 Merge branch 'feature/gpu-port' of https://github.com/paboyle/Grid into feature/gpu-port
Conflicts:
	Grid/stencil/Stencil.h
2019-07-12 17:14:11 +01:00
Peter Boyle
a29b43d755 Stencil comms cleaner 2019-07-12 17:12:25 +01:00
Peter Boyle
3d58daf70f Safety check 2019-07-12 17:10:35 +01:00
Peter Boyle
91e2cf9b40 All axes can be used for comms now 2019-07-12 09:08:26 +01:00
Peter Boyle
6e3c3214a3 Offload loops 2019-07-02 17:25:40 +01:00
Peter Boyle
7b7c470917 Accelerator loop 2019-07-01 07:29:51 +01:00
Peter Boyle
1e889c93b8 Insert a GPU synchronise 2019-06-15 08:23:26 +01:00
Peter Boyle
cefaacbc07 Changing accelerator loop. Still have work to do for multi-GPU code 2019-06-15 08:10:24 +01:00
Peter Boyle
8eea568426 GPU loop ; presently differentiated with ifdef, find a way to unify. 2019-06-05 00:09:28 +01:00
Peter Boyle
24bff6dbe6 Minor improvements 2019-06-04 20:51:48 +01:00
Peter Boyle
aca788cf4f Move coalesced read into tensors 2019-05-25 12:43:00 +01:00
Peter Boyle
6c4da3bbc7 Stencil now runs with coalesced accesses 2019-05-18 17:40:35 +01:00
Peter Boyle
f9b8c0cccf Vector changes for UVM 2019-04-28 07:38:57 +01:00