Peter Boyle
403bff1a47
Force reqd subgroup size for SYCL
2021-06-22 17:56:10 +00:00
Peter Boyle
0e27e3847d
Remove synch
2021-06-03 04:24:19 +00:00
Peter Boyle
4d1ea15c79
More verbosity. The 16bit limit on Grid.y, Grid.z is annoying
2021-03-09 04:29:37 +01:00
u61464
679d1d22f7
Sycl happier
2021-03-03 11:21:43 -08:00
Peter Boyle
f9b1f240f6
Better SIMD usage/coalescence
2021-02-26 17:51:41 +01:00
Peter Boyle
eda9ab487b
MADWF 5d source option for hadrons - look at Grid of source
...
Abort on GPU error
2021-02-08 10:47:22 -05:00
Peter Boyle
86e8b9fe38
ALLOC_ALIGN removed
2020-11-20 17:07:16 +01:00
Peter Boyle
9c4dcc5ea3
Merge branch 'master' into develop
2020-11-16 16:34:57 +01:00
Peter Boyle
cf23eff60e
Device to Device, Memset, cannot assume UVM == Communicable
2020-11-13 03:51:08 +01:00
Peter Boyle
00d0d6d008
Hip Free managed
2020-10-31 18:14:31 -04:00
Peter Boyle
ecd3f890f5
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2020-09-16 02:30:14 +01:00
Peter Boyle
dacbbdd051
Hip Happy Birthday
2020-09-16 00:37:02 +01:00
Peter Boyle
a8309638d4
UVM check in MPI calls
2020-09-03 20:29:26 -04:00
Peter Boyle
bcd7895362
Include cuda.h
2020-09-03 15:49:13 -04:00
Peter Boyle
6c5fa8dcd8
Aligned allocate on CPU put through this interface
2020-06-20 14:34:29 -04:00
Peter Boyle
0d2f913a1a
String.h for linux
2020-06-20 09:37:31 -04:00
Peter Boyle
ee63721bad
int unhappiness sycl fix
2020-05-25 08:36:24 -07:00
Peter Boyle
32be2b13d3
Updates for HiP
2020-05-24 14:00:55 -04:00
Peter Boyle
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile-time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Peter Boyle
ebb60330c9
Automatic data motion options beginning
2020-05-17 16:34:25 -04:00
Peter Boyle
d24d8e8398
Use X-direction as more bits meaningful on CUDA.
...
2^31-1 should always be enough for SIMD and thread reduced local volume
e.g. 32*2^31 = 2^36 = (2^9)^4 or 512^4 is big enough.
Where 32 is gpu_threads * Nsimd = 8*4
2020-05-12 10:35:49 -04:00
Peter Boyle
07c0c02f8c
Speed up Cshift
2020-05-11 17:02:01 -04:00
Peter Boyle
bbbee5660d
First compile on HiP
2020-05-10 05:28:09 -04:00
Peter Boyle
52081acfa5
NVCC compile fixes
2020-05-08 13:14:12 -04:00
Peter Boyle
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00