1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-04-26 21:55:55 +01:00

75 Commits

Author SHA1 Message Date
Peter Boyle
403bff1a47 Force reqd subgroup size fo SYCL 2021-06-22 17:56:10 +00:00
Peter Boyle
0e27e3847d Remove synch 2021-06-03 04:24:19 +00:00
Peter Boyle
4d1ea15c79 More verbosity. The 16bit limit on Grid.y, Grid.z is annoying 2021-03-09 04:29:37 +01:00
u61464
679d1d22f7 Sycl happier 2021-03-03 11:21:43 -08:00
Peter Boyle
f9b1f240f6 Better SIMD usage/coalescence 2021-02-26 17:51:41 +01:00
Peter Boyle
eda9ab487b MADWF 5d source option for hadrons - look at Grid of source
Abort on GPU error
2021-02-08 10:47:22 -05:00
Peter Boyle
86e8b9fe38 ALLOC_ALIGN removed 2020-11-20 17:07:16 +01:00
Peter Boyle
9c4dcc5ea3 Merge branch 'master' into develop 2020-11-16 16:34:57 +01:00
Peter Boyle
cf23eff60e Device to Device, Memset, cannot assume UVM == Communicable 2020-11-13 03:51:08 +01:00
Peter Boyle
00d0d6d008 Hip Free managed 2020-10-31 18:14:31 -04:00
Peter Boyle
ecd3f890f5 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2020-09-16 02:30:14 +01:00
Peter Boyle
dacbbdd051 Hip Happy Birthday 2020-09-16 00:37:02 +01:00
Peter Boyle
a8309638d4 UVM check in MPI calls 2020-09-03 20:29:26 -04:00
Peter Boyle
bcd7895362 Include cuda.h 2020-09-03 15:49:13 -04:00
Peter Boyle
6c5fa8dcd8 Aligned allocate on CPU put through this interface 2020-06-20 14:34:29 -04:00
Peter Boyle
0d2f913a1a String.h for linux 2020-06-20 09:37:31 -04:00
Peter Boyle
ee63721bad int unhappiness sycl fix 2020-05-25 08:36:24 -07:00
Peter Boyle
32be2b13d3 Updates for HiP 2020-05-24 14:00:55 -04:00
Peter Boyle
7860a50f70 Make view specify where and drive data motion - first cut.
This is a compile tiime option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Peter Boyle
ebb60330c9 Automatic data motion options beginning 2020-05-17 16:34:25 -04:00
Peter Boyle
d24d8e8398 Use X-direction as more bits meaningful on CUDA.
2^31-1 shoulddd always bee enough for SIMD and thread reduced local volume

e.g. 32*2^31 = 2^36 = (2^9)^4 or 512^4 ias big enough.

Where 32 is gpu_threads * Nsimd = 8*4
2020-05-12 10:35:49 -04:00
Peter Boyle
07c0c02f8c Speed up Cshift 2020-05-11 17:02:01 -04:00
Peter Boyle
bbbee5660d First compiile on HiP 2020-05-10 05:28:09 -04:00
Peter Boyle
52081acfa5 NVCC compile fixes 2020-05-08 13:14:12 -04:00
Peter Boyle
f8b8e00090 Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
Aim to reduce the amount of cuda and other code variations floating around all over the place.

Will move GpuInit iinto Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00