814d5abc7e
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2021-09-21 04:05:51 +02:00
1fb6aaf150
Device 2 Device with cudaMemcpy
2021-09-21 01:03:07 +02:00
ea7126496d
Merge pull request #361 from edbennett/fix-setdevice-message
...
make message about setdevice consistent with configure script
2021-09-16 10:23:37 -04:00
0d588b95f4
Bug fix to Example_Laplacian test
2021-08-23 23:14:26 +01:00
50181f16e5
Level 0 IPC set up
2021-08-10 05:35:15 -07:00
323cf6c038
make message consistent with configure script
2021-06-23 17:00:43 +01:00
403bff1a47
Force reqd subgroup size for SYCL
2021-06-22 17:56:10 +00:00
0e27e3847d
Remove synch
2021-06-03 04:24:19 +00:00
4d1ea15c79
More verbosity. The 16bit limit on Grid.y, Grid.z is annoying
2021-03-09 04:29:37 +01:00
679d1d22f7
Sycl happier
2021-03-03 11:21:43 -08:00
f9b1f240f6
Better SIMD usage/coalescence
2021-02-26 17:51:41 +01:00
eda9ab487b
MADWF 5d source option for hadrons - look at Grid of source
...
Abort on GPU error
2021-02-08 10:47:22 -05:00
c61ea72949
Merge pull request #19 from paboyle/develop
...
Sync
2020-11-20 17:31:13 +01:00
86e8b9fe38
ALLOC_ALIGN removed
2020-11-20 17:07:16 +01:00
4ea8d128c2
Merge pull request #18 from paboyle/develop
...
Sync
2020-11-20 15:36:50 +01:00
9c4dcc5ea3
Merge branch 'master' into develop
2020-11-16 16:34:57 +01:00
cf23eff60e
Device to Device, Memset, cannot assume UVM == Communicable
2020-11-13 03:51:08 +01:00
6e313575be
Use of default GPU is behaviour, not a system property. Move Summit specific to configure.ac
2020-11-13 03:50:16 +01:00
00d0d6d008
Hip Free managed
2020-10-31 18:14:31 -04:00
80fd6ab407
Merge pull request #17 from paboyle/develop
...
sync upstream
2020-10-06 09:01:39 +02:00
81441e98f4
HIP runs sensible
2020-09-16 03:35:03 +01:00
ecd3f890f5
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2020-09-16 02:30:14 +01:00
dacbbdd051
Hip Happy Birthday
2020-09-16 00:37:02 +01:00
4677c40195
HIP improvements
2020-09-16 00:32:27 +01:00
5cffa05c7e
remove slab allocator file
2020-09-13 14:06:25 -04:00
d50a2164d7
remove slab allocator
2020-09-13 14:06:06 -04:00
32ff766dbd
fix evict scheme, slab alloc
2020-09-13 14:02:53 -04:00
01652d8cfe
SlabAllocator
2020-09-13 05:56:02 -04:00
51d1beb1f3
Merge pull request #15 from paboyle/develop
...
Sync with upstream
2020-09-07 14:20:33 +02:00
a8309638d4
UVM check in MPI calls
2020-09-03 20:29:26 -04:00
bcd7895362
Include cuda.h
2020-09-03 15:49:13 -04:00
2a75516330
state MPI/SLURM message only on world_rank zero
2020-08-26 12:34:17 -04:00
1efe30d6cc
Slurm: stop nodes using same GPU
2020-08-21 02:02:53 +02:00
6c5fa8dcd8
Aligned allocate on CPU put through this interface
2020-06-20 14:34:29 -04:00
0d2f913a1a
String.h for linux
2020-06-20 09:37:31 -04:00
11bc1aeadc
Thread count default to fastest
2020-06-19 14:30:35 -04:00
66005929af
Set up the cache size on all ranks
2020-06-19 12:50:54 -04:00
2b1e259441
Decode of SYCL devices fix
2020-06-04 17:16:55 -07:00
f39c2a240b
Printing and device memory size detection
2020-06-04 14:58:03 -04:00
e93e12b6a4
More verbose SYCL setup
2020-06-03 09:12:11 -04:00
ee63721bad
int unhappiness sycl fix
2020-05-25 08:36:24 -07:00
22c5168d70
Sycl happier
2020-05-25 08:35:56 -07:00
32be2b13d3
Updates for HiP
2020-05-24 14:00:55 -04:00
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
ebb60330c9
Automatic data motion options beginning
2020-05-17 16:34:25 -04:00
d24d8e8398
Use X-direction as more bits meaningful on CUDA.
...
2^31-1 should always be enough for SIMD and thread reduced local volume
e.g. 32*2^31 = 2^36 = (2^9)^4 or 512^4 is big enough.
Where 32 is gpu_threads * Nsimd = 8*4
2020-05-12 10:35:49 -04:00
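
The arithmetic in the commit message above can be checked at compile time. This is a minimal sketch, not code from the repository: the names kXIndexRange, ipow and addressable_sites are hypothetical, and it assumes each x-index covers gpu_threads * Nsimd sites, as the message states.

```cpp
#include <cstdint>

// Hypothetical helper names for illustration only; not part of Grid.

// Number of x-indices assumed available: the message quotes a limit of
// 2^31-1, and rounds it to 2^31 for the estimate.
constexpr std::uint64_t kXIndexRange = 1ULL << 31;

// Integer power, written recursively so it is valid C++11 constexpr.
constexpr std::uint64_t ipow(std::uint64_t base, unsigned exp) {
  return exp == 0 ? 1 : base * ipow(base, exp - 1);
}

// Approximate number of lattice sites reachable when every x-index
// covers gpu_threads * nsimd sites.
constexpr std::uint64_t addressable_sites(std::uint64_t gpu_threads,
                                          std::uint64_t nsimd) {
  return gpu_threads * nsimd * kXIndexRange;
}

// 8 * 4 = 32 = 2^5, so 32 * 2^31 = 2^36.
static_assert(addressable_sites(8, 4) == (1ULL << 36), "32 * 2^31 == 2^36");
// 2^36 = (2^9)^4 = 512^4, i.e. a 512^4 local volume.
static_assert(ipow(512, 4) == (1ULL << 36), "512^4 == 2^36");
```

With gpu_threads = 8 and Nsimd = 4 the bound corresponds to a 512^4 local volume, matching the estimate in the message.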
07c0c02f8c
Speed up Cshift
2020-05-11 17:02:01 -04:00
bbbee5660d
First compile on HiP
2020-05-10 05:28:09 -04:00
52081acfa5
NVCC compile fixes
2020-05-08 13:14:12 -04:00
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00