|
783a66b348
|
Deterministic reduction please
|
2024-03-06 00:01:37 +00:00 |
|
Peter Boyle
|
3f1636637d
|
Merge pull request #453 from dbollweg/feature/sliceSum_gpu
Feature/slice sum gpu
|
2024-02-28 14:04:43 -05:00 |
|
Christoph Lehner
|
9f89486df5
|
remove unnecessary code path
|
2024-02-28 19:56:23 +01:00 |
|
Christoph Lehner
|
22b43b86cb
|
Make GPT test suite work with SYCL
|
2024-02-28 12:57:17 +01:00 |
|
Dennis Bollweg
|
6cd2d8fcd5
|
Replace cuda/hip memcpy with Grid functions
|
2024-02-26 09:55:07 -05:00 |
|
|
303b83cdb8
|
Scaling benchmarks, verbosity and MPICH aware in acceleratorInit()
For some reason Dirichlet benchmark fails on several nodes; need to
debug this.
|
2024-02-13 19:48:03 +00:00 |
|
Peter Boyle
|
33097681b9
|
FTHMC compiled and merged to develop
|
2023-10-14 00:42:55 +03:00 |
|
Peter Boyle
|
9626a2c7c0
|
Asynch handling
|
2023-10-13 18:21:56 +03:00 |
|
Peter Boyle
|
b4f2ca81ff
|
Copy queue and compute queue same as better concurrency
|
2023-04-11 12:18:21 -07:00 |
|
Peter Boyle
|
da503fef0e
|
Name change on barrier routine
|
2023-04-11 12:14:04 -07:00 |
|
Peter Boyle
|
4a382fad3f
|
Use distinct SYCL queue for copies
|
2023-04-04 07:41:41 -07:00 |
|
Peter Boyle
|
af64c1c6b6
|
Had managed to drop the accelerator_barrier() in the Wilson Compressor gather
|
2023-03-30 17:34:44 -04:00 |
|
Peter Boyle
|
866f48391a
|
Temporary fix for develop incorrect results
|
2023-03-30 17:10:13 -04:00 |
|
Peter Boyle
|
496d04cd85
|
Weaken the Fence
|
2023-03-29 18:58:51 -04:00 |
|
Peter Boyle
|
b5b759df73
|
Merge branch 'develop' into feature/dirichlet
|
2023-03-21 16:05:46 -04:00 |
|
Peter Boyle
|
861e5d7f4c
|
SYCL version update. Why do they keep making incompatible changes
|
2023-03-14 12:10:02 -07:00 |
|
Peter Boyle
|
03508448f8
|
Remove verbose
|
2022-10-04 11:12:15 -07:00 |
|
Peter Boyle
|
1177b8f661
|
Merge branch 'develop' into feature/dirichlet
|
2022-08-31 19:05:57 -04:00 |
|
Peter Boyle
|
95b640cb6b
|
10TF/s on 32^3 x 64 on single node
|
2022-08-04 15:43:52 -04:00 |
|
Peter Boyle
|
2cb5bedc15
|
Copy stream HIP improvements
|
2022-08-04 15:24:03 -04:00 |
|
Peter Boyle
|
188d2c7a4d
|
PVC default, ignore ATS
|
2022-08-02 08:38:53 -07:00 |
|
Peter Boyle
|
84110166e4
|
Fix the fence
|
2022-08-02 08:00:43 -07:00 |
|
Peter Boyle
|
d32b923b6c
|
Fencing on a stream in SYCL is needed. Didn't know that ... gulp
|
2022-08-02 07:58:04 -07:00 |
|
Peter Boyle
|
5f8892bf03
|
Mistake pointed out by Camilo
|
2022-07-19 09:31:51 -07:00 |
|
Peter Boyle
|
f14e7e51e7
|
Grid accelerator
|
2022-07-12 10:56:22 -07:00 |
|
Peter Boyle
|
3544965f54
|
Stream doesn't work
|
2022-07-07 17:49:20 +01:00 |
|
Peter Boyle
|
bd99fd608c
|
Introduce a non-default stream for compute operations
|
2022-07-01 09:42:53 -04:00 |
|
Peter Boyle
|
136d843ce7
|
Crusher updates
|
2022-05-25 12:36:09 -04:00 |
|
Peter Boyle
|
5012adfebf
|
Merge branch 'develop' into feature/dirichlet
|
2022-04-05 16:26:19 -04:00 |
|
Peter Boyle
|
92a83a9eb3
|
Performance improve for Tesseract
|
2022-03-16 17:14:36 +00:00 |
|
Peter Boyle
|
5340e50427
|
HMC running with new formulation
|
2022-03-01 17:10:25 -05:00 |
|
Peter Boyle
|
e16fc5b2e4
|
Threaded intranode comms transfer - ideally between NUMA domains
|
2022-03-01 11:17:24 -05:00 |
|
Julio Maia
|
86f4e17928
|
Changing thread block order and adding launch_bounds
|
2022-02-07 11:29:37 -06:00 |
|
Peter Boyle
|
7f7d06d963
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2021-12-07 09:06:42 -08:00 |
|
Peter Boyle
|
2bf3b4d576
|
Update to reduce memory footprint in benchmark test
|
2021-12-07 09:02:02 -08:00 |
|
Peter Boyle
|
6ceb556684
|
Intranode asynch hipMemCopy
|
2021-11-22 20:45:12 -05:00 |
|
Peter Boyle
|
76cde73705
|
HIP improvements on messaging and intranode hipMemCopyAsynch
|
2021-11-22 20:44:39 -05:00 |
|
Peter Boyle
|
16c2a99965
|
Overlap cudamemcpy - didn't set up stream right
|
2021-10-11 13:31:26 -07:00 |
|
Peter Boyle
|
ab6ea29913
|
Print removal
|
2021-10-05 20:13:25 -04:00 |
|
Peter Boyle
|
8ed0b57b09
|
Memory verbose and tracking, shrink default cache
Print PCI device IDs on node 0
|
2021-10-05 11:41:03 -04:00 |
|
Peter Boyle
|
3206f69478
|
SYCL happy
|
2021-09-21 18:01:35 -07:00 |
|
Peter Boyle
|
8eb1232683
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2021-09-21 09:25:07 -07:00 |
|
Peter Boyle
|
b3b033d343
|
Clean
|
2021-09-21 09:18:54 -07:00 |
|
Peter Boyle
|
814d5abc7e
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2021-09-21 04:05:51 +02:00 |
|
Peter Boyle
|
1fb6aaf150
|
Device 2 Device with cudaMemcpy
|
2021-09-21 01:03:07 +02:00 |
|
Peter Boyle
|
ea7126496d
|
Merge pull request #361 from edbennett/fix-setdevice-message
make message about setdevice consistent with configure script
|
2021-09-16 10:23:37 -04:00 |
|
Peter Boyle
|
0d588b95f4
|
Bug fix to Example_Laplacian test
|
2021-08-23 23:14:26 +01:00 |
|
peterx.a.boyle
|
50181f16e5
|
Level 0 IPC set up
|
2021-08-10 05:35:15 -07:00 |
|
Ed Bennett
|
323cf6c038
|
make message consistent with configure script
|
2021-06-23 17:00:43 +01:00 |
|
Peter Boyle
|
403bff1a47
|
Force reqd subgroup size for SYCL
|
2021-06-22 17:56:10 +00:00 |
|