1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-05-04 17:45:57 +01:00

18 Commits

Author SHA1 Message Date
066544281f Deprecate UVM 2024-09-17 13:34:27 +00:00
Peter Boyle
33097681b9 FTHMC compiled and merged to develop 2023-10-14 00:42:55 +03:00
Peter Boyle
5068413cdb Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2023-03-28 08:35:38 -07:00
Peter Boyle
71c6960eea Commet 2023-03-28 08:34:24 -07:00
Peter Boyle
d8a9a745d8 stream synchronise 2023-03-24 15:40:30 -04:00
Peter Boyle
d0bb033ea2 Device resident GPU block buffer instead of UVM as hit likely UVM
bug. Code worked on CUDA 11.4 but fails on later drivers (certainly 530.30.02, but need to
find the perlmutter driver version).
2023-03-22 19:07:32 -04:00
Peter Boyle
551a5f8dc8 RRII gpu option 2022-10-11 14:44:55 -04:00
d1decee4cc Cleaned up unused variables in Lattice_reduction_gpu.h 2022-03-02 16:54:23 +00:00
d4ae71b880 sum_gpu_large and sum_gpu templates added. 2022-03-02 15:40:18 +00:00
Peter Boyle
3e882f555d Large / small sumD options 2022-03-01 08:54:45 -05:00
Peter Boyle
42d56ea6b6 Verbosity 2021-10-29 02:23:08 +01:00
Peter Boyle
0b905a72dd Better reduction for GPUs 2021-10-29 02:22:22 +01:00
1f9688417a Error message added when attempting to sum object which is too large for
the shared memory
2021-10-13 20:45:46 +01:00
Peter Boyle
288c615782 Hip improvements 2020-09-16 00:31:50 +01:00
Peter Boyle
92b342a477 Hip reduction too 2020-05-24 13:50:28 -04:00
Peter Boyle
3e49dc8a67 Reduction finished and hopefully fixes CI regression fail on single precisoin and force 2019-08-14 15:18:34 +01:00
Peter Boyle
ce97638bac Think the reduction is now sorted and cleaned up 2019-08-11 11:09:01 +01:00
Peter Boyle
9dad7a0094 Reproducible reduction and axpy_norm offload from Gianluca.
Hopefully get CG running entirely on GPU
2019-07-30 00:14:12 +01:00