1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-05-02 00:25:56 +01:00

51 Commits

Author SHA1 Message Date
Peter Boyle
fab1efb48c More britney logging improvements 2024-03-19 14:36:21 +00:00
Peter Boyle
95f3d69cf9 Extra hardware test hook 2024-03-12 20:09:37 +00:00
cf8632bbac Britney test option 2024-03-12 15:15:35 +00:00
976c3e9b59 Hack for flight logging CG inner products.
Can be made to work, but could put in some more serious infrastructure
for repro testing and blame attribution (Britney test) if necessary
2024-03-05 23:59:57 +00:00
dbollweg
9514035b87 refactor slicesum: slicesum uses GPU version by default now 2024-02-09 13:02:28 -05:00
dbollweg
ab2de131bd work towards sliceSum for sycl backend 2024-02-06 13:24:45 -05:00
dbollweg
caa5f97723 Add sliceSum gpu using cub/hipcub 2024-01-31 16:50:06 -05:00
Peter Boyle
2376156fbc Merge branch 'develop' into feature/dirichlet 2023-03-27 21:33:50 -07:00
Christoph Lehner
458c943987 merged upstream 2022-12-31 11:16:21 +02:00
Christoph Lehner
88015b0858 Split sum in rankSum and GlobalSum 2022-12-26 10:01:32 +01:00
Peter Boyle
dc747c54be Merge branch 'develop' into feature/dirichlet
Conflicts:
	Grid/qcd/action/fermion/WilsonCompressor.h
	Grid/stencil/Stencil.h
2022-12-13 08:24:58 -05:00
Peter Boyle
82e959f66c SYCL reduction 2022-11-08 12:45:25 -08:00
Peter Boyle
551a5f8dc8 RRII gpu option 2022-10-11 14:44:55 -04:00
Christopher Kelly
19da647e3c Added support for non-periodic gauge field implementations in the random gauge shift performed at the start of the HMC trajectory
(The above required exposing the gauge implementation to the HMC class through the Integrator class)
Made the random shift optional (default on) through a parameter in HMCparameters
Modified ConjugateBC::CshiftLink such that it supports any shift in  -L < shift < L rather than just +-1
Added a tester for the BC-respecting Cshift
Fixed a missing system header include in SSE4 intrinsics wrapper
Fixed sumD_cpu for single-prec types performing an incorrect conversion to a single-prec data type at the end, that fails to compile on some systems
2022-09-09 12:47:09 -04:00
Peter Boyle
1333319941 Tracing 2022-08-31 17:00:25 -04:00
Peter Boyle
588c2f3cb1 Faster axpy_norm and innerProduct 2022-07-01 09:44:58 -04:00
d4ae71b880 sum_gpu_large and sum_gpu templates added. 2022-03-02 15:40:18 +00:00
Peter Boyle
3e882f555d Large / small sumD options 2022-03-01 08:54:45 -05:00
Thomas Wurm
9e5fb52eb9 Put GlobalSum outside the slice loop 2021-03-08 13:53:34 +01:00
Christopher Kelly
55de69a569 Fixed compile issues with maxLocalNorm2 for non-scalar lattices
maxLocalNorm2 test now reuses the random field
2021-02-08 12:03:16 -05:00
Peter Boyle
cd99edcc5f maxLocalNorm2() 2021-02-04 18:25:49 -05:00
Peter Boyle
936c5ecf69 Reduction GPU no compile fix 2020-06-24 17:28:31 -04:00
Peter Boyle
22cfbdbbb3 Boost precision in inner products in single 2020-06-24 12:52:31 -04:00
Peter Boyle
cdf0a04fc5 Merge branch 'develop' into sycl 2020-06-09 04:00:12 -04:00
Peter Boyle
1a4c8c3387 Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. 2020-06-05 18:52:35 -04:00
Peter Boyle
f67830587f Accelerator loop use 2020-06-03 22:50:09 -04:00
Peter Boyle
cb0d1b3399 hopefullly fix buildd fail 2020-05-24 21:27:00 -04:00
Peter Boyle
d1f1ccc705 HIP changes 2020-05-24 21:18:49 -04:00
Peter Boyle
7860a50f70 Make view specify where and drive data motion - first cut.
This is a compile tiime option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Peter Boyle
bbbee5660d First compiile on HiP 2020-05-10 05:28:09 -04:00
Peter Boyle
28a1fcaaff First compile against SYCL 2020-05-05 11:13:27 -07:00
Christoph Lehner
04863f8f38 debug new AcceleratorView 2020-05-04 16:07:03 -04:00
Christoph Lehner
2a1387e992 rankInnerProduct 2020-05-03 17:27:11 -04:00
Christoph Lehner
ddb192bac7 re-work double precision promotion for summit 2020-04-30 16:09:57 -04:00
Christoph Lehner
091d5c605e towards more precise blocking 2020-04-17 04:25:28 -04:00
Daniel Richtmann
5fc8a273e7
Fused innerProduct + norm2 on first argument operation 2020-04-06 11:52:29 +02:00
Fionn O hOgain
5de9547db5 Removing old debug code 2019-10-08 15:51:28 +01:00
Peter Boyle
be37dfb6f8 Remove debug code 2019-08-15 01:31:40 +01:00
Peter Boyle
3e49dc8a67 Reduction finished and hopefully fixes CI regression fail on single precisoin and force 2019-08-14 15:18:34 +01:00
Peter Boyle
ce97638bac Think the reduction is now sorted and cleaned up 2019-08-11 11:09:01 +01:00
Peter Boyle
9117f61109 GPU friendly 2019-07-31 01:22:54 +01:00
Peter Boyle
9dad7a0094 Reproducible reduction and axpy_norm offload from Gianluca.
Hopefully get CG running entirely on GPU
2019-07-30 00:14:12 +01:00
Peter Boyle
b7e6d111d7 Thread loop changes. Need to offload this file 2019-06-15 07:59:10 +01:00
Peter Boyle
dc5024e88c The GPU reduction was not working for me and causing errors. Need to revisit.
Gianluca is working on deterministic reduction/
2019-06-08 13:39:11 +01:00
gfilaci
1a82533d22 fix inner product with thrust reduction 2019-05-14 15:35:54 +01:00
Peter Boyle
204a090497 Inner product is not working on GPU. Why? 2019-04-28 07:31:56 +01:00
Peter Boyle
c5e081d69c Re-Merge branch 'develop' into feature/gpu-port
Pull in Regensburg MultiGrid pull request
2019-01-03 01:50:16 +00:00
Peter Boyle
715babeac8 GPU reductions first cut; use thrust, non-reproducible. Inclusive scan can fix this if desired.
Local reduction to LatticeComplex and then further reduction.
2019-01-01 13:53:37 +00:00
Peter Boyle
422764757d Updates in tests to make all of Grid compile 2018-12-14 16:55:54 +00:00
Peter Boyle
b57a4d32aa Merge branch 'develop' into feature/gpu-port 2018-12-13 05:11:34 +00:00