Peter Boyle
286c29d6fb
Add Test_reduction to tests/debug
Tests the new CUB/hipCUB/SYCL lattice reduction (sum_gpu) against the
preserved hand-rolled implementation (sum_gpu_old) for LatticeComplexF/D,
LatticeColourMatrixF/D and LatticePropagatorF/D.
Part a) gaussian random field: checks that old and new agree to within
float/double roundoff tolerance.
Part b) constant field (= 1.0, identity-matrix init): verifies
innerProduct(sum, sum) = Ncomp * V^2 where Ncomp counts the nonzero
diagonal scalar components per site (1 / Nc / Ns*Nc respectively).
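The Part b) identity can be illustrated with a small standalone sketch. It does not use Grid: the names `Site`, `identitySite`, `sumSites`, `innerProductSketch` and `constantFieldCheck` are hypothetical stand-ins, a plain loop replaces the GPU reduction, and Nc = 3, Ns = 4 are assumed for the example sizes.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;
using Site = std::vector<Complex>;  // one site: n x n matrix, row-major

// Hypothetical helper: n x n identity matrix (n = 1 models a complex scalar).
Site identitySite(std::size_t n) {
  Site m(n * n, Complex(0.0, 0.0));
  for (std::size_t i = 0; i < n; ++i) m[i * n + i] = Complex(1.0, 0.0);
  return m;
}

// Component-wise sum over all sites; a serial stand-in for the lattice
// reduction under test. Assumes the field has at least one site.
Site sumSites(const std::vector<Site>& field) {
  Site total(field.front().size(), Complex(0.0, 0.0));
  for (const Site& s : field)
    for (std::size_t k = 0; k < total.size(); ++k) total[k] += s[k];
  return total;
}

// Inner product over components: sum_k conj(a_k) * b_k.
Complex innerProductSketch(const Site& a, const Site& b) {
  Complex acc(0.0, 0.0);
  for (std::size_t k = 0; k < a.size(); ++k) acc += std::conj(a[k]) * b[k];
  return acc;
}

// Returns true when innerProduct(sum, sum) == Ncomp * V^2 for a constant
// identity field, where Ncomp = n is the number of nonzero diagonal entries.
bool constantFieldCheck(std::size_t n, std::size_t volume) {
  std::vector<Site> field(volume, identitySite(n));
  Site total = sumSites(field);  // each diagonal entry equals V
  Complex ip = innerProductSketch(total, total);
  double expected = static_cast<double>(n) * static_cast<double>(volume) *
                    static_cast<double>(volume);
  return ip.real() == expected && ip.imag() == 0.0;
}
```

Under these assumptions, calling `constantFieldCheck` with n = 1, 3 and 12 corresponds to the complex, colour-matrix and propagator cases. The exact comparison is safe here only because every intermediate value is a small integer that double precision represents exactly.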
Make.inc is auto-generated by scripts/filelist on bootstrap and is not
tracked; the new .cc file is all that is needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 14:31:33 -04:00
Peter Boyle
0d8658a039
Optimised
2026-03-05 06:06:32 -05:00
Peter Boyle
76fbcffb60
Improvement to 16^3 hdcg
2026-03-05 06:06:32 -05:00
Peter Boyle
6ff29f9d4f
Alternate multigrids
2026-02-13 17:25:45 -05:00
Peter Boyle
7cd3f21e6b
preserving a bunch of experiments on setup and g5 subspace doubling
2026-01-06 05:57:39 -05:00
paboyle
9e6a4a4737
Assertion updates to macros (mostly) with backtrace.
Wilson flow to include options for DBW2, Iwasaki, Symanzik.
View logging for data assurance
2025-08-07 15:48:38 +00:00
Peter Boyle
677b4cc5b0
Make all tests compile
2025-04-24 20:33:26 -04:00
Peter Boyle
6fec3c15ca
Cleaner printing
2025-04-04 18:35:06 -04:00
Peter Boyle
c74d11e3d7
PVdagM MG
2025-02-01 11:04:13 -05:00
Peter Boyle
3f3661a86f
Heading towards PVdagM multigrid
2025-01-17 14:33:35 +00:00
Peter Boyle
2a9cfeb9ea
New files
2024-09-26 14:23:29 -04:00
Peter Boyle
575eb72182
Converges on 16^3
2024-08-27 19:20:38 +00:00
Peter Boyle
29f6b8a74a
Setup
2024-08-27 12:02:49 -04:00
Peter Boyle
9779aaea33
16^3 optimise
2024-08-27 11:38:35 -04:00
Peter Boyle
ec25604a67
Fastest solver for mrhs multigrid
2024-08-27 11:32:34 -04:00
Peter Boyle
b461184797
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2024-07-23 09:53:58 -04:00
Peter Boyle
486412635a
8^4 test for PETSc
2024-07-22 15:25:17 -04:00
Peter Boyle
8b23a1546a
Force compile temporarily
2024-07-22 15:24:56 -04:00
Peter Boyle
a901e4e369
Regressed performance for paper
2024-07-22 15:24:04 -04:00
Peter Boyle
804d9367d4
Regressed performance
2024-07-22 15:23:25 -04:00
Peter Boyle
7c246606c1
Schur additional case
2024-07-10 22:04:32 +00:00
Peter Boyle
12b8be7cb9
Best so far on 96^3 350 Evecs converged on 4^4 block
2024-06-18 16:31:37 -04:00
Peter Boyle
dc80b08969
96^3 test
2024-06-10 15:07:29 -04:00
Peter Boyle
0e607a55e7
Updated for 8^4 test
2024-05-26 20:53:05 +00:00
Peter Boyle
ad14a82742
Working as well as possible on 48^3 in double
2024-05-16 10:55:45 -04:00
Peter Boyle
98cf247f33
prepare to switch to mixed precision
2024-04-30 05:23:45 -04:00
Peter Boyle
0cf16522d1
Refine with HDCG choice
2024-04-30 05:22:14 -04:00
Peter Boyle
5147a42818
Updated hdcg
2024-04-05 01:05:57 -04:00
Peter Boyle
5b79d51c22
Improvements
2024-04-01 14:18:40 -04:00
Peter Boyle
cc04dc42dc
Merge branch 'develop' into feature/scidac-wp1
2024-03-06 14:55:21 -05:00
Peter Boyle
070b61f08f
Simplifying the MultiRHS solver to make it do SRHS *and* MRHS
2024-03-06 14:04:33 -05:00
Peter Boyle
cd15abe9d1
Mrhs prep
2024-02-27 11:41:13 -05:00
Peter Boyle
eb702f581b
Running on 12 rhs on 18 nodes of frontier
2024-01-22 17:44:15 -05:00
Peter Boyle
d967eb53de
Working for first time
2024-01-17 16:31:12 -05:00
Peter Boyle
25f71913b7
MultiRHS coarse
2024-01-04 12:01:17 -05:00
Peter Boyle
d5fd90b2f3
Add 48^3 rtest
2024-01-04 12:00:01 -05:00
Peter Boyle
22c611bd1a
Delete temp file
2023-12-21 18:32:31 -05:00
Peter Boyle
c9bb1bf8ea
Passing new BLAS based
2023-12-21 18:31:17 -05:00
Peter Boyle
9e489887cf
General coarse multiRHS move to BLAS implementation
2023-12-21 15:24:48 -05:00
Peter Boyle
abcd6b8cb6
Faster version
2023-12-19 15:17:46 -05:00
Peter Boyle
6835a7f208
Better logging, test on 81 point stencil
2023-11-29 19:20:47 -05:00
Peter Boyle
f59993b979
Nbasis
2023-11-29 09:47:36 -05:00
Peter Boyle
e859a199df
Reduce volume to interior for coarse stencil -- worth up to 4x gain
2023-11-28 10:23:16 -05:00
Peter Boyle
0a3682ad0b
MultiRHS work
2023-11-28 07:43:37 -05:00
Peter Boyle
59abaeb5cd
Time stamp
2023-11-24 12:56:45 -05:00
Peter Boyle
b302ad3d49
multiRHS test in place, passes Yay!
2023-11-23 18:20:15 -05:00
Peter Boyle
09946cf1ba
Improved, works on 48^3 moving to multiRHS optimisations
2023-11-15 18:03:05 -05:00
Peter Boyle
9c9c42d0df
Tests on frontier with real speed up: 3.5x on 16^3 at mq=0.01
2023-10-20 19:27:13 -04:00
Peter Boyle
0ae4478cd9
Checkpoint the subspace and ldop
2023-10-20 19:27:13 -04:00
Peter Boyle
ae4e705e09
Use random vec as easier for debug
2023-10-20 19:27:13 -04:00