Peter Boyle
baa70d8ec9
Test_reduction: add timing benchmark for new vs old reduction paths
...
Reports us/call and GB/s for sum_gpu (CUB/sycl::reduction) and
sum_gpu_old (hand-rolled shared-memory) for each field type, with
5-call warmup and 100-call timed loop.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 12:31:13 -04:00
Peter Boyle
c0472aa0ec
Test_reduction: use separate float and double grids
...
Float fields require a grid constructed with vComplexF::Nsimd(); using
a double grid causes grid->_gsites to undercount the sites in float
vobjF, making the constant-field expected value wrong.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 12:09:35 -04:00
Peter Boyle
09552cfd73
Rename scalarNorm2 to squaredSum in Test_reduction.cc
...
The function computes |sum|^2 — the squared magnitude of an aggregate sum —
not a norm. squaredSum makes clear that squaring is applied to the sum, not
to individual site values before summing, distinguishing it from sumOfSquares
(the squared L2 norm).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 23:15:11 -04:00
Peter Boyle
286c29d6fb
Add Test_reduction to tests/debug
...
Tests the new CUB/hipCUB/SYCL lattice reduction (sum_gpu) against the
preserved hand-rolled implementation (sum_gpu_old) for LatticeComplexF/D,
LatticeColourMatrixF/D and LatticePropagatorF/D.
Part a) Gaussian random field: checks that old and new agree to within
float/double roundoff tolerance.
Part b) constant field (= 1.0, identity-matrix init): verifies
innerProduct(sum, sum) = Ncomp * V^2 where Ncomp counts the nonzero
diagonal scalar components per site (1 / Nc / Ns*Nc respectively).
Make.inc is auto-generated by scripts/filelist on bootstrap and is not
tracked; the new .cc file is all that is needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 14:31:33 -04:00
Peter Boyle
595ceaac37
Include grid header and make the ENABLE correct
2026-03-11 17:24:44 -04:00
Peter Boyle
daf5834e8e
Fixing incorrect PR about disabling fermion instantiations
2026-03-11 17:05:46 -04:00
Peter Boyle
0d8658a039
Optimised
2026-03-05 06:06:32 -05:00
Peter Boyle
76fbcffb60
Improvement to 16^3 hdcg
2026-03-05 06:06:32 -05:00
edbennett
1b56f6f46d
be able to skip compiling fermion instantiations altogether
2026-02-24 23:52:18 +00:00
Peter Boyle
6ff29f9d4f
Alternate multigrids
2026-02-13 17:25:45 -05:00
Peter Boyle
7cd3f21e6b
preserving a bunch of experiments on setup and g5 subspace doubling
2026-01-06 05:57:39 -05:00
Peter Boyle
2e684028de
Improvements
2025-11-14 18:12:27 -05:00
Peter Boyle
fe0db53842
FFT offload to GPU and MUCH faster comms.
...
40x speed up on Frontier
2025-08-21 16:45:38 -04:00
Peter Boyle
76c0ada1e1
Benchmark for En Hung
2025-08-21 16:45:38 -04:00
paboyle
9e6a4a4737
Assertion updates to macros (mostly) with backtrace.
...
Wilson flow to include options for DBW2, Iwasaki, Symanzik.
View logging for data assurance
2025-08-07 15:48:38 +00:00
paboyle
73af020f98
improved
2025-06-27 06:08:54 +00:00
Peter Boyle
3737a24096
Updated python output
2025-06-03 14:09:29 -04:00
Peter Boyle
5364d580c9
Output chirality, eigenvector density files and python source lego plot
2025-05-13 18:44:47 -04:00
Peter Boyle
677b4cc5b0
Make all tests compile
2025-04-24 20:33:26 -04:00
Peter Boyle
ab3de50d5e
Merge pull request #473 from UCL-ARC/gauge_action_deriv
...
WilsonGaugeAction deriv
2025-04-24 14:39:10 -04:00
Chulwoo Jung
a957e7bfa1
Adding DWF evec Chirality measurement
2025-04-22 22:17:51 +00:00
Chulwoo Jung
cee4c8ce8c
Merge branch 'develop' of https://github.com/paboyle/Grid into specflow
2025-04-18 19:55:36 +00:00
Peter Boyle
6fec3c15ca
Cleaner printing
2025-04-04 18:35:06 -04:00
Mashy Green
d41542c64b
reverted sp2n test wilsonfundfermiongauge to original
2025-03-24 08:29:15 +00:00
Mashy Green
0000d2e558
Merge branch 'develop' into gauge_action_deriv
2025-03-10 08:35:57 +00:00
Muhammad Asif
b1ba209696
Latest upstream with np-su3 patch and modified Sp_WilsonFunfFermionGauge test to be small (#22)
...
Co-authored-by: Mashy Green <mashy@me.com>
merging no-su3 patch
2025-02-24 11:38:42 +00:00
Mashy Green
717f647418
added the WilsonFlow patch from upstream PR #471
2025-02-24 08:41:31 +00:00
Peter Boyle
c74d11e3d7
PVdagM MG
2025-02-01 11:04:13 -05:00
paboyle
c4fc972fec
Merge branch 'feature/deprecate-uvm' into develop
2025-01-31 16:32:36 +00:00
Chulwoo Jung
570b72a47b
Bugfix. Sorry!
2025-01-21 15:37:39 -05:00
Chulwoo Jung
a5798a89ed
Merge branch 'develop' into specflow
2025-01-21 12:13:24 -05:00
Peter Boyle
3f3661a86f
Heading towards PVdagM multigrid
2025-01-17 14:33:35 +00:00
Chulwoo Jung
f7e2f9a401
Checking in spectral flow and DWF/Mobius kernel eigenvalue measurement
2025-01-16 20:47:33 +00:00
Chulwoo Jung
2848a9b558
DWF Kernel lanczos working(?)
2025-01-16 01:29:56 +00:00
paboyle
8fe429346f
Dslash testing for reproduce
2024-11-11 23:11:11 +00:00
Peter Boyle
b91fc1b6b4
Merge branch 'feature/boosted' into feature/deprecate-uvm
...
Fixed boosted free field test
2024-10-28 16:53:09 -04:00
Peter Boyle
eafc150034
Test fft asserts
2024-10-23 16:46:26 -04:00
Peter Boyle
1e893af775
GPU happy
2024-10-23 14:52:15 -04:00
Peter Boyle
d9f430a575
Happy GPU
2024-10-23 14:51:16 -04:00
Peter Boyle
5ae77876a8
Meson field and Aslash field on GPU; some compiler warnings removed
2024-10-18 19:08:06 -04:00
paboyle
5cc4f3241d
Meson field test
2024-10-18 15:42:30 +00:00
Peter Boyle
6815e138b4
Boosted fermion attempt
2024-10-17 18:37:33 +01:00
paboyle
03687c1d62
Final version of test, closer to original again
2024-10-15 14:35:17 +00:00
Peter Boyle
2a9cfeb9ea
New files
2024-09-26 14:23:29 -04:00
paboyle
066544281f
Deprecate UVM
2024-09-17 13:34:27 +00:00
paboyle
160969a758
UVM tester, doesn't turn up anything
2024-09-10 18:09:42 +00:00
Peter Boyle
575eb72182
Converges on 16^3
2024-08-27 19:20:38 +00:00
Peter Boyle
29f6b8a74a
Setup
2024-08-27 12:02:49 -04:00
Peter Boyle
9779aaea33
16^3 optimise
2024-08-27 11:38:35 -04:00
Peter Boyle
ec25604a67
Fastest solver for mrhs multigrid
2024-08-27 11:32:34 -04:00