Peter Boyle
b01e67bab1
coalescedReadGeneralPermute now working
2023-10-02 17:46:57 -04:00
Peter Boyle
8a70314f54
Merge branch 'develop' into feature/scidac-wp1
2023-10-02 17:24:55 -04:00
Peter Boyle
afc316f501
Rename headers
2023-10-02 16:25:11 -04:00
Peter Boyle
f14bfd5c1b
Relocate sub includes
2023-10-02 16:23:38 -04:00
Peter Boyle
c5f1420dea
Merge remote-tracking branch 'LupoA/develop' into LupoA-develop
2023-10-02 16:22:35 -04:00
Peter Boyle
018e6da872
Merge pull request #440 from giltirn/feature/paddedcellgauge
...
Feature/paddedcellgauge
2023-10-02 10:00:42 -04:00
Peter Boyle
b77bccfac2
Merge pull request #444 from mmphys/feature/docX
...
Update doc: complete list of MacPorts needed to build Grid on a fresh Mac
2023-10-02 09:57:11 -04:00
Peter Boyle
36ae6e5aba
Fastest GPU version.
...
Need to work on the PaddedCell now to make much faster
2023-09-29 18:26:51 -04:00
Peter Boyle
9db585cfeb
Temporary commit while optimisation is carried out
2023-09-29 17:11:35 -04:00
Peter Boyle
c564611ba7
Annoying hack that is useful to preserve for profiling
2023-09-29 17:11:12 -04:00
Peter Boyle
e187bcb85c
Updating
2023-09-29 17:10:17 -04:00
Peter Boyle
be18ffe3b4
Further tuning and lanczos
2023-09-27 16:21:58 -04:00
Peter Boyle
0d63dce4e2
Timing info
2023-09-27 16:21:14 -04:00
Peter Boyle
26b30e1551
Flop count and projection to nearest neighbour (keeps redundant flops)
2023-09-27 16:20:11 -04:00
Peter Boyle
7fc58ac293
Verbose subspace init
2023-09-27 16:19:45 -04:00
Peter Boyle
3a86cce8c1
Compile
2023-09-27 16:19:18 -04:00
Peter Boyle
80359e0d49
Bland SYCL compile
2023-09-26 13:20:27 -07:00
Peter Boyle
3d437c5cc4
Making SYCL happy
2023-09-26 13:19:42 -07:00
Peter Boyle
37884d369f
Coarse space is expensive, but gives a speed up in fine matrix multiplies now.
...
Down to optimisation
2023-09-25 17:24:19 -04:00
Peter Boyle
9246e653cd
Basic non-local coarsening of operator test
2023-09-25 17:20:58 -04:00
Peter Boyle
64283c8673
Normal equations become a linear function for easy base-class pass-around
2023-09-25 17:19:39 -04:00
Peter Boyle
755002da9c
Comparison convenience
2023-09-25 17:16:33 -04:00
Peter Boyle
31b8e8b437
Better messaging
2023-09-25 17:16:14 -04:00
Peter Boyle
0ec0de97e6
Adef2 implemented and working in an HDCG like context
2023-09-25 17:15:03 -04:00
Peter Boyle
6c3ade5d89
Improved the coarsening
2023-09-25 17:14:40 -04:00
Peter Boyle
980c5f9a34
Update chebyshev setup
2023-09-25 17:12:22 -04:00
Peter Boyle
471ca5f281
Power method more iterations
2023-09-07 10:55:05 -04:00
Peter Boyle
e82ddcff5d
Work getting closer to HDCG, but some low-level engineering work is still needed
...
+ MUCH work on optimisation
2023-09-07 10:53:51 -04:00
Peter Boyle
b9dcad89e8
Test cases for coarsening with non-local stencil
2023-09-07 10:53:22 -04:00
Peter Boyle
993f43ef4a
Even odd use case
2023-09-07 10:53:06 -04:00
Peter Boyle
2b43308208
First cut non-local coarsening
2023-08-25 17:38:07 -04:00
Peter Boyle
04a1ac3a76
First cut for non-local coarsening
2023-08-25 17:37:38 -04:00
Peter Boyle
990b8798bd
Merge remote-tracking branch 'refs/remotes/origin/develop' into develop
2023-08-25 17:36:45 -04:00
Peter Boyle
b334a73a44
Stencil improvement
2023-08-25 17:35:10 -04:00
Peter Boyle
5d113d1c70
Odd address sanitizer complaint
2023-08-25 17:34:18 -04:00
Peter Boyle
c14977aeab
Random vector option for test purposes
2023-08-25 17:33:31 -04:00
Peter Boyle
3e94838204
Spread out improvement
2023-08-25 17:31:28 -04:00
Peter Boyle
c0a0b8ca62
NEON and address sanitiser
2023-08-25 17:30:30 -04:00
Peter Boyle
b8a7004365
Partial fraction test
2023-08-14 15:17:03 -04:00
Michael Marshall
bd56c95a6f
Update documentation with complete list of Macports needed to build Grid on a fresh Mac
2023-07-14 13:50:06 +01:00
Peter Boyle
994512048e
Merge pull request #439 from felixerben/bugfix/IRL_convergence
...
Bugfix/irl convergence
2023-07-12 16:32:26 -04:00
chillenzer
dbd8bb49dc
Merge pull request #32 from LupoA/sp2n/develop
...
Sp2n/develop
2023-07-04 15:23:43 +00:00
Julian Lenz
3a29af0ce4
Fixed linker error
2023-07-04 16:08:44 +01:00
Julian Lenz
f7b79cdd45
Added test for ProjectSpn
2023-07-03 18:00:32 +01:00
Alessandro Lupo
075b9d22d0
adjoint rep implemented as 2indx symmetric
2023-07-02 13:58:31 +01:00
Alessandro Lupo
b92428f05f
better test
2023-07-02 13:34:03 +01:00
Alessandro Lupo
34b11864b6
prettiest tests
2023-07-02 13:25:57 +01:00
Christopher Kelly
1dfaa08afb
The stencils for the staple and rect-staple padded cell implementations are now created and stored by workspace classes that allow for reuse, provided the grids remain consistent
...
The workspaces are now used by the plaq+rectangle gauge action, resulting in a further 2x performance improvement as measured on a 16^4 local volume for 2 nodes (16 ranks) of Crusher
2023-06-28 15:11:24 -04:00
Christopher Kelly
f44dce390f
Implemented accelerator-optimized versions of localCopyRegion and insertSliceLocal to speed up padding
...
Fixed const correctness on PaddedCell methods
Fixed compile issues on Crusher
Added timing breakdowns for PaddedCell::Expand and the padded implementations of the staples, visible under --log Performance
Optimized kernel for StaplePadded
Test_iwasaki_action_newstaple now repeats the calculation 10 times and reports average timings
2023-06-27 14:58:10 -04:00
Christopher Kelly
bb71e9a96a
Added PaddedCell and GeneralisedLocalStencil header includes to standard base headers
...
Moved versions of the padded-cell implementations of staple and rect-staple from test code to WilsonLoops header
Added StapleAndRectStapleAll which is now called by the plaq+rectangle action class. Under the hood it uses the padded cell implementations with maximal reuse of the padded gauge links
2023-06-27 11:23:30 -04:00