Mirror of https://github.com/paboyle/Grid.git (synced 2025-06-18 07:47:06 +01:00)
Simplifying the MultiRHS solver to make it do SRHS *and* MRHS
43
TODO
@@ -1,6 +1,44 @@
- - Slice sum optimisation & A2A - atomic addition
i) Clean up CoarsenedMatrix, GeneralCoarsenedMatrix, GeneralCoarsenedMatrixMultiRHS

-- Ideally want a SINGLE implementation that does MultiRHS **AND** works with one RHS.

-- -- Getting there. One RHS is hard due to vectorisation & hardwired coarse5d layout
-- Compromise: Wrap it in a copy in/out for a slice.

-- Bad for Lanczos: need to do a BLOCK Lanczos instead. Longer term.

-- **** Make the test do ONLY the single RHS. ****
-- I/O for the matrix elements required.
-- Make the Adef2 build an eigenvector deflater and a block projector
--

-- Work with Regensburg on tests.
-- Plan interface preserving the coarsened matrix interface (??)

-- Move functionality from GeneralCoarsenedMatrix INTO GeneralCoarsenedMatrixMultiRHS -- DONE
-- Don't immediately delete original
-- Instead make the new one self contained, then delete.
-- New DWF inverter test.

// void PopulateAdag(void)
void CoarsenOperator(LinearOperatorBase<Lattice<Fobj> > &linop, Aggregation<Fobj,CComplex,nbasis> & Subspace) -- DONE
ExchangeCoarseLinks();

iii) Aurora -- Christoph's problem -- DONE
Aurora -- Carleton's problem staggered.

iv) Dennis merge and test Aurora -- DONE (save test)

v) Merge Ed Bennet's request -- DONE

vi) Repro CG -- get down to the level of single node testing via split grid test


=========================

===============
- - Slice sum optimisation & A2A - atomic addition -- Dennis
- - Also faster non-atomic reduction
- - Remaining PRs
- - DDHMC
- - MixedPrec is the action eval, high precision
- - MixedPrecCleanup is the force eval, low precision
@@ -17,7 +55,6 @@ DDHMC
-- Multishift Mixed Precision - DONE
-- Pole dependent residual - DONE


=======
-- comms threads issue??
-- Part done: Staggered kernel performance on GPU