1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-10 07:55:35 +00:00
Commit Graph

6452 Commits

Author SHA1 Message Date
Christopher Kelly
351eab02ae Comment fix 2021-03-22 14:39:17 -04:00
Christopher Kelly
feee5ccde2 Added Gparity flavour Pauli matrix algebra and associated tensor types mirroring strategy used for Gamma matrices
Added test program for the above
2021-03-03 15:39:41 -05:00
Christopher Kelly
e0f6a146d8 To DWF+I G-parity evolution code, added ability to specify number of MD steps in params and an optional usage mode that reads the config and checks the plaq/checksum agree then exits 2021-02-16 10:41:52 -05:00
Christopher Kelly
daa095c519 Fixed an obscure but reproducible hang in the RHMC caused by the bounds check being activated by a random number that wasn't synchronized over the nodes
HMC now also reports the "L-infinity norm" of the impulse, aka the largest site norm
2021-02-09 12:55:46 -05:00
Christopher Kelly
c2676853ca Merge branch 'bugfix/maxnorm2' into feature/gparity_HMC 2021-02-08 12:17:33 -05:00
Christopher Kelly
55de69a569 Fixed compile issues with maxLocalNorm2 for non-scalar lattices
maxLocalNorm2 test now reuses the random field
2021-02-08 12:03:16 -05:00
Peter Boyle
eda9ab487b MADWF 5d source option for hadrons - look at Grid of source
Abort on GPU error
2021-02-08 10:47:22 -05:00
Christopher Kelly
6a824033f8 Merge branch 'develop' into feature/gparity_HMC 2021-02-08 09:31:49 -05:00
Christopher Kelly
cee6a37639 Added a logging tag for HMC
As the integrator logger is active by default the cmdline option to activate had no effect. Changed option to *de*activate on request ("NoIntegrator")
Cleaned up generating rational approxs in the general RHMC code
As the tolerance of the rational approx is not related to the CG tolerance, regenerating approxs for MD and MC if they differ only by the CG tolerance is not necessary; this has been fixed
In DWF+I Gparity evolution code, added cmdline options to check the rational approximations and compute the lowest/highest eigenvalues of M^dagM for RHMC tuning
In the above, changed the integrator layout to a much simpler one that completes much faster; may need additional tuning
2021-02-08 09:30:35 -05:00
Peter Boyle
cd99edcc5f maxLocalNorm2() 2021-02-04 18:25:49 -05:00
Christopher Kelly
6cc3ad110c Improved logging output for RHMC bounds checks
In GenericHMCRunner, exposed functionality for initializing gauge fields and RNG for external use
2021-01-29 12:35:00 -05:00
Christopher Kelly
e6c6f82c52 Gparity DWF+I HMC main program now has option to specify parameter file 2021-01-27 11:18:41 -05:00
Christopher Kelly
d10d0c4e7f Merge branch 'develop' into feature/gparity_HMC 2021-01-25 15:13:29 -05:00
Christopher Kelly
9c106d625a Added HMC main program designed to reproduce the 16^3x32x16 DWF+I ensembles with beta=2.13 and Gparity BCs 2021-01-25 15:07:44 -05:00
Christopher Kelly
6795bbca31 Generalized GeneralEvenOddRatioRationalPseudoFermionAction such that the multi-shift CG algorithm can be overridden by derived classes
Added a mixed-precision variant of GeneralEvenOddRatioRationalPseudoFermionAction and a verification test against double prec class
Fixed non-const reference used in passing RHMC approx to multishift classes
2021-01-25 14:22:31 -05:00
Peter Boyle
69f1f04f74 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2021-01-21 21:39:59 -05:00
Peter Boyle
11a5fd09d6 Hot config 2021-01-21 21:39:41 -05:00
Peter Boyle
ff1fa98808 Fix for GPU conserveed current 2021-01-21 21:38:23 -05:00
Christopher Kelly
d161c2dc35 Improved formating of timing output in mixed-prec multishift
In test of mixed-prec multishift, added comparison against full double precision multishift both for timing and to cross-check the results
2021-01-20 15:42:06 -05:00
Christopher Kelly
7a06826cf1 Added option to NerscIO to disable exit on failing plaquette check allowing for circumvention of factor of 2 error in CPS-generated G-parity config headers
Adapted mixed-prec multi-shift test to new way to pass gauge BC directions and added cmdline option to perform the G-parity plaquette comparison with the corrected plaquette when loading config
2021-01-20 13:31:50 -05:00
Christopher Kelly
c3712b8e06 Merge branch 'develop' into feature/gparity_HMC 2021-01-20 11:48:52 -05:00
Christopher Kelly
901ee77b84 Mixed precision multishift test can now be performed with/without G-parity using cmdline check and can load a pregenerated configuration 2021-01-20 11:45:44 -05:00
Peter Boyle
b0339bc5a4 Merge branch 'feature/conjugate-bc-dirs' into develop 2021-01-15 09:28:39 -05:00
Peter Boyle
3c23a947cc Fixed test for very much non-unit det 2021-01-15 09:16:02 -05:00
Peter Boyle
56111bb823 Merge branch 'develop' into feature/conjugate-bc-dirs 2021-01-14 21:01:22 -05:00
Peter Boyle
99445673f6 Gparity fix, and plaquette IO 2021-01-14 21:00:36 -05:00
Peter Boyle
97a59643f7 Red black coarse space 2021-01-14 20:49:13 -05:00
Peter Boyle
579595f547 Red black on coarse space 2021-01-14 20:48:35 -05:00
Peter Boyle
281ac5fc12 Red black support on coars 2021-01-14 20:48:08 -05:00
Peter Boyle
d8fa903b02 G5 on coarse spaces 2021-01-14 20:47:28 -05:00
Peter Boyle
eaff0f3aeb Gamma5 on coaree spaces 2021-01-14 20:46:58 -05:00
Peter Boyle
e8e20c01b2 Coarsened vector test 2021-01-14 20:46:21 -05:00
Peter Boyle
a4afc3ea2a Red black coarse space 2021-01-14 20:44:16 -05:00
Christopher Kelly
1b84f59273 Added a mixed precision multishift algorithm for which the matrix multiplies are performed in single precision but the search directions are accumulated in double precision.
A reliable update step is performed at a tunable frequency to correct the residual. A final mixed-prec single-shift solve is performed on each pole to perform cleanup if necessary.
A test is provided to demonstrate the algorithm.
2021-01-06 12:24:44 -05:00
Christopher Kelly
1fb41a4300 Added copyLane function to Tensor_extract_merge.h which copies one lane of data from an input tensor object to a different lane of an output tensor object of potentially different precision
precisionChange lattice function now uses copyLane to remove need for temporary scalar objects, reducing register footprint and significantly improving performance
2021-01-06 11:50:56 -05:00
Christopher Kelly
287bac946f ConjugateGradientMixedPrec now stores final true residual and uses the precisionChange workspaces for improved efficiency 2021-01-06 09:50:41 -05:00
Christopher Kelly
80c14be65e Added core test to check precision change 2021-01-06 09:34:44 -05:00
Christopher Kelly
d7a2a4852d Reimplemented precisionChange to run on GPUs. A workspace containing the mapping table can be optionally precomputed and reused for improved performance. 2021-01-06 09:30:49 -05:00
Christopher Kelly
d185f2eaa7 OneFlavourEvenOddRatioRationalPseudoFermionAction now derives from GeneralEvenOddRatioRationalPseudoFermionAction, simply performs transcription of parameters 2020-12-23 16:26:10 -05:00
Christopher Kelly
813d4cd900 Added test program that ensures the generic checkerboarded RHMC (with parameters set appropriately) gives the same answer as the existing 1f code 2020-12-23 16:01:42 -05:00
Christopher Kelly
75c6c6b173 General RHMC pseudofermion action now allows for different rational approximations to be used in the MD and action evaluation 2020-12-23 11:19:26 -05:00
Christopher Kelly
220ad5e3ee Added more verbose log output to GeneralEvenOddRatioRationalPseudoFermionAction
In GeneralEvenOddRatioRationalPseudoFermionAction, setting the bounds check frequency to 0 now disables the check
2020-12-22 11:08:22 -05:00
Christopher Kelly
ba5dc670a5 Reimplemented GparityWilsonImpl::InsertForce5D to run efficiently on GPUs
Swapped order of templated tensor code and c-number specializations in Tensor_outer.h to fix compile issue with type deduction on Summit
2020-12-22 10:10:07 -05:00
Peter Boyle
3fe75bc7cb
Merge pull request #329 from nmeyer-ur/feature/a64fx-3
Revised dslash/dwf kernels for A64FX
2020-12-20 08:17:15 -05:00
Nils Meyer
45d49d8648 clean up 2020-12-19 03:35:18 +01:00
Nils Meyer
6013183361 removed Asm impls 2020-12-19 03:25:01 +01:00
Nils Meyer
4b882e8056 fixed lost bracket 2020-12-19 03:09:20 +01:00
Nils Meyer
3f9ae6e7e7 Merge branch 'develop' into feature/a64fx-3 2020-12-19 02:37:11 +01:00
Nils Meyer
909acd55cd vnum variant for prefetches 2020-12-19 02:00:22 +01:00
Nils Meyer
4dd9e39e0d up to +36% performance gain for dslash/dwf on QPACE 4 using GCC 10.1.1 2020-12-19 00:54:31 +01:00