Christopher Kelly
351eab02ae
Comment fix
2021-03-22 14:39:17 -04:00
Christopher Kelly
feee5ccde2
Added Gparity flavour Pauli matrix algebra and associated tensor types mirroring strategy used for Gamma matrices
Added test program for the above
2021-03-03 15:39:41 -05:00
Christopher Kelly
e0f6a146d8
In the DWF+I G-parity evolution code, added the ability to specify the number of MD steps in the params, plus an optional usage mode that reads the config, checks that the plaquette and checksum agree, then exits
2021-02-16 10:41:52 -05:00
Christopher Kelly
daa095c519
Fixed an obscure but reproducible hang in the RHMC caused by the bounds check being activated by a random number that wasn't synchronized over the nodes
HMC now also reports the "L-infinity norm" of the impulse, aka the largest site norm
2021-02-09 12:55:46 -05:00
Christopher Kelly
c2676853ca
Merge branch 'bugfix/maxnorm2' into feature/gparity_HMC
2021-02-08 12:17:33 -05:00
Christopher Kelly
55de69a569
Fixed compile issues with maxLocalNorm2 for non-scalar lattices
maxLocalNorm2 test now reuses the random field
2021-02-08 12:03:16 -05:00
Peter Boyle
eda9ab487b
MADWF 5d source option for hadrons - look at Grid of source
Abort on GPU error
2021-02-08 10:47:22 -05:00
Christopher Kelly
6a824033f8
Merge branch 'develop' into feature/gparity_HMC
2021-02-08 09:31:49 -05:00
Christopher Kelly
cee6a37639
Added a logging tag for HMC
As the integrator logger is active by default, the cmdline option to activate it had no effect. Changed the option to *de*activate it on request ("NoIntegrator")
Cleaned up generating rational approxs in the general RHMC code
As the tolerance of the rational approx is not related to the CG tolerance, regenerating approxs for MD and MC if they differ only by the CG tolerance is not necessary; this has been fixed
In DWF+I Gparity evolution code, added cmdline options to check the rational approximations and compute the lowest/highest eigenvalues of M^dagM for RHMC tuning
In the above, changed the integrator layout to a much simpler one that completes much faster; may need additional tuning
2021-02-08 09:30:35 -05:00
Peter Boyle
cd99edcc5f
maxLocalNorm2()
2021-02-04 18:25:49 -05:00
Christopher Kelly
6cc3ad110c
Improved logging output for RHMC bounds checks
In GenericHMCRunner, exposed functionality for initializing gauge fields and RNG for external use
2021-01-29 12:35:00 -05:00
Christopher Kelly
e6c6f82c52
Gparity DWF+I HMC main program now has option to specify parameter file
2021-01-27 11:18:41 -05:00
Christopher Kelly
d10d0c4e7f
Merge branch 'develop' into feature/gparity_HMC
2021-01-25 15:13:29 -05:00
Christopher Kelly
9c106d625a
Added HMC main program designed to reproduce the 16^3x32x16 DWF+I ensembles with beta=2.13 and Gparity BCs
2021-01-25 15:07:44 -05:00
Christopher Kelly
6795bbca31
Generalized GeneralEvenOddRatioRationalPseudoFermionAction such that the multi-shift CG algorithm can be overridden by derived classes
Added a mixed-precision variant of GeneralEvenOddRatioRationalPseudoFermionAction and a verification test against double prec class
Fixed non-const reference used in passing RHMC approx to multishift classes
2021-01-25 14:22:31 -05:00
Peter Boyle
69f1f04f74
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2021-01-21 21:39:59 -05:00
Peter Boyle
11a5fd09d6
Hot config
2021-01-21 21:39:41 -05:00
Peter Boyle
ff1fa98808
Fix for GPU conserved current
2021-01-21 21:38:23 -05:00
Christopher Kelly
d161c2dc35
Improved formatting of timing output in mixed-prec multishift
In test of mixed-prec multishift, added comparison against full double precision multishift both for timing and to cross-check the results
2021-01-20 15:42:06 -05:00
Christopher Kelly
7a06826cf1
Added option to NerscIO to disable exit on a failing plaquette check, allowing circumvention of the factor of 2 error in CPS-generated G-parity config headers
Adapted mixed-prec multi-shift test to new way to pass gauge BC directions and added cmdline option to perform the G-parity plaquette comparison with the corrected plaquette when loading config
2021-01-20 13:31:50 -05:00
Christopher Kelly
c3712b8e06
Merge branch 'develop' into feature/gparity_HMC
2021-01-20 11:48:52 -05:00
Christopher Kelly
901ee77b84
Mixed precision multishift test can now be performed with/without G-parity using cmdline check and can load a pregenerated configuration
2021-01-20 11:45:44 -05:00
Peter Boyle
b0339bc5a4
Merge branch 'feature/conjugate-bc-dirs' into develop
2021-01-15 09:28:39 -05:00
Peter Boyle
3c23a947cc
Fixed test for very much non-unit det
2021-01-15 09:16:02 -05:00
Peter Boyle
56111bb823
Merge branch 'develop' into feature/conjugate-bc-dirs
2021-01-14 21:01:22 -05:00
Peter Boyle
99445673f6
Gparity fix, and plaquette IO
2021-01-14 21:00:36 -05:00
Peter Boyle
97a59643f7
Red black coarse space
2021-01-14 20:49:13 -05:00
Peter Boyle
579595f547
Red black on coarse space
2021-01-14 20:48:35 -05:00
Peter Boyle
281ac5fc12
Red black support on coarse space
2021-01-14 20:48:08 -05:00
Peter Boyle
d8fa903b02
G5 on coarse spaces
2021-01-14 20:47:28 -05:00
Peter Boyle
eaff0f3aeb
Gamma5 on coarse spaces
2021-01-14 20:46:58 -05:00
Peter Boyle
e8e20c01b2
Coarsened vector test
2021-01-14 20:46:21 -05:00
Peter Boyle
a4afc3ea2a
Red black coarse space
2021-01-14 20:44:16 -05:00
Christopher Kelly
1b84f59273
Added a mixed precision multishift algorithm for which the matrix multiplies are performed in single precision but the search directions are accumulated in double precision.
A reliable update step is performed at a tunable frequency to correct the residual. A final mixed-prec single-shift solve is performed on each pole to perform cleanup if necessary.
A test is provided to demonstrate the algorithm.
2021-01-06 12:24:44 -05:00
Christopher Kelly
1fb41a4300
Added copyLane function to Tensor_extract_merge.h which copies one lane of data from an input tensor object to a different lane of an output tensor object of potentially different precision
precisionChange lattice function now uses copyLane to remove need for temporary scalar objects, reducing register footprint and significantly improving performance
2021-01-06 11:50:56 -05:00
Christopher Kelly
287bac946f
ConjugateGradientMixedPrec now stores final true residual and uses the precisionChange workspaces for improved efficiency
2021-01-06 09:50:41 -05:00
Christopher Kelly
80c14be65e
Added core test to check precision change
2021-01-06 09:34:44 -05:00
Christopher Kelly
d7a2a4852d
Reimplemented precisionChange to run on GPUs. A workspace containing the mapping table can be optionally precomputed and reused for improved performance.
2021-01-06 09:30:49 -05:00
Christopher Kelly
d185f2eaa7
OneFlavourEvenOddRatioRationalPseudoFermionAction now derives from GeneralEvenOddRatioRationalPseudoFermionAction and simply transcribes the parameters
2020-12-23 16:26:10 -05:00
Christopher Kelly
813d4cd900
Added test program that ensures the generic checkerboarded RHMC (with parameters set appropriately) gives the same answer as the existing 1f code
2020-12-23 16:01:42 -05:00
Christopher Kelly
75c6c6b173
General RHMC pseudofermion action now allows for different rational approximations to be used in the MD and action evaluation
2020-12-23 11:19:26 -05:00
Christopher Kelly
220ad5e3ee
Added more verbose log output to GeneralEvenOddRatioRationalPseudoFermionAction
In GeneralEvenOddRatioRationalPseudoFermionAction, setting the bounds check frequency to 0 now disables the check
2020-12-22 11:08:22 -05:00
Christopher Kelly
ba5dc670a5
Reimplemented GparityWilsonImpl::InsertForce5D to run efficiently on GPUs
Swapped order of templated tensor code and c-number specializations in Tensor_outer.h to fix compile issue with type deduction on Summit
2020-12-22 10:10:07 -05:00
Peter Boyle
3fe75bc7cb
Merge pull request #329 from nmeyer-ur/feature/a64fx-3
Revised dslash/dwf kernels for A64FX
2020-12-20 08:17:15 -05:00
Nils Meyer
45d49d8648
clean up
2020-12-19 03:35:18 +01:00
Nils Meyer
6013183361
removed Asm impls
2020-12-19 03:25:01 +01:00
Nils Meyer
4b882e8056
fixed lost bracket
2020-12-19 03:09:20 +01:00
Nils Meyer
3f9ae6e7e7
Merge branch 'develop' into feature/a64fx-3
2020-12-19 02:37:11 +01:00
Nils Meyer
909acd55cd
vnum variant for prefetches
2020-12-19 02:00:22 +01:00
Nils Meyer
4dd9e39e0d
up to +36% performance gain for dslash/dwf on QPACE 4 using GCC 10.1.1
2020-12-19 00:54:31 +01:00