Added storage of final true residual in mixed-prec CG and enhanced log output
Fixed const correctness of multi-shift constructor
Added a mixed precision variant of the multi-shift algorithm that uses a single precision operator and applies periodic reliable updates to the residual
Added tests/solver/Test_dwf_multishift_mixedprec to test the above
Fixed local coherence Lanczos using the (large!) max approx to the Chebyshev eval as the scale from which to judge the quality of convergence, which resulted in a test that always passed
Added a method to local coherence lanczos class that returns the fine eval/evec pair
Added iterative log output to power method
Added optional disabling of the plaquette check in Nerscio to support loading old G-parity configs which have a factor of 2 error in the plaquette
G-parity Dirac op no longer allows GPBC in the time direction; instead we toggle between periodic and antiperiodic
Replaced thread_for G-parity 5D force insertion implementation with accelerator_for version capable of running on GPUs
Generalized tests/lanczos/Test_dwf_lanczos to support regular DWF as well as G-parity, with the action chosen by a command line option
Modified tests/forces/Test_dwf_gpforce,Test_gpdwf_force,Test_gpwilson_force to use GPBC in a spatial direction rather than the t-direction, and antiperiodic BCs for the time direction
tests/core/Test_gparity now supports APBC in the time direction via a command line toggle
Audited the code conventions (again) with the CPS momentum denominator
and added antiperiodic in time to Test_mobius_force.cc and
tested Test_dwf_gpforce.
Promoted these to test the full HMC Hamiltonian, tr P^2/2 + phidag MdagM phi
with the same pdot and Udot as audited in the Integrator.h etc...
With full comments and sources for factors.
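
A minimal sketch of why the old local coherence Lanczos convergence test always passed, assuming a check of the form "residual norm <= tolerance * scale". The helper name `residual_converged` and the numbers in the usage note are hypothetical and are not taken from Grid; the scale actually adopted by the fix is not stated in this message.

```cpp
#include <cmath>

// Hypothetical helper, not part of Grid: accept an eigenpair when the
// residual norm |A v - lambda v| is small relative to a chosen scale.
// If `scale` is far larger than the eigenvalues of interest, the bound
// tol * |scale| becomes so loose that almost any residual is accepted.
bool residual_converged(double residual_norm, double scale, double tol) {
  return residual_norm <= tol * std::fabs(scale);
}
```

For illustration, with a residual of 1e-3 and a tolerance of 1e-8, a scale of 1e6 accepts the pair, while a scale of 0.01 (comparable to a small eigenvalue) rejects it.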
* develop: (26 commits)
Added the ability to apply a custom "filter" to the conjugate momentum in the Integrator classes, applied both after refresh and after applying the forces. Added a conjugate momentum "filter" that applies a phase to each site. With sites set to 1.0 or 0.0 this acts as a mask and enables, for example, the freezing of inactive gauge links in DDHMC. Added tests/forces/Test_momentum_filter demonstrating the use of the filter to freeze boundary links
Correct misleading ac help string
Enable performance counting in WilsonFermion like in others
changed back A2AUtils warning
changed if and accelerator_for - no runtime errors any more
Mac OS (Darwin) sed -i flag for in-place editing differs from POSIX / GNU
Seems the intention with AutoConf produced Grid/Config.h was to use sed to translate standard PACKAGE_ #defines into GRID_ however due to missing '' after -i this hasn't been working. Perhaps it is too late to fix this, since we don't know who/what is relying on this downstream? ... but if they are, and AutoConf is being used, then likely these #defines have just been redefined anyway. Seems reasonable to redefine PACKAGE and VERSION as well, as none of these macros are used throughout Grid or Hadrons.
Fixed compile issues with maxLocalNorm2 for non-scalar lattices. maxLocalNorm2 test now reuses the random field
MADWF 5d source option for hadrons - look at Grid of source. Abort on GPU error
maxLocalNorm2()
change back benchmark_ITT
prettify
Flop count matches DiRAC-ITT-2020
revert changes
merge develop
fixes
weird bug in 2pt function...
revert changes
final version, tested on CPU and GPU
bugfix
...
Added a conjugate momentum "filter" that applies a phase to each site. With sites set to 1.0 or 0.0 this acts as a mask and enables, for example, the freezing of inactive gauge links in DDHMC
Added tests/forces/Test_momentum_filter demonstrating the use of the filter to freeze boundary links
Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximation with optional restriction to even/odd polynomials
Added implementation of computation of ZMobius parameters
Added Test_zMADWF_prec to test ZMobius in MADWF
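
The conjugate momentum filter described above can be pictured as a site-by-site multiplication. The sketch below is an illustration only: it assumes a flat array with one complex momentum value per site, and the names `Complex` and `apply_momentum_filter` are hypothetical, not Grid's Integrator or filter API.

```cpp
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical stand-in for a momentum field: one complex value per site.
// Multiplies each site's momentum by the matching filter entry, in place.
// A filter entry of 1.0 leaves the site unchanged and 0.0 zeroes it, so a
// 0/1 filter acts as a mask that freezes the corresponding links.
void apply_momentum_filter(std::vector<Complex>& momentum,
                           const std::vector<Complex>& filter) {
  if (momentum.size() != filter.size()) {
    throw std::invalid_argument("momentum and filter must have equal size");
  }
  for (std::size_t site = 0; site < momentum.size(); ++site) {
    momentum[site] *= filter[site];
  }
}
```

Applying the same filter after the momentum refresh and again after each force update keeps masked sites at zero momentum throughout a trajectory, which matches the two application points mentioned above.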
* develop: (27 commits)
Update README.md
result layout standardised, iterator size more elegant
updated syntax in Test_hadrons_spectrum
chroma-regression test now prints difference correctly
baryon input strings are now pairs of pairs of gammas - still ugly!!
second update to pull request
Changing back interface for Gamma3pt
Removing old debug code
Changes to A2AUtils
suggested changes for 1st pull request implemented
changed input parameters for easier use
Should compile everywhere now
changed baryon interface
added author information
ready for pull request
code compiling now - still need to test
Baryons module works in 1 of 3 cases - still need SlicedProp and Msource part!!
thread_for caused the problems - slow for loop for now
still bugfix
weird bug...
...
# Conflicts:
# Hadrons/Modules.hpp
# Hadrons/modules.inc
This compiles and looks right ... but may need some testing
* develop: (762 commits)
Tensor ambiguous fix
Fix for GCC preprocessor/pragma handling bug
Trips up NVCC for reasons I don't understand on Summit
Fix GCC complaint
Zero() change
Force a couple of things to compile on NVCC
Remove debug code
nvcc error suppress
Merge develop
Reduction finished and hopefully fixes CI regression fail on single precision and force
Double precision variants for summation accuracy
Update todo list
Freeze the seed
Fix compiling of MSource::Gauss for single precision
Think the reduction is now sorted and cleaned up
Fix force term
Printing improvement
GPU reduction fix and also exit backtrace option
GPU friendly
Simplify the comms benchmark
...
# Conflicts:
# Grid/communicator/SharedMemoryMPI.cc
# Grid/qcd/action/fermion/WilsonKernelsAsm.cc
# Grid/qcd/action/fermion/implementation/StaggeredKernelsAsm.h
# Grid/qcd/smearing/StoutSmearing.h
# Hadrons/Modules.hpp
# Hadrons/Utilities/Contractor.cc
# Hadrons/modules.inc
# tests/forces/Test_dwf_force_eofa.cc
# tests/forces/Test_dwf_gpforce_eofa.cc