Grid/HMC
Christopher Kelly 33e4a0caee (2022-07-01) Imported changes from feature/gparity_HMC branch:

Rework of the WilsonFlow class:
- Fixed a logic error in the smear method where the step index was initialized to 1 rather than 0, so the logged output value of tau was too large by epsilon.
- Previously smear_adaptive maintained the current value of tau as a class member variable whereas smear computed it separately; now both methods maintain the current value internally, and it is updated by the evolve_step routines. Both evolve methods are now const.
- smear_adaptive now also maintains the current value of epsilon internally, allowing it to be a const method and allowing the same class instance to be reused without needing to be reset.
- Replaced the fixed evaluation of the plaquette energy density and plaquette topological charge during the smearing with a flexible general strategy in which the user can add arbitrary measurements as function objects, evaluated at an arbitrary frequency (see the sketch below).
- By default the same plaquette-based measurements are performed, but additional example functions are provided where the smearing is run with different choices of measurement, returned as an array for further processing.
- Added a method to compute the energy density using the cloverleaf approach, which has smaller discretization errors.

Other changes:
- Added a new tensor utility operation, copyLane, which copies a single SIMD lane between two instances of the same tensor type but with potentially different precisions.
- To LocalCoherenceLanczos, added the option to compute the highest/lowest eigenvalue of the fine operator on every restart, to aid in tuning the Chebyshev.
- Added Test_field_array_io, which demonstrates and tests a single-file write of an arbitrary array of fields.
- Added Test_evec_compression, which generates eigenvectors using Lanczos and attempts to compress them using the local coherence technique.
- Added Test_compressed_lanczos_gparity, which demonstrates the local coherence Lanczos for G-parity boundary conditions.
- Added HMC main programs for the 40ID and 48ID G-parity lattices.

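The measurement hooks described above suggest usage along the following lines. This is a
minimal sketch only: the WilsonFlow constructor argument order and the addMeasurement
name/signature are assumptions inferred from the changelog, not verified against the class.

// Sketch: registering a user-defined measurement with the reworked WilsonFlow.
#include <Grid/Grid.h>
using namespace Grid;

void flow_with_custom_measurement(LatticeGaugeField &U)
{
  const RealD epsilon       = 0.01;  // flow step
  const int   Nstep         = 100;   // number of flow steps
  const int   meas_interval = 10;    // evaluate hooks every 10 steps

  WilsonFlow<PeriodicGimplR> wflow(epsilon, Nstep, meas_interval); // assumed ctor order

  std::vector<std::pair<RealD,RealD>> plaq_history; // (tau, plaquette) pairs

  // Arbitrary measurement as a function object, per the changelog above.
  wflow.addMeasurement(meas_interval,
    [&plaq_history](int step, RealD t, const LatticeGaugeField &Ut) {
      // const_cast in case this Grid version's avgPlaquette takes a non-const ref
      auto &Umut = const_cast<LatticeGaugeField &>(Ut);
      plaq_history.push_back({ t, WilsonLoops<PeriodicGimplR>::avgPlaquette(Umut) });
    });

  LatticeGaugeField Uflow(U.Grid());
  wflow.smear(Uflow, U); // hooks fire during the flow

  for (auto &p : plaq_history)
    std::cout << "tau=" << p.first << " plaq=" << p.second << std::endl;
}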
File                                Last commit                                                     Date
Makefile.am                         Changes locally                                                 2019-04-17
Mobius2p1f_DD_RHMC.cc               Updates for DDHMC                                               2022-04-05
Mobius2p1f.cc                       Remove namespace QCD from directory HMC                         2019-08-20
Mobius2p1fEOFA_F1.cc                Undo (most) whitespace changes in HMC/Mobius2p1fEOFA{,_F1}.cc   2019-10-02
Mobius2p1fEOFA.cc                   Undo (most) whitespace changes in HMC/Mobius2p1fEOFA{,_F1}.cc   2019-10-02
Mobius2p1fIDSDRGparityEOFA_40ID.cc  Imported changes from feature/gparity_HMC branch                2022-07-01
Mobius2p1fIDSDRGparityEOFA_48ID.cc  Imported changes from feature/gparity_HMC branch                2022-07-01
Mobius2p1fRHMC.cc                   Cold start                                                      2021-03-18
README                              F1 ensemble running with ~96% acceptance etc.                   2019-05-22

********************************************************************
TODO: 
********************************************************************

i) We have mixed precision in the 2f and EOFA force and action solves,
   but still need mixed precision in the heatbath solve. Best would be for the fermion
   operator to have a "clone" method, to reduce the number of solver and action objects;
   this is needed ideally for the EOFA heatbath (see the sketch below).
   Perhaps a 15% gain.
   Combine with 2x trajectory length?
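One possible shape for the proposed "clone" method, sketched as a free helper rather than
a member function. cloneForGauge is a hypothetical name; a real member implementation
would copy mass, M5 and the grids internally instead of taking them as arguments.

// Sketch: rebuild an operator of the same type and parameters on a new gauge
// field, so heatbath/action/force code can share one prototype operator
// instead of each holding hand-constructed instances.
#include <Grid/Grid.h>
#include <memory>
using namespace Grid;

std::unique_ptr<DomainWallFermionD>
cloneForGauge(LatticeGaugeField &Unew,
              GridCartesian     &FGrid, GridRedBlackCartesian &FrbGrid,
              GridCartesian     &UGrid, GridRedBlackCartesian &UrbGrid,
              RealD mass, RealD M5)
{
  // Only the gauge field (and potentially the precision) differs from the
  // prototype; everything structural is reused.
  return std::make_unique<DomainWallFermionD>(Unew, FGrid, FrbGrid, UGrid, UrbGrid, mass, M5);
}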

ii) Rational approximation in the EOFA heatbath  -- relax the order
                                                 -- test the approximation as per David's
                                                    email (see the sketch below)
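Testing the approximation over the heatbath eigenrange could look like the following,
using Grid's AlgRemez. A sketch under assumptions: the generateApprox/evaluateApprox call
pattern mirrors common usage, and the degree/eigenrange values are placeholders.

// Sketch: measure the worst-case error of a rational approximation to
// x^{1/2} over [lo,hi] before relaxing the order used in the EOFA heatbath.
#include <Grid/Grid.h>
#include <algorithm>
#include <cmath>
using namespace Grid;

void test_rational_order(double lo, double hi, int degree)
{
  AlgRemez remez(lo, hi, 64);                 // working precision (digits)
  remez.generateApprox(degree, degree, 1, 2); // rational approx to x^{1/2}

  double worst = 0.0;
  for (int i = 0; i <= 1000; i++) {           // scan the eigenrange
    double x   = lo + (hi - lo) * i / 1000.0;
    double err = std::abs(remez.evaluateApprox(x) / std::sqrt(x) - 1.0);
    worst = std::max(worst, err);
  }
  std::cout << "degree " << degree << ": max relative error " << worst << std::endl;
}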

Resume / roll.sh 

----------------------------------------------------------------

- 16^3: currently 10 trajectories per hour

- EOFA: use a different derivative solver from the action solver
- EOFA: fix David's hack to the SchurRedBlack guessing

*** Reduce precision/tolerance in EOFA with a second CG parameter (sketch below)       (10% speed up)
*** Force gradient -- reduced-precision solve for the gradient                         (4/3x speedup)
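The second CG parameter could be plumbed in as a separate, looser solver used only for
the force computation, along these lines. How the two solvers attach to the EOFA action
constructor is treated as an assumption here.

// Sketch: tight tolerance for heatbath/action (these feed the accept/reject
// step and must be exact), looser tolerance for the MD force (errors there
// only perturb the trajectory, which the Metropolis step corrects).
#include <Grid/Grid.h>
using namespace Grid;

void make_eofa_solvers()
{
  typedef DomainWallFermionD::FermionField FermionField;

  ConjugateGradient<FermionField> ActionCG(1.0e-10, 30000); // action + heatbath
  ConjugateGradient<FermionField> DerivCG (1.0e-6,  30000); // force only

  // Hypothetical constructor taking separate solvers, e.g.
  //   ExactOneFlavourRatioPseudoFermionAction<WilsonImplD>
  //     EOFA(Lop, Rop, ActionCG, ActionCG, DerivCG, params);
}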


*** Need a plan for the gauge field update for mixed precision in HMC                  (2x speed up)
    -- Store the single-precision action operator.
    -- Clone the gauge field from the operator function argument.
    -- Build the mixed-precision operator dynamically from the passed operator and the
       single-precision clone (see the sketch below).
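A sketch of that plan, under assumptions: ImportGauge and precisionChange are used as the
refresh mechanism, and the MixedPrecisionConjugateGradient constructor order follows its
usual (tol, inner, outer, grid, single-prec op, double-prec op) pattern.

// Sketch: keep a stored single-precision operator, refresh its links from
// the gauge field carried by the double-precision operator argument, and
// build the mixed-precision solver dynamically from the pair.
#include <Grid/Grid.h>
using namespace Grid;

void mixed_prec_solve(DomainWallFermionD &OpD,      // operator passed in by HMC
                      DomainWallFermionF &OpF,      // stored single-prec clone
                      LatticeGaugeFieldD &Ud,       // OpD's current gauge field
                      GridRedBlackCartesian *FrbGridF,
                      LatticeFermionD &src, LatticeFermionD &sol)
{
  // Clone the gauge field to single precision and refresh the stored operator
  LatticeGaugeFieldF Uf(OpF.GaugeGrid());
  precisionChange(Uf, Ud);
  OpF.ImportGauge(Uf);

  // Wrap both operators and solve: inner iterations in single precision,
  // outer defect-correction in double.
  SchurDiagMooeeOperator<DomainWallFermionD,LatticeFermionD> HermOpD(OpD);
  SchurDiagMooeeOperator<DomainWallFermionF,LatticeFermionF> HermOpF(OpF);

  MixedPrecisionConjugateGradient<LatticeFermionD,LatticeFermionF>
    mCG(1.0e-8, 10000, 50, FrbGridF, HermOpF, HermOpD);

  mCG(src, sol); // src/sol live on the odd checkerboard of the 5d rb grid
}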

*** Mixed precision CG into the EOFA portion
*** Further reduce precision in forces to 10^-6?

*** Overall: a 3x speed-up or so is still possible => 500s -> 160s, i.e. 20 trajectories per hour on 16^3.

- Use mixed precision CG in HMC
- SchurRedBlack.h: stop using the operator function; use LinearOperator or similar instead.
- Or make an OperatorFunction wrapper for mixed precision (see the sketch below).
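The wrapper option could look like this: an OperatorFunction that satisfies the existing
SchurRedBlack interface but runs mixed precision internally. The held single-precision
operator and grid are assumptions about how it would be wired up.

// Sketch: OperatorFunction wrapper hiding a mixed-precision CG, so
// SchurRedBlack and friends need no interface change.
#include <Grid/Grid.h>
using namespace Grid;

template<class FieldD, class FieldF>
class MixedPrecCGWrapper : public OperatorFunction<FieldD> {
  LinearOperatorBase<FieldF> &LinopF; // single-precision operator, held internally
  GridBase                   *GridF;  // matching single-precision (rb) grid
  RealD tol;
public:
  MixedPrecCGWrapper(LinearOperatorBase<FieldF> &_LinopF, GridBase *_GridF, RealD _tol)
    : LinopF(_LinopF), GridF(_GridF), tol(_tol) {}

  void operator()(LinearOperatorBase<FieldD> &LinopD,
                  const FieldD &src, FieldD &sol) override {
    // Outer double-precision correction around single-precision inner solves
    MixedPrecisionConjugateGradient<FieldD,FieldF>
      mCG(tol, 10000, 50, GridF, LinopF, LinopD);
    mCG(src, sol);
  }
};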

********************************************************************
* Signed off 2+1f HMC with Hasenbusch and strange RHMC: 16^3 x 32 DWF, Ls=16, plaquette ~0.5883
* Signed off 2+1f HMC with Hasenbusch and strange EOFA: 16^3 x 32 DWF, Ls=16, plaquette ~0.5883
* Wilson plaquette cross-checked against CPS (GwilsonFnone) and the literature
********************************************************************

********************************************************************
* RHMC: Timesteps & eigenranges matched from previous CPS 16^3 x 32 runs:
********************************************************************

****
Strange (m=0.04) eigenspan:
****
16^3 done as 1+1+1 with separate Pauli-Villars fields.
/dirac1/archive/QCDOC/host/QCDDWF/DWF/2+1f/16nt32/IWASAKI/b2.13/ls16/M1_8/ms0.04/mu0.01/rhmc_multitimescale/evol5/work
****
2+1f 16^3 -- eigenrange [ 4e-4, 2.42 ] for strange

****
24^3 done as 1+1+1 for the strange, and a single quotient (https://arxiv.org/pdf/0804.0473.pdf, Eq. 83):
****
double lambda_low =   4.0000000000000002e-04 <- strange
double lambda_low =   1.0000000000000000e-02 <- Pauli-Villars
with lambda_high = 2.5 for both.

Array bsn_mass[3] = { 
double bsn_mass[0] =   1.0000000000000000e+00
double bsn_mass[1] =   1.0000000000000000e+00
double bsn_mass[2] =   1.0000000000000000e+00
}
Array frm_mass[3] = { 
double frm_mass[0] =   4.0000000000000001e-02
double frm_mass[1] =   4.0000000000000001e-02
double frm_mass[2] =   4.0000000000000001e-02
}

***
32^3 
/dirac1/archive/QCDOC/host/QCDDWF/DWF/2+1f/32nt64/IWASAKI/b2.25/ls16/M1_8/ms0.03/mu0.004/evol6/work
***
Similar determinant scheme (a Grid parameter sketch follows the mass arrays below):
double lambda_low =   4.0000000000000002e-04
double lambda_low =   1.0000000000000000e-02

Array bsn_mass[3] = { 
double bsn_mass[0] =   1.0000000000000000e+00
double bsn_mass[1] =   1.0000000000000000e+00
double bsn_mass[2] =   1.0000000000000000e+00
}
Array frm_mass[3] = { 
double frm_mass[0] =   3.0000000000000002e-02
double frm_mass[1] =   3.0000000000000002e-02
double frm_mass[2] =   3.0000000000000002e-02
}

********************************************************************
* Grid: Power method bounds check
********************************************************************
- The power method finds the largest eigenvalue to be approximately 25, not 2.5.
- Conventions:

Grid MpcDagMpc is based on:

   (M_oo - M_oe M_ee^{-1} M_eo)^dag (M_oo - M_oe M_ee^{-1} M_eo)

- with M_oo = 5 - M5 = 5 - 1.8 = 3.2
- CPS use(d) M_oo = 1
- The eigenrange in Grid is therefore rescaled by M_oo^2 = 3.2^2 = 10.24, so the factor
  of 10 is accounted for: 2.5 x 10.24 ~ 25.6, matching the ~25 observed.
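
A sketch of the bounds check itself, using Grid's PowerMethod utility. The
SchurDiagMooee wrapping and red-black setup are assumptions about how the check would
be run, not a copy of existing test code.

// Sketch: estimate the largest eigenvalue of MpcDagMpc on the odd
// checkerboard; expect ~25 in Grid conventions, not the CPS 2.5.
#include <Grid/Grid.h>
using namespace Grid;

RealD check_upper_bound(DomainWallFermionD &Ddwf, GridRedBlackCartesian *FrbGrid)
{
  SchurDiagMooeeOperator<DomainWallFermionD,LatticeFermionD> HermOp(Ddwf);

  LatticeFermionD src(FrbGrid);
  src.Checkerboard() = Odd;
  GridParallelRNG RNG(FrbGrid);
  RNG.SeedFixedIntegers({1,2,3,4});
  gaussian(RNG, src);

  PowerMethod<LatticeFermionD> power;
  return power(HermOp, src); // largest eigenvalue estimate, ~2.5 * 3.2^2
}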