********************************************************************
TODO:
********************************************************************

i) Got mixed precision in 2f and EOFA force and action solves.
   But need mixed precision in the heatbath solve. Best for Fermop to have a "clone" method, to
   reduce the number of solver and action objects. Needed ideally for the EOFA heatbath.
   15% perhaps
   Combine with 2x trajectory length?

ii) Rational on EOFA HB -- relax order
                        -- Test the approx as per David email

Resume / roll.sh

----------------------------------------------------------------

- 16^3 Currently 10 traj per hour

- EOFA use a different derivative solver from action solver
- EOFA fix David's hack to the SchurRedBlack guessing

*** Reduce precision/tolerance in EOFA with second CG param.          (10% speed up)
*** Force gradient - reduced precision solve for the gradient         (4/3x speedup)

*** Need a plan for gauge field update for mixed precision in HMC     (2x speed up)
    -- Store the single prec action operator.
    -- Clone the gauge field from the operator function argument.
    -- Build the mixed precision operator dynamically from the passed operator and single prec clone.

*** Mixed precision CG into EOFA portion
*** Further reduce precision in forces to 10^-6 ?

*** Overall: a 3x or so is still possible => 500s -> 160s and 20 traj per hour on 16^3.

- Use mixed precision CG in HMC
- SchurRedBlack.h: stop use of operator function; use LinearOperator or similar instead.
- Or make an OperatorFunction for mixed precision as a wrapper

********************************************************************
* Signed off 2+1f HMC with Hasenbusch and strange RHMC 16^3 x 32 DWF Ls=16 Plaquette 0.5883 ish
* Signed off 2+1f HMC with Hasenbusch and strange EOFA 16^3 x 32 DWF Ls=16 Plaquette 0.5883 ish
* Wilson plaquette cross checked against CPS and literature GwilsonFnone
********************************************************************

********************************************************************
* RHMC: Timesteps & eigenranges matched from previous CPS 16^3 x 32 runs:
********************************************************************

****
Strange (m=0.04) has eigenspan
****
16^3 done as 1+1+1 with separate PV's.
/dirac1/archive/QCDOC/host/QCDDWF/DWF/2+1f/16nt32/IWASAKI/b2.13/ls16/M1_8/ms0.04/mu0.01/rhmc_multitimescale/evol5/work
****
2+1f 16^3 - [ 4e-4, 2.42 ] for strange

****
24^3 done as 1+1+1 at strange, and single quotient https://arxiv.org/pdf/0804.0473.pdf Eq 83,
****
double lambda_low = 4.0000000000000002e-04   <- strange
double lambda_low = 1.0000000000000000e-02   <- Pauli-Villars
And high = 2.5

Array bsn_mass[3] = {
double bsn_mass[0] = 1.0000000000000000e+00
double bsn_mass[1] = 1.0000000000000000e+00
double bsn_mass[2] = 1.0000000000000000e+00
}
Array frm_mass[3] = {
double frm_mass[0] = 4.0000000000000001e-02
double frm_mass[1] = 4.0000000000000001e-02
double frm_mass[2] = 4.0000000000000001e-02
}

***
32^3
/dirac1/archive/QCDOC/host/QCDDWF/DWF/2+1f/32nt64/IWASAKI/b2.25/ls16/M1_8/ms0.03/mu0.004/evol6/work
***
Similar det scheme
double lambda_low = 4.0000000000000002e-04
double lambda_low = 1.0000000000000000e-02

Array bsn_mass[3] = {
double bsn_mass[0] = 1.0000000000000000e+00
double bsn_mass[1] = 1.0000000000000000e+00
double bsn_mass[2] = 1.0000000000000000e+00
}
Array frm_mass[3] = {
double frm_mass[0] = 3.0000000000000002e-02
double frm_mass[1] = 3.0000000000000002e-02
double frm_mass[2] = 3.0000000000000002e-02
}

********************************************************************
* Grid: Power method bounds check
********************************************************************
- Finding largest eigenvalue approx 25 not 2.5
- Conventions:

  Grid MpcDagMpc based on:

     (Moo-Moe Mee^-1 Meo)^dag(Moo-Moe Mee^-1 Meo)

  - with Moo = 5-M5 = 3.2
  - CPS use(d) Moo = 1

- Eigenrange in Grid is 3.2^2 rescaled so factor of 10 accounted for
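
The factor of 10 can be checked with a small stand-alone sketch. This is a toy illustration only, not Grid or CPS code: the three-entry diagonal "spectrum", the helper names (power_method, cps_op, grid_op) and the value M5 = 1.8 (inferred from Moo = 5-M5 = 3.2) are assumptions made for the example. It runs a plain power iteration on an operator in the Moo = 1 normalisation and on the same operator scaled by Moo^2 = 3.2^2, as would follow if the Grid preconditioned operator differs from the CPS one by an overall factor of Moo.

```python
import math

def power_method(apply_op, n, iters=200):
    """Estimate the largest eigenvalue of a symmetric positive operator."""
    v = [1.0 / math.sqrt(n)] * n                  # uniform unit-norm start vector
    lam = 0.0
    for _ in range(iters):
        w = apply_op(v)
        lam = sum(a * b for a, b in zip(v, w))    # Rayleigh quotient <v|A|v>
        norm = math.sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]                 # renormalise for the next step
    return lam

# Hypothetical toy spectrum in the CPS normalisation (Moo = 1),
# spanning the quoted range [4e-4, 2.5].
CPS_SPECTRUM = [4.0e-4, 1.0, 2.5]

M5 = 1.8            # assumption: inferred from Moo = 5 - M5 = 3.2
MOO = 5.0 - M5
SCALE = MOO ** 2    # MpcDagMpc is quadratic in Mpc, so it scales as Moo^2

def cps_op(v):
    # Diagonal stand-in for MpcDagMpc with Moo = 1
    return [lam * x for lam, x in zip(CPS_SPECTRUM, v)]

def grid_op(v):
    # Same operator with the Moo = 3.2 normalisation
    return [SCALE * y for y in cps_op(v)]

lam_cps = power_method(cps_op, len(CPS_SPECTRUM))
lam_grid = power_method(grid_op, len(CPS_SPECTRUM))

print(f"Moo=1   top eigenvalue: {lam_cps:.4f}")
print(f"Moo=3.2 top eigenvalue: {lam_grid:.4f}")
print(f"ratio                 : {lam_grid / lam_cps:.4f}")
print(f"rescaled lower bound  : {4.0e-4 * SCALE:.4e}")
```

Under these assumptions the upper bound 2.5 maps to roughly 25.6, consistent with the "approx 25" seen by the power method, and the strange lower bound 4e-4 maps to roughly 4.1e-3. Both ends of a CPS eigenrange would need the same 3.2^2 factor before being reused for a rational approximation in Grid.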