mirror of
				https://github.com/paboyle/Grid.git
				synced 2025-10-26 01:29:34 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			304 lines
		
	
	
		
			10 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			304 lines
		
	
	
		
			10 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| TODO:
 | |
| ---------------
 | |
| 
 | |
| Peter's work list:
 | |
| 2)- Precision conversion and sort out localConvert      <-- 
 | |
| 3)- Remove DenseVector, DenseMatrix; Use Eigen instead. <-- started 
 | |
| 4)- Binary I/O speed up & x-strips
 | |
| -- Profile CG, BlockCG, etc... Flop count/rate -- PARTIAL, time but no flop/s yet
 | |
| -- Physical propagator interface
 | |
| -- Conserved currents
 | |
| -- GaugeFix into central location
 | |
| -- Multigrid Wilson and DWF, compare to other Multigrid implementations
 | |
| -- HDCR resume
 | |
| 
 | |
| Recent DONE 
 | |
| -- Cut down the exterior overhead                      <-- DONE
 | |
| -- Interior legs from SHM comms                        <-- DONE
 | |
| -- Half-precision comms                                <-- DONE
 | |
| -- Merge high precision reduction into develop        
 | |
| -- multiRHS DWF; benchmark on Cori/BNL for comms elimination
 | |
|    -- slice* linalg routines for multiRHS, BlockCG    
 | |
| 
 | |
| -----
 | |
| * Forces; the UdSdU  term in gauge force term is half of what I think it should
 | |
|   be. This is a consequence of taking ONLY the first term in:
 | |
| 
 | |
|   dSg/dt = dU/dt dSdU + dUdag/dt dSdUdag
 | |
| 
 | |
|   in the fermion force.
 | |
| 
 | |
|   Now, S_mom = - tr Pmu Pmu      ; Pmu anti-herm
 | |
| 
 | |
|                                   .
 | |
|        d Smom/dt = - 2.0 tr Pmu Pmu   = - dSg/dt = - tr Pmu [Umu dSdUmu + UmuDag dSdUmuDag]
 | |
| 
 | |
|            .
 | |
|        => Pmu =  Umu dSdUmu
 | |
| 
 | |
|        Where the norm is half expected.
 | |
| 
 | |
|   This means we must double the force in the Test_xxx_force routines, and is the origin of the factor of two.
 | |
|   This 2x is applied by hand in the fermion routines and in the Test_rect_force routine.
 | |
| 
 | |
| * Support different boundary conditions (finite temp, chem. potential ... )
 | |
| 
 | |
| - Sign of force term.
 | |
| 
 | |
| - Reversibility test.
 | |
| 
 | |
| - Rename "Ta" as too unclear
 | |
| 
 | |
| - Lanczos
 | |
| 
 | |
| - Audit oIndex usage for cb behaviour
 | |
| 
 | |
| - Prepare multigrid for HMC. - Alternate setup schemes.
 | |
| 
 | |
| - Support for ILDG --- ugly, not done
 | |
| 
 | |
| - Flavour matrices?
 | |
| 
 | |
| - FFTnD ?
 | |
| 
 | |
| - Gparity; hand opt use template specialisation elegance to enable the optimised paths ?
 | |
| 
 | |
| - Gparity force term; Gparity (R)HMC.
 | |
| 
 | |
| - Mobius implementation clean up to rmove #if 0 stale code sequences
 | |
| 
 | |
| - CG -- profile carefully, kernel fusion, whole CG performance measurements.
 | |
| 
 | |
| ================================================================
 | |
| * Hacks and bug fixes to clean up and Audits
 | |
| ================================================================
 | |
| 
 | |
| *  Extract/merge/set cleanup ; too many variants; rationalise and call simpler ones
 | |
| 
 | |
| *  Rewrite core tensor arithmetic support to be more systematic
 | |
|  =  Use #define repetitive sequences to minimise code, decrease line count by thousands possible,
 | |
|     with more robust and maintainable implementation.
 | |
| 
 | |
| *  Ensure we ET as much as possible; move unop functions into ET framework.
 | |
|    - tests with expression args to all functions
 | |
| 
 | |
| * FIXME audit
 | |
| 
 | |
| * const audit
 | |
| 
 | |
| Insert/Extract
 | |
| * Replace vset with a call to merge.; 
 | |
| * care in Gmerge,Gextract over vset .
 | |
| * extract / merge extra implementation removal      
 | |
| * Optimise the extract/merge SIMD routines; Azusa??
 | |
|  - I have collated into single location at least.
 | |
|  - Need to use _mm_*insert/extract routines.
 | |
| 
 | |
| * Thread scaling tests Xeon, XeonPhi
 | |
| 
 | |
| Not sure of status of this -- reverify. Things are working nicely now though.
 | |
| 
 | |
| * Make the Tensor types and Complex etc... play more nicely.
 | |
| 
 | |
|   - TensorRemove is a hack, come up with a long term rationalised approach to Complex vs. Scalar<Scalar<Scalar<Complex > > >
 | |
|     QDP forces use of "toDouble" to get back to non tensor scalar. This role is presently taken TensorRemove, but I
 | |
|     want to introduce a syntax that does not require this.
 | |
| 
 | |
|   - Reductions that contract indices on a site should always demote the tensor structure.
 | |
|     norm2(), innerProduct.
 | |
| 
 | |
|   - Result of Sum(), SliceSum // spatial sums
 | |
|               trace, traceIndex etc.. do not.
 | |
| 
 | |
|   - problem arises because "trace" returns Lattice<TComplex> moving everything down to Scalar,
 | |
|     and then Sum and SliceSum to not remove the Scalars. This would be fixed if we 
 | |
|     template specialize the scalar scalar scalar sum and SliceSum,  on the basis of being
 | |
|     pure scalar.
 | |
| 
 | |
| ======================================================================
 | |
| ======================================================================
 | |
| ======================================================================
 | |
| ======================================================================
 | |
| RECENT
 | |
| ---------------
 | |
| 
 | |
|   - Support different fermion representations? -- DONE
 | |
|   - contained entirely within the integrator presently
 | |
|   - Clean up HMC                                                             -- DONE
 | |
|   - LorentzScalar<GaugeField> gets Gauge link type (cleaner).                -- DONE
 | |
|   - Simplified the integrators a bit.                                        -- DONE
 | |
|   - Multi-timescale looks broken and operating on single timescale for now.  -- DONE
 | |
|   - pass GaugeField as template param.                        -- DONE
 | |
|   - Reunitarise                                               -- DONE
 | |
|   - Force Gradient                                            -- DONE
 | |
|   - Prefer "RefreshInternal" or such like to "init" in naming -- DONE
 | |
|   - Parallel io improvements                                  -- DONE
 | |
|   - Plaquette and link trace checks into nersc reader from the Grid_nersc_io.cc test. -- DONE
 | |
| 
 | |
| 
 | |
| DONE:
 | |
| - MultiArray -- MultiRHS done
 | |
| - ConjugateGradientMultiShift -- DONE
 | |
| - MCR                         -- DONE
 | |
| - Remez -- Mike or Boost?     -- DONE
 | |
| - Proto (ET)                  -- DONE
 | |
| - uBlas                       -- DONE ; Eigen
 | |
| - Potentially Useful Boost libraries -- DONE ; Eigen
 | |
| - Aligned allocator; memory pool -- DONE
 | |
| - Multiprecision              -- DONE
 | |
| - Serialization               -- DONE
 | |
| - Regex -- Not needed
 | |
| - Tokenize -- Why?
 | |
| 
 | |
| - Random number state save restore -- DONE
 | |
| - Rectangle gauge actions. -- DONE
 | |
|   Iwasaki,
 | |
|   Symanzik,
 | |
|   ... etc...
 | |
| Done: Cayley, Partial , ContFrac force terms.
 | |
| 
 | |
| DONE
 | |
| - PseudoFermions
 | |
| => generalise to non-const EE ; likely defer (??) (NOT DONE)
 | |
| Done:
 | |
|   - TwoFlavour
 | |
|   - TwoFlavourEvenOdd        
 | |
|   - TwoFlavourRatio
 | |
|   - TwoFlavourRatioEvenOdd
 | |
| 
 | |
| Done:
 | |
|   - OneFlavourRationalEvenOdd
 | |
|   - OneFlavourRationalRatioEvenOdd
 | |
|   - OneFlavourRationalRatio
 | |
| 
 | |
| Done
 | |
| => Test DWF HMC
 | |
|    - Fix a threading bug that has been introduced and prevents HMC running hybrid OMP mode
 | |
| 
 | |
| Done:
 | |
| - RNG filling from sparser grid, lower dim grid.
 | |
| 
 | |
| 
 | |
| DONE
 | |
|   - MacroMagic -> virtual reader class.
 | |
| 
 | |
| *** Expression template engine: -- DONE
 | |
| 
 | |
| [   -- Norm2(expression) problem: introduce norm2 unary op, or Introduce conversion automatic from expression to Lattice<vobj>
 | |
| * Strong test for norm2, conj and all primitive types. -- tests/Grid_simd.cc is almost there
 | |
| * Implement where within expression template scheme.
 | |
| * Check for missing functionality                    - partially audited against QDP++ layout
 | |
| 
 | |
|    // Unary functions
 | |
|    // cos,sin, tan, acos, asin, cosh, acosh, tanh, sinh, // Scalar<vReal> only arg
 | |
|    // exp, log, sqrt, fabs
 | |
|    // transposeColor, transposeSpin,
 | |
|    // adjColor, adjSpin,
 | |
|    // copyMask.
 | |
|    // localMaxAbs
 | |
|    // Fourier transform equivalent.]
 | |
| 
 | |
| 
 | |
| * CovariantShift support                             -----Use a class to store gauge field? (parallel transport?)
 | |
| 
 | |
| -- coherent framework for implementing actions and their forces.
 | |
| Actions 
 | |
| 
 | |
| DONE
 | |
| * Fermion
 | |
|   - Wilson
 | |
|   - Clover
 | |
|   - DomainWall
 | |
|   - Mobius
 | |
|   - z-Mobius
 | |
| 
 | |
| 
 | |
| Algorithms (lots of reuse/port from BFM)
 | |
| * LinearOperator
 | |
| * LinearSolver
 | |
| * Polynomial 
 | |
| * Eigen
 | |
| * CG
 | |
| * Pcg
 | |
| * Adef2
 | |
| * DeflCG
 | |
| * fPcg
 | |
| * MCR
 | |
| * HDCG
 | |
| * HMC, 
 | |
| * Heatbath
 | |
| * Integrators, leapfrog, omelyan, force gradient etc...
 | |
| * etc..
 | |
| 
 | |
| Done
 | |
| * Pauli, SU subgroup, etc.. 
 | |
| 
 | |
| * su3 exponentiation & log etc.. [Jamie's code?]
 | |
| 
 | |
| ======================================================================================================
 | |
| FUNCTIONALITY: it pleases me to keep track of things I have done (keeps me arguably sane)
 | |
| ======================================================================================================
 | |
| 
 | |
| * Link smearing/boundary conds; Policy class based implementation ; framework more in place -- DONE
 | |
| * Command line args for geometry, simd, etc. layout. Is it necessary to have -- DONE
 | |
|   user pass these? Is this a QCD specific?
 | |
| 
 | |
| * Stencil -- DONE
 | |
| * Test infrastructure -- DONE
 | |
| * Fourspin, two spin project --- DONE
 | |
| * Dirac Gamma/Dirac structures ---- DONE
 | |
| * Conditional execution, where etc...                -----DONE, simple test
 | |
| * Integer relational support                         -----DONE
 | |
| * Coordinate information, integers etc...            -----DONE
 | |
| * Integer type padding/union to vector.              -----DONE 
 | |
| * LatticeCoordinate[mu]                              -----DONE
 | |
| * expose traceIndex, peekIndex, transposeIndex etc at the Lattice Level -- DONE
 | |
| * TraceColor, TraceSpin.                             ----- DONE (traceIndex<1>,traceIndex<2>, transposeIndex<1>,transposeIndex<2>)
 | |
|                                                      ----- Implement mapping between traceColour and traceSpin and traceIndex<1/2>.
 | |
| * How to do U[mu] ... lorentz part of type structure or not. more like chroma if not. -- DONE
 | |
| 
 | |
| * Twospin/Fourspin/Gamma/Proj/Recon                  ----- DONE
 | |
| * norm2l is a hack. figure out syntax error and make this norm2 c.f. tests/Grid_gamma.cc -- DONE
 | |
| 
 | |
| * subdirs lib, tests ??                              ----- DONE
 | |
|   - lib/math        
 | |
|   - lib/cartesian
 | |
|   - lib/cshift
 | |
|   - lib/stencil
 | |
|   - lib/communicator
 | |
|   - lib/algorithms
 | |
|   - lib/qcd
 | |
|   - lib/io/   -- GridLog, GridIn, GridErr, GridDebug, GridMessage
 | |
|   - lib/qcd/actions
 | |
|   - lib/qcd/measurements
 | |
| 
 | |
| * Subset support, slice sums etc...                  -----DONE
 | |
|   sliceSum(orthog)
 | |
|   sum
 | |
|   innerProduct
 | |
|   norm2
 | |
| 
 | |
| * Subgrid Transferral                                -----DONE
 | |
|   subBlock (coarseLattice,fineLattice)
 | |
|   projectBlockBasis  
 | |
|   promoteBlockBasis
 | |
| 
 | |
| * random number generation                           ----- DONE
 | |
| 
 | |
| * Broadcast, reduction tests. innerProduct, localInnerProduct --- DONE
 | |
| 
 | |
| * I/O support
 | |
| * NERSC Lattice loading, plaquette test             ------- DONE single node 
 | |
| 
 | |
| * Controling std::cout                              ------- DONE
 | |
| 
 | |
| * Had to hack assignment to 1.0 in the tests/Grid_gamma test -- DONE
 | |
| * Reduce implemention is poor ; need threaded reductions; OMP isn't able to do it for generic objects. -- DONE
 | |
| * Bug in RNG with complex numbers ; only filling real values; need helper function -- DONE
 | |
| * Conformable test in Cshift routines. -- none needed ; there is only one 
 | |
| * Conformable testing in expression templates -- DONE (recursive)
 | |
| 
 | |
| * Bug in SeedFixedIntegers gave same output on each site. -- DONE
 | |
|   Implement and use lattice IO to verify this.            -- cout for lattice types DONE
 | |
| 
 |