Grid/TODO


* FIXME audit

* Replace vset with a call to merge.

* Take care in Gmerge, Gextract over vset.

* Const audit

* extract / merge extra implementation removal      


* Conditional execution, where etc...                -----DONE, simple test

* Integer relational support                         -----DONE

* Coordinate information, integers etc...            -----DONE

* Integer type padding/union to vector.              -----DONE 

* LatticeCoordinate[mu]                              -----DONE

* Stencil operator support                           -----Initial thoughts, trial implementation DONE.
                                                     -----some simple tests that Stencil matches Cshift.

* Subset support, slice sums etc...                  -----Only need slice sum?
                                                     -----Generic cartesian subslicing?
                                                     -----Array ranges / boost extents?
                                                     -----Multigrid grid transferral?
                                                     -----Suggests generalised cartesian subblocking
                                                          sums, returning modified grid?
                                                     -----What should the interface be?

i)  Two classes of subset;   red black parity subsetting (pick checkerboard).
                             cartesian sub-block subsetting
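
A minimal sketch of the two subset classes in (i), assuming sites are addressed by plain integer coordinate vectors. The helper names siteParity and subBlockCoor are hypothetical and not part of Grid; which parity is called "red" is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: checkerboard parity of a site, taken as the sum of its
// (non-negative) coordinates modulo 2. 0 is treated as "red", 1 as "black".
inline int siteParity(const std::vector<int> &coor) {
  int sum = 0;
  for (int c : coor) sum += c;
  return sum & 1;
}

// Hypothetical helper: coordinate of the cartesian sub-block that contains a
// site, for a given block extent in each dimension.
inline std::vector<int> subBlockCoor(const std::vector<int> &coor,
                                     const std::vector<int> &block) {
  assert(coor.size() == block.size());
  std::vector<int> out(coor.size());
  for (std::size_t d = 0; d < coor.size(); ++d) {
    assert(block[d] > 0);
    out[d] = coor[d] / block[d];  // integer division picks the block
  }
  return out;
}
```

Parity subsetting selects sites by siteParity; sub-block subsetting groups sites that share a subBlockCoor.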


ii) Need to be able to project one Grid to another Grid.

Interface: (?)

Lattice<vobj> coarse_data SubBlockSum (GridBase *CoarseGrid, Lattice<vobj> &fine_data)

Operation must ensure either:
 rd[dim] of CoarseGrid divides rd[dim] of fine_data

This will give a distributed array over MPI ranks in a given dim IF coarse gd != 1 and _processors[d]>1.
Dimension can be *replicated* on all ranks in dimension. Need a "replicated" option on GridCartesian etc.

This will give "slice" summation and fourier projection assistance.

    Generic concept is to subdivide (based on RD so applies to red/black or full).
    Return a type on SUB-grid from CellSum TOP-grid
    SUB-grid need not distribute but be replicated in some dims if that is how the
    cartesian communicator works.
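
A single-node sketch of the sub-block sum idea, under stated assumptions: fields are flat std::vector<double> in lexicographic order with dimension 0 fastest, and there is no MPI distribution or replication. The names lexIndex and subBlockSum are hypothetical; this is not the proposed Grid interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lexicographic site index, dimension 0 running fastest (an assumption).
inline std::size_t lexIndex(const std::vector<int> &coor,
                            const std::vector<int> &dims) {
  std::size_t idx = 0;
  for (std::size_t d = dims.size(); d-- > 0;)
    idx = idx * static_cast<std::size_t>(dims[d]) + static_cast<std::size_t>(coor[d]);
  return idx;
}

// Sum the fine field over cartesian sub-blocks, one result per coarse site.
// Each coarse dimension must divide the matching fine dimension.
inline std::vector<double> subBlockSum(const std::vector<double> &fine,
                                       const std::vector<int> &fineDims,
                                       const std::vector<int> &coarseDims) {
  const std::size_t nd = fineDims.size();
  assert(coarseDims.size() == nd);
  std::size_t fineVol = 1, coarseVol = 1;
  std::vector<int> block(nd);
  for (std::size_t d = 0; d < nd; ++d) {
    assert(coarseDims[d] > 0 && fineDims[d] % coarseDims[d] == 0);
    block[d] = fineDims[d] / coarseDims[d];
    fineVol *= static_cast<std::size_t>(fineDims[d]);
    coarseVol *= static_cast<std::size_t>(coarseDims[d]);
  }
  assert(fine.size() == fineVol);

  std::vector<double> coarse(coarseVol, 0.0);
  std::vector<int> ccoor(nd);
  for (std::size_t site = 0; site < fineVol; ++site) {
    // Decode the fine site into coordinates and map each to its block.
    std::size_t rem = site;
    for (std::size_t d = 0; d < nd; ++d) {
      const int f = static_cast<int>(rem % static_cast<std::size_t>(fineDims[d]));
      rem /= static_cast<std::size_t>(fineDims[d]);
      ccoor[d] = f / block[d];
    }
    coarse[lexIndex(ccoor, coarseDims)] += fine[site];
  }
  return coarse;
}
```

Setting a coarse dimension to 1 sums that whole dimension away, which is the "slice" summation case; e.g. fineDims {4,2} with coarseDims {4,1} leaves one sum per value of the first coordinate.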

Instead of subsetting 

iii) No general permutation map.

* Consider switching std::vector to boost arrays.
  boost::multi_array<type, 3> A()...    to replace multi1d, multi2d etc..

*? Cell definition <-> sliceSum.
 ? Replicated arrays.

* Check for missing functionality                    - partially audited against QDP++ layout

* Optimise the extract/merge SIMD routines; Azusa??

 - I have collated into single location at least.
 - Need to use _mm_*insert/extract routines.
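
For reference, a portable sketch of what extract and merge are assumed to do here: convert between an object whose every word is a SIMD vector of N lanes and N scalar objects, one per lane. The types VecObj and ScalObj and the names extractLanes and mergeLanes are hypothetical stand-ins, not Grid's routines; an optimised version would replace the inner loops with insert/extract intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: an object of M words, each word holding N lanes,
// and the matching scalar object of M words.
template <typename T, std::size_t N, std::size_t M>
using VecObj = std::array<std::array<T, N>, M>;
template <typename T, std::size_t M>
using ScalObj = std::array<T, M>;

// Pull lane l of every word into scalar object l.
template <typename T, std::size_t N, std::size_t M>
std::vector<ScalObj<T, M>> extractLanes(const VecObj<T, N, M> &v) {
  std::vector<ScalObj<T, M>> out(N);
  for (std::size_t w = 0; w < M; ++w)
    for (std::size_t l = 0; l < N; ++l) out[l][w] = v[w][l];
  return out;
}

// Inverse of extractLanes: scatter N scalar objects back into the lanes.
template <typename T, std::size_t N, std::size_t M>
VecObj<T, N, M> mergeLanes(const std::vector<ScalObj<T, M>> &s) {
  assert(s.size() == N);
  VecObj<T, N, M> v{};
  for (std::size_t w = 0; w < M; ++w)
    for (std::size_t l = 0; l < N; ++l) v[w][l] = s[l][w];
  return v;
}
```

mergeLanes needs its template arguments spelled out, e.g. mergeLanes<float, 4, 2>(s), since N cannot be deduced from the std::vector argument.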

* Conformable test in Cshift routines.

* Gamma/Dirac structures

* Fourspin, two spin project

* Broadcast, reduction tests. innerProduct, localInnerProduct

* QDP++ regression suite and comparative benchmark

* NERSC Lattice loading, plaquette test

* I/O support
  - MPI IO?
  - BinaryWriter, TextWriter etc...
  - protocol buffers?

// Cartesian grid inheritance
//            Grid::GridBase
//                     |
//           __________|___________
//          |                      |
// Grid::GridCartesian   Grid::GridCartesianRedBlack
//
// TODO: document the following as an API guaranteed public interface

    /* 
     *       Rough map of functionality against QDP++ Layout
     *
     *       Param     |     Grid                     |     QDP++             
     *       -----------------------------------------
     *                 |                              |
     *        void     |     oSites, iSites, lSites   |  sitesOnNode 
     *        void     |     gSites                   |  vol
     *                 |                              |
     *        gcoor    |     oIndex, iIndex           |  linearSiteIndex // no virtual node in QDP
     *        lcoor    |                              |
     * 
     *        void     |     CheckerBoarded           |  -        // No checkerboarded in QDP
     *        void     |     FullDimensions           |  lattSize
     *        void     |     GlobalDimensions         |  lattSize // No checkerboarded in QDP
     *        void     |     LocalDimensions          |  subgridLattSize
     *        void     |     VirtualLocalDimensions   |  subgridLattSize // no virtual node in QDP
     *                 |                              |
     *       int x 3   |     oiSiteRankToGlobal       |  siteCoords
     *                 |     ProcessorCoorLocalCoorToGlobalCoor | 
     *                 |                              |
     *     vector<int> |     GlobalCoorToRankIndex    |  nodeNumber(coord)
     *     vector<int> |     GlobalCoorToProcessorCoorLocalCoor|  nodeCoord(coord)
     *                 |                              |
     *     void        |     Processors               |  logicalSize    // returns cart array shape
     *     void        |     ThisRank                 |  nodeNumber();  // returns this node rank
     *     void        |     ThisProcessorCoor        |    // returns this node coor
     *     void        |     isBoss(void)             |  primaryNode();
     *                 |                              |
     *                 |     RankFromProcessorCoor    |  getNodeNumberFrom(logical_coord)
     *                 |     ProcessorCoorFromRank    |  getLogicalCoorFrom(node)
     */
  // Work out whether to permute 
  // ABCDEFGH ->   AE BF CG DH       permute              wrap num
  //
  // Shift 0       AE BF CG DH       0 0 0 0    ABCDEFGH   0   0
  // Shift 1       BF CG DH AE       0 0 0 1    BCDEFGHA   0   1
  // Shift 2       CG DH AE BF       0 0 1 1    CDEFGHAB   0   2
  // Shift 3       DH AE BF CG       0 1 1 1    DEFGHABC   0   3
  // Shift 4       AE BF CG DH       1 1 1 1    EFGHABCD   1   0 
  // Shift 5       BF CG DH AE       1 1 1 0    FGHABCDE   1   1 
  // Shift 6       CG DH AE BF       1 1 0 0    GHABCDEF   1   2
  // Shift 7       DH AE BF CG       1 0 0 0    HABCDEFG   1   3

  // Suppose 4way simd in one dim.
  // ABCDEFGH ->   AECG BFDH      permute              wrap num

  // Shift 0       AECG BFDH      0,00 0,00 ABCDEFGH         0     0
  // Shift 1       BFDH CGEA      0,00 1,01 BCDEFGHA         0     1
  // Shift 2       CGEA DHFB      1,01 1,01 CDEFGHAB         1     0
  // Shift 3       DHFB EAGC      1,01 1,11 DEFGHABC         1     1
  // Shift 4       EAGC FBHD      1,11 1,11 EFGHABCD         2     0 
  // Shift 5       FBHD GCAE      1,11 1,10 FGHABCDE         2     1
  // Shift 6       GCAE HDBF      1,10 1,10 GHABCDEF         3     0
  // Shift 7       HDBF AECG      1,10 0,00 HABCDEFG         3     1
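
A sketch of how the wrap and num columns of the two tables above can be computed, assuming rd here means the number of outer (vector) sites in the dimension, nsimd the number of lanes, and 0 <= shift < rd*nsimd. The name planShift and the per-site "rotate" count are hypothetical. For the 2-way case rotate reproduces the permute column; how a rotation count maps onto the 4-way permute masks is not worked out here.

```cpp
#include <cassert>
#include <vector>

struct ShiftPlan {
  int wrap;                 // shift / rd: whole passes around the outer sites
  int num;                  // shift % rd: residual shift in outer sites
  std::vector<int> rotate;  // per outer site: lane rotation count, 0 = no permute
};

// Hypothetical helper: plan a shift along one SIMD-decomposed dimension.
inline ShiftPlan planShift(int shift, int rd, int nsimd) {
  assert(rd > 0 && nsimd > 0);
  assert(shift >= 0 && shift < rd * nsimd);
  ShiftPlan plan;
  plan.wrap = shift / rd;
  plan.num = shift % rd;
  plan.rotate.resize(rd);
  // Outer site o fetches from site o + shift; each time that index passes rd
  // the data comes from the next lane, so lanes rotate by (o + shift) / rd.
  for (int o = 0; o < rd; ++o) plan.rotate[o] = ((o + shift) / rd) % nsimd;
  return plan;
}
```

For example planShift(5, 4, 2) gives wrap 1, num 1 and rotate {1, 1, 1, 0}, matching the "Shift 5" row of the 2-way table.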

  // Generalisation to 8 way simd, 16 way simd required.
  //
  // Need log2(Nway) masks, consisting of
  //        1 bit  256 bit granule
  //        2 bits 128 bit granules
  //        4 bits 64  bit granules
  //        8 bits 32  bit granules
  //
  //        15 bits....
    // TODO
    //
    // Base class to share common code between vRealF, vComplexF etc...
    //
    // Lattice broadcast assignment
    //
    // where() support
    // implement with masks, and/or? Type of the mask & boolean support?
    //
    // Unary functions
    // cos,sin, tan, acos, asin, cosh, acosh, tanh, sinh, // Scalar<vReal> only arg
    // exp, log, sqrt, fabs
    //
    // transposeColor, transposeSpin,
    // adjColor, adjSpin,
    // traceColor, traceSpin.
    // peekColor, peekSpin + pokeColor, pokeSpin
    //
    // copyMask.
    //
    // localMaxAbs
    //
    // norm2,
    // sumMulti equivalent.
    // Fourier transform equivalent.
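
A scalar sketch of mask-based where(), assuming the mask is a plain integer array produced by a coordinate comparison. The names whereSelect and lessThan are hypothetical and do not describe Grid's implementation; a SIMD version would blend lanes under the mask instead of branching per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: integer mask, 1 where coor[i] < bound, else 0.
inline std::vector<int> lessThan(const std::vector<int> &coor, int bound) {
  std::vector<int> mask(coor.size());
  for (std::size_t i = 0; i < coor.size(); ++i) mask[i] = coor[i] < bound ? 1 : 0;
  return mask;
}

// Hypothetical helper: element-wise select, iftrue where mask is non-zero.
template <typename T>
std::vector<T> whereSelect(const std::vector<int> &mask,
                           const std::vector<T> &iftrue,
                           const std::vector<T> &iffalse) {
  assert(mask.size() == iftrue.size() && mask.size() == iffalse.size());
  std::vector<T> out(mask.size());
  for (std::size_t i = 0; i < mask.size(); ++i)
    out[i] = mask[i] ? iftrue[i] : iffalse[i];
  return out;
}
```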
    //