Interfacing with external software
========================================

Grid provides a number of important modules, such as solvers and
eigensolvers, that are highly optimized for complex vector/SIMD
architectures, such as the Intel Xeon Phi KNL and Skylake processors.
This growing library, with appropriate interfacing, can be accessed
from existing code. Here we describe interfacing issues and provide
examples.

MPI initialization
------------------

Grid supports threaded MPI sends and receives and, if running with
more than one thread, requires the MPI_THREAD_MULTIPLE mode of message
passing. If the user initializes MPI before starting Grid, the
appropriate initialization call is::

  int provided;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  assert(MPI_THREAD_MULTIPLE == provided);

Grid initialization
-------------------

Grid itself is initialized with a call::

  Grid_init(&argc, &argv);

.. todo:: CD: Where are the command-line arguments explained above?

where `argc` and `argv` are constructed to simulate the command-line
options described above.  At a minimum one must provide the `--grid`
and `--mpi` parameters.  The latter specifies the grid of processors
(MPI ranks).

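As a concrete sketch, such a simulated command line can be assembled by
hand before the `Grid_init` call.  The lattice and processor-grid values
below are purely illustrative, and `grid_argc`/`grid_argv`/`argv_sim` are
names introduced only for this example::

  // Simulate "--grid 16.16.16.32 --mpi 1.1.2.2" (illustrative values).
  static char prog[] = "external_app";
  static char gopt[] = "--grid";  static char gval[] = "16.16.16.32";
  static char mopt[] = "--mpi";   static char mval[] = "1.1.2.2";
  char *grid_argv[]  = { prog, gopt, gval, mopt, mval, NULL };
  int   grid_argc    = 5;
  char **argv_sim    = grid_argv;

  Grid_init(&grid_argc, &argv_sim);
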
The following Grid procedures are useful for verifying that Grid is
properly initialized.

============================================================   =================
Grid procedure                                                 returns
============================================================   =================
std::vector<int> GridDefaultLatt();                            lattice size
std::vector<int> GridDefaultSimd(int Nd,vComplex::Nsimd());    SIMD layout
std::vector<int> GridDefaultMpi();                             MPI layout
int Grid::GridThread::GetThreads();                            number of threads
============================================================   =================

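A minimal sanity check after `Grid_init`, printing the values returned by
these procedures (the output formatting is illustrative only)::

  std::vector<int> latt = GridDefaultLatt();
  std::vector<int> mpi  = GridDefaultMpi();
  std::cout << "lattice:";
  for(int mu=0; mu<(int)latt.size(); mu++) std::cout << " " << latt[mu];
  std::cout << "  mpi:";
  for(int mu=0; mu<(int)mpi.size(); mu++) std::cout << " " << mpi[mu];
  std::cout << "  threads: " << Grid::GridThread::GetThreads() << std::endl;
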
MPI coordination
----------------

Grid uses its own numbering of MPI ranks and its own assignment of
lattice coordinates to each rank.  Obviously, the calling program and
Grid must agree on these conventions.  It is convenient to use Grid's
Cartesian communicator class to discover the processor assignments.
For a four-dimensional processor grid one can define::

  static Grid::CartesianCommunicator *grid_cart = NULL;
  grid_cart = new Grid::CartesianCommunicator(processors);

where `processors` is of type `std::vector<int>`, with values matching
the MPI processor-layout dimensions specified with the `--mpi`
argument in the `Grid_init` call.  Then each MPI rank can obtain its
processor coordinates from the Cartesian communicator instantiated
above.  For example, in four dimensions::

  std::vector<int> pePos(4);
  for(int i=0; i<4; i++)
     pePos[i] = grid_cart->_processor_coor[i];

and each MPI process can get its world rank from its processor
coordinates using::

  int peRank = grid_cart->RankFromProcessorCoor(pePos);

Conversely, each MPI process can get its processor coordinates from
its world rank using::

  grid_cart->ProcessorCoorFromRank(peRank, pePos);

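Putting these calls together, a small consistency check (a sketch using the
names defined above, in four dimensions)::

  std::vector<int> pePos(4), check(4);
  for(int i=0; i<4; i++)
     pePos[i] = grid_cart->_processor_coor[i];
  int peRank = grid_cart->RankFromProcessorCoor(pePos);
  grid_cart->ProcessorCoorFromRank(peRank, check);
  for(int i=0; i<4; i++)
     assert(check[i] == pePos[i]);   // round trip must reproduce pePos
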
If the calling program initialized MPI before initializing Grid, it is
then important for each MPI process in the calling program to reset
its rank number so it agrees with Grid::

   MPI_Comm comm;
   MPI_Comm_split(MPI_COMM_THISJOB,jobid,peRank,&comm);
   MPI_COMM_THISJOB = comm;

where `MPI_COMM_THISJOB` is initially a copy of `MPI_COMM_WORLD` (with
`jobid = 0`), or it is a split communicator with `jobid` equal to the
index number of the subcommunicator.  Once this is done::

  MPI_Comm_rank(MPI_COMM_THISJOB, &myrank);

returns a rank that agrees with Grid's `peRank`.

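For a single-job run, `MPI_COMM_THISJOB` can simply start out as a duplicate
of the world communicator before the split above (a sketch; `MPI_COMM_THISJOB`
is a variable owned by the calling program, not an MPI constant)::

  MPI_Comm MPI_COMM_THISJOB;
  MPI_Comm_dup(MPI_COMM_WORLD, &MPI_COMM_THISJOB);
  int jobid = 0;
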
Mapping fields between Grid and user layouts
--------------------------------------------

In order to map data between layouts, it is important to know
how the lattice sites are distributed across the processor grid.  A
lattice site with coordinates `r[mu]` is assigned to the processor with
processor coordinates `pePos[mu]` according to the rule::

  pePos[mu] = r[mu]/dim[mu]

where `dim[mu]` is the local sublattice dimension in the `mu`
direction, i.e. the full lattice dimension divided by the number of
MPI ranks in that direction.  For performance reasons, it is important
that the external data layout follow the same rule.  Then data mapping
can be done without requiring costly communication between ranks.  We
assume this is the case here.

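As a minimal sketch of this ownership rule, assuming `r` holds a global site
coordinate and `local_dim` (a name introduced here) holds the per-rank
sublattice extents::

  std::vector<int> pePos(4);
  for(int mu=0; mu<4; mu++)
      pePos[mu] = r[mu] / local_dim[mu];   // processor coordinate owning site r
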
When mapping data to and from Grid, one must choose a lattice object
defined on the appropriate grid, whether it be a full lattice (4D
`GridCartesian`), one of the checkerboards (4D
`GridRedBlackCartesian`), a five-dimensional full grid (5D
`GridCartesian`), or a five-dimensional checkerboard (5D
`GridRedBlackCartesian`).  For example, an improved staggered fermion
color-vector field `cv` on a single checkerboard would be constructed
as follows.

**Example**::

  std::vector<int> latt_size   = GridDefaultLatt();
  std::vector<int> simd_layout = GridDefaultSimd(Nd,vComplex::Nsimd());
  std::vector<int> mpi_layout  = GridDefaultMpi();

  GridCartesian               Grid(latt_size,simd_layout,mpi_layout);
  GridRedBlackCartesian       RBGrid(&Grid);

  typename ImprovedStaggeredFermion::FermionField  cv(&RBGrid);

To map data within an MPI rank, the external code must iterate over
the sites belonging to that rank (full or checkerboard as
appropriate).  To import data into Grid, the external data on a single
site with coordinates `r` is first copied into the appropriate Grid
scalar object `s`.  Then it is copied into the Grid lattice field `l`
with `pokeLocalSite`::

  pokeLocalSite(const sobj &s, Lattice<vobj> &l, Coordinate &r);

To export data from Grid, the reverse operation starts with::

  peekLocalSite(sobj &s, Lattice<vobj> &l, Coordinate &r);

and then copies the single-site data from `s` into the corresponding
external type.

Here is an example that maps a single site's worth of data in a MILC
color-vector field to a Grid scalar ColourVector object `cVec` and from
there to the lattice color-vector field `cv`, as defined above.

**Example**::

  indexToCoords(idx,r);
  ColourVector cVec;
  for(int col=0; col<Nc; col++)
      cVec._internal._internal._internal[col] =
          Complex(src[idx].c[col].real, src[idx].c[col].imag);

  pokeLocalSite(cVec, cv, r);

Here the `indexToCoords()` function is the MILC-side mapping of the site
index `idx` to the 4D lattice coordinate `r`.

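The export direction is the mirror image.  A sketch, reusing the same
`indexToCoords()` helper and writing into a hypothetical MILC color-vector
field `dest`::

  indexToCoords(idx,r);
  ColourVector cVec;
  peekLocalSite(cVec, cv, r);
  for(int col=0; col<Nc; col++){
      dest[idx].c[col].real = cVec._internal._internal._internal[col].real();
      dest[idx].c[col].imag = cVec._internal._internal._internal[col].imag();
  }
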
Grid provides block- and multiple-rhs conjugate-gradient solvers. For
this purpose it uses a 5D lattice. To map data to and from Grid data
types, the index of the right-hand-side vector becomes the zeroth
coordinate of a five-dimensional coordinate vector `r5`.  The
remaining components of `r5` contain the 4D space-time coordinates.
The `pokeLocalSite/peekLocalSite` operations then accept the
coordinate `r5`, provided the destination/source lattice object is
also 5D.  In the example below, data from a single site specified by
`idx`, belonging to a set of `Ls` MILC color-vector fields, are copied
into a Grid 5D fermion field `cv5`.

**Example**::

  GridCartesian *UGrid =
      SpaceTimeGrid::makeFourDimGrid(GridDefaultLatt(),
                                     GridDefaultSimd(Nd,vComplex::Nsimd()),
                                     GridDefaultMpi());
  GridRedBlackCartesian *FrbGrid =
      SpaceTimeGrid::makeFiveDimRedBlackGrid(Ls,UGrid);
  typename ImprovedStaggeredFermion5D::FermionField  cv5(FrbGrid);

  std::vector<int> r(4);
  indexToCoords(idx,r);
  std::vector<int> r5(1,0);
  for( int d = 0; d < 4; d++ ) r5.push_back(r[d]);

  for( int j = 0; j < Ls; j++ ){
      r5[0] = j;
      ColourVector cVec;
      for(int col=0; col<Nc; col++){
          cVec._internal._internal._internal[col] =
              Complex(src[j][idx].c[col].real, src[j][idx].c[col].imag);
      }
      pokeLocalSite(cVec, cv5, r5);
  }