Interfacing with external software
========================================

Grid provides a number of important modules, such as solvers and
eigensolvers, that are highly optimized for complex vector/SIMD
architectures, such as the Intel Xeon Phi KNL and Skylake processors.
This growing library, with appropriate interfacing, can be accessed
from existing code.  Here we describe interfacing issues and provide
examples.

MPI initialization
--------------------

Grid supports threaded MPI sends and receives and, if running with
more than one thread, requires the MPI_THREAD_MULTIPLE mode of message
passing.  If the user initializes MPI before starting Grid, the
appropriate initialization call is::

   int provided;
   MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
   assert(MPI_THREAD_MULTIPLE == provided);

Grid Initialization
---------------------

Grid itself is initialized with a call::

   Grid_init(&argc, &argv);

.. todo:: CD: Where are the command-line arguments explained above?

where `argc` and `argv` are constructed to simulate the command-line
options described above.  At a minimum one must provide the `--grid`
and `--mpi` parameters.  The former specifies the global lattice
dimensions and the latter specifies the grid of processors (MPI
ranks).
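
If the calling program does not simply forward its own command line,
it can construct a synthetic one.  The following sketch is only
illustrative: the program name and the `--grid`/`--mpi` values are
placeholders and must be chosen to match the intended lattice size and
processor decomposition::

   // Sketch: build a synthetic command line for Grid_init.
   // The lattice size (--grid) and processor grid (--mpi) below are
   // placeholders; they must match the host code's decomposition.
   std::vector<std::string> args =
     { "my_app", "--grid", "16.16.16.32", "--mpi", "1.1.2.2" };
   std::vector<char *> cstrs;
   for( auto & a : args ) cstrs.push_back( &a[0] );
   int    argc_grid = cstrs.size();
   char **argv_grid = cstrs.data();

   Grid::Grid_init(&argc_grid, &argv_grid);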

The following Grid procedures are useful for verifying that Grid is
properly initialized.

============================================================  ==================
Grid procedure                                                  returns
============================================================  ==================
std::vector<int> GridDefaultLatt();                             lattice size
std::vector<int> GridDefaultSimd(int Nd,vComplex::Nsimd());     SIMD layout
std::vector<int> GridDefaultMpi();                              MPI layout
int Grid::GridThread::GetThreads();                             number of threads
============================================================  ==================
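
For example, a calling program might echo these values on each rank
immediately after initialization as a sanity check (a sketch, assuming
the return types listed above)::

   // Sketch: echo Grid's view of the decomposition after Grid_init.
   std::vector<int> latt = GridDefaultLatt();
   std::vector<int> mpi  = GridDefaultMpi();
   std::cout << "lattice:";
   for( int d : latt ) std::cout << " " << d;
   std::cout << "  mpi:";
   for( int d : mpi )  std::cout << " " << d;
   std::cout << "  threads: " << Grid::GridThread::GetThreads() << std::endl;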

MPI coordination
----------------

Grid wants to use its own numbering of MPI ranks and its own
assignment of the lattice coordinates to each rank.  Obviously, the
calling program and Grid must agree on these conventions.  It is
convenient to use Grid's Cartesian communicator class to discover the
processor assignments.  For a four-dimensional processor grid one can
define::

   static Grid::CartesianCommunicator *grid_cart = NULL;
   grid_cart = new Grid::CartesianCommunicator(processors);

where `processors` is of type `std::vector<int>`, with values matching
the MPI processor-layout dimensions specified with the `--mpi`
argument in the `Grid_init` call.  Then each MPI rank can obtain its
processor coordinate using the Cartesian communicator instantiated
above.  For example, in four dimensions::

   std::vector<int> pePos(4);
   for(int i=0; i<4; i++)
     pePos[i] = grid_cart->_processor_coor[i];

and each MPI process can get its world rank from its processor
coordinates using::

   int peRank = grid_cart->RankFromProcessorCoor(pePos);

Conversely, each MPI process can get its processor coordinates from
its world rank using::

   grid_cart->ProcessorCoorFromRank(peRank, pePos);

If the calling program initialized MPI before initializing Grid, it is
then important for each MPI process in the calling program to reset
its rank number so it agrees with Grid::

   MPI_Comm comm;
   MPI_Comm_split(MPI_COMM_THISJOB,jobid,peRank,&comm);
   MPI_COMM_THISJOB = comm;

where `MPI_COMM_THISJOB` is initially a copy of `MPI_COMM_WORLD` (with
`jobid = 0`), or it is a split communicator with `jobid` equal to the
index number of the subcommunicator.  Once this is done, the call::

   MPI_Comm_rank(MPI_COMM_THISJOB, &myrank);

returns a rank that agrees with Grid's `peRank`.
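
Putting the above steps together, a sketch of the full
rank-coordination sequence might look as follows; here `jobid` and
`MPI_COMM_THISJOB` are as described above, and `GridDefaultMpi()` is
assumed to return the same processor layout that was passed to
`Grid_init`::

   // Sketch: align the calling program's rank numbering with Grid's.
   // `processors` must match the --mpi layout passed to Grid_init.
   std::vector<int> processors = GridDefaultMpi();
   Grid::CartesianCommunicator *grid_cart =
     new Grid::CartesianCommunicator(processors);

   // Grid's processor coordinate for this rank
   std::vector<int> pePos(4);
   for(int i=0; i<4; i++) pePos[i] = grid_cart->_processor_coor[i];

   // Grid's rank for that coordinate, used as the sort key below
   int peRank = grid_cart->RankFromProcessorCoor(pePos);

   MPI_Comm comm;
   MPI_Comm_split(MPI_COMM_THISJOB, jobid, peRank, &comm);
   MPI_COMM_THISJOB = comm;

   int myrank;
   MPI_Comm_rank(MPI_COMM_THISJOB, &myrank);   // now agrees with peRank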

Mapping fields between Grid and user layouts
---------------------------------------------

In order to map data between layouts, it is important to know how the
lattice sites are distributed across the processor grid.  A lattice
site with global coordinates `r[mu]` is assigned to the processor with
processor coordinates `pePos[mu]` according to the rule::

   pePos[mu] = r[mu]/dim[mu]

where `dim[mu]` is the local lattice dimension in the `mu` direction,
that is, the full lattice dimension divided by the number of MPI ranks
in that direction.  For performance reasons, it is important that the
external data layout follow the same rule.  Then data mapping can be
done without requiring costly communication between ranks.  We assume
this is the case here.
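
For illustration, the following sketch applies this rule to locate a
site; the names `latt`, `mpi`, `dim`, and `rLocal` are hypothetical,
and `r` holds the global site coordinate::

   // Sketch: locate the rank and local coordinate of a global site r.
   // latt[mu] : global lattice size, mpi[mu] : processor grid size.
   std::vector<int> latt = GridDefaultLatt();
   std::vector<int> mpi  = GridDefaultMpi();

   std::vector<int> pePos(4), rLocal(4);
   for(int mu=0; mu<4; mu++){
     int dim    = latt[mu]/mpi[mu];     // local (per-rank) dimension
     pePos[mu]  = r[mu]/dim;            // owning processor coordinate
     rLocal[mu] = r[mu]%dim;            // coordinate within that rank
   }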

When mapping data to and from Grid, one must choose a lattice object
defined on the appropriate grid, whether it be a full lattice (4D
`GridCartesian`), one of the checkerboards (4D
`GridRedBlackCartesian`), a five-dimensional full grid (5D
`GridCartesian`), or a five-dimensional checkerboard (5D
`GridRedBlackCartesian`).  For example, an improved staggered fermion
color-vector field `cv` on a single checkerboard would be constructed
as follows.

**Example**::

   std::vector<int> latt_size   = GridDefaultLatt();
   std::vector<int> simd_layout = GridDefaultSimd(Nd,vComplex::Nsimd());
   std::vector<int> mpi_layout  = GridDefaultMpi();

   GridCartesian Grid(latt_size,simd_layout,mpi_layout);
   GridRedBlackCartesian RBGrid(&Grid);

   typename ImprovedStaggeredFermion::FermionField cv(&RBGrid);

To map data within an MPI rank, the external code must iterate over
the sites belonging to that rank (full or checkerboard as
appropriate).  To import data into Grid, the external data on a single
site with coordinates `r` is first copied into the appropriate Grid
scalar object `s`.  Then it is copied into the Grid lattice field `l`
with `pokeLocalSite`::

   pokeLocalSite(const sobj &s, Lattice<vobj> &l, Coordinate &r);

To export data from Grid, the reverse operation starts with::

   peekLocalSite(sobj &s, const Lattice<vobj> &l, Coordinate &r);

and then copies the single-site data from `s` into the corresponding
external type.

Here is an example that maps a single site's worth of data in a MILC
color-vector field to a Grid scalar ColourVector object `cVec` and from
there to the lattice colour-vector field `cv`, as defined above.

**Example**::

   indexToCoords(idx,r);
   ColourVector cVec;
   for(int col=0; col<Nc; col++)
     cVec._internal._internal._internal[col] =
       Complex(src[idx].c[col].real, src[idx].c[col].imag);

   pokeLocalSite(cVec, cv, r);

Here the `indexToCoords()` function is a MILC mapping of the MILC site
index `idx` to the 4D lattice coordinate `r`.
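
The export direction is the mirror image of the import above.  In the
following sketch `dst` is a hypothetical MILC color-vector array laid
out like `src`::

   // Sketch: export one site from the Grid field cv back to MILC storage.
   indexToCoords(idx,r);
   ColourVector cVec;
   peekLocalSite(cVec, cv, r);
   for(int col=0; col<Nc; col++){
     dst[idx].c[col].real = cVec._internal._internal._internal[col].real();
     dst[idx].c[col].imag = cVec._internal._internal._internal[col].imag();
   }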

Grid provides block- and multiple-rhs conjugate-gradient solvers.  For
this purpose it uses a 5D lattice.  To map data to and from Grid data
types, the index for the right-hand-side vector becomes the zeroth
coordinate of a five-dimensional vector `r5`.  The remaining
components of `r5` contain the 4D space-time coordinates.  The
`pokeLocalSite/peekLocalSite` operations then accept the coordinate
`r5`, provided the destination/source lattice object is also 5D.  In
the example below, data from a single site specified by `idx`,
belonging to a set of `Ls` MILC color-vector fields, are copied into a
Grid 5D fermion field `cv5`.

**Example**::

   GridCartesian * UGrid =
     SpaceTimeGrid::makeFourDimGrid(GridDefaultLatt(),
                                    GridDefaultSimd(Nd,vComplex::Nsimd()),
                                    GridDefaultMpi());
   GridRedBlackCartesian * FrbGrid =
     SpaceTimeGrid::makeFiveDimRedBlackGrid(Ls,UGrid);
   typename ImprovedStaggeredFermion5D::FermionField cv5(FrbGrid);

   std::vector<int> r(4);
   indexToCoords(idx,r);
   std::vector<int> r5(1,0);
   for( int d = 0; d < 4; d++ ) r5.push_back(r[d]);

   for( int j = 0; j < Ls; j++ ){
     r5[0] = j;
     ColourVector cVec;
     for(int col=0; col<Nc; col++){
       cVec._internal._internal._internal[col] =
         Complex(src[j][idx].c[col].real, src[j][idx].c[col].imag);
     }
     pokeLocalSite(cVec, cv5, r5);
   }