mirror of
https://github.com/paboyle/Grid.git
synced 2026-06-04 19:24:36 +01:00
5822a6599c
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| name | description | metadata |
|---|---|---|
| ref_lattice_vs_vector | When to use Lattice<T> vs std::vector<T> for GPU-portable field storage in Grid | |
Rule
Use Lattice<vobj> (or std::vector<Lattice<vobj>>) for any field that will be read or written inside accelerator_for. std::vector<vobj> is host memory and is NOT device-accessible.
Before vs after GPU offload
// CPU-only (host memory, not GPU accessible)
std::vector<SpinColourVector_v> tloopv(oSites, Zero());
// accessed directly: tloopv[ss]
// GPU-portable
Lattice<SpinColourVector_v> tloop(grid);
// accessed via view: autoView(tloop_v, tloop, AcceleratorWrite);
// coalescedWrite(tloop_v[ss], val);
Corollary: function signatures
CPU-only version:
void PackLeft(const std::vector<std::vector<vobj>> &leftv);
GPU-portable version:
void PackLeft(const std::vector<Lattice<vobj>> &leftv);
deviceVector for raw device buffers
deviceVector<T> (defined in Grid) behaves like std::vector<T> but lives in device-accessible memory. Use it for raw scalar scratch or pack buffers (e.g. GEMM input/output staging), not for structured lattice data.
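To make "raw scalar pack buffer" concrete, here is a minimal host-only sketch of the layout such a staging buffer might hold. It does not use Grid: std::vector stands in for deviceVector, and the names Site, Ncomp and pack_sites are hypothetical, chosen only for illustration. The point is that the buffer is a flat array of scalars with an explicit index convention, with none of the lattice structure attached.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Scalar = std::complex<double>;

// Hypothetical component count per site (e.g. 4 spins x 3 colours).
constexpr std::size_t Ncomp = 12;

// Stand-in for one structured site object; NOT a Grid type.
using Site = std::array<Scalar, Ncomp>;

// Flatten structured sites into a row-major [nsite x Ncomp] scalar buffer.
// In Grid the destination would be a deviceVector<Scalar>; std::vector is
// used here only so the sketch compiles without Grid.
std::vector<Scalar> pack_sites(const std::vector<Site> &field) {
  std::vector<Scalar> buf(field.size() * Ncomp);
  for (std::size_t s = 0; s < field.size(); ++s) {
    for (std::size_t c = 0; c < Ncomp; ++c) {
      buf[s * Ncomp + c] = field[s][c];  // explicit flat index, no lattice metadata
    }
  }
  return buf;
}
```

A GEMM call would then treat the result as a plain nsite x Ncomp matrix; anything that still needs site or grid semantics belongs in a Lattice instead.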
Pointer arrays for batched BLAS
deviceVector<scalar *> holds the per-batch pointer arrays passed to batched BLAS. Populate each entry from the host with acceleratorPut(ptrs[t], base + offset), which writes the pointer value into the device-side array. See A2ASpatialSum::Allocate.
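The pointer-array layout can be sketched on the host alone. In the version below, std::vector stands in for deviceVector and a plain assignment stands in for acceleratorPut; make_batch_pointers, batch and stride are hypothetical names, and the uniform stride is an assumption made for the example, not something taken from A2ASpatialSum::Allocate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build one pointer per batch entry into a single contiguous buffer.
// Host-only stand-in: in Grid, ptrs would be a deviceVector<double *> and
// each assignment below would be acceleratorPut(ptrs[t], base + offset).
std::vector<double *> make_batch_pointers(double *base,
                                          std::size_t batch,
                                          std::size_t stride) {
  std::vector<double *> ptrs(batch);
  for (std::size_t t = 0; t < batch; ++t) {
    ptrs[t] = base + t * stride;  // entry t points at the t-th sub-matrix
  }
  return ptrs;
}
```

A batched BLAS routine receives the pointer array and reads one matrix per entry, so the data buffer itself can stay as a single allocation.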