mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 01:35:36 +00:00

TODO list update

Peter Boyle 2019-06-15 12:54:27 +01:00
parent cb336aa8f8
commit f710d7bd45

TODO

@@ -1,17 +1,27 @@
+- Lattice_arith - are the mult, mac etc.. still needed after ET engine?
+- LinalgUtils ssp loop not offloaded
+- Mobius/Domain EOFA cache header implementation has thread_loop
+- ImprovedStaggered accelerate
+- Lattice_reduction - remnant thread_loops must offload. Audit thread_loop in main code for non-accelerated code
+Lattice_rng
+Lattice_transfer.h
+- Stencil.h : Thread loops in exchange code. Need to offload these
+- Lebesgue order reintroduction. StencilView should have pointer
+- accelerate A2Autils
 GPU branch code item work list
 -----------------------------
+7) Accelerate the cshift
 * 0) Single GPU
 - 128 bit integer table load in GPU code.
 - coalescedRead <- threadIdx.x
 - Gianluca's changes to Cayley into gpu-port
 - GPU accelerate EOFA
-- Clean up PRAGMAS, and SIMT_loop
-thread_loop interface revisit.
-for_n
-for
 - Staggered kernels -> GPU coalesced loop
 - Staggered kernels inline for GPU -- DONE
@@ -23,9 +33,12 @@ GPU branch code item work list
 * 3) Comms/NVlink
 - OpenMP tasks to run comms threads.
+- Remove explicit openMP in staggered.
 - Single parallel region around both the Kernel call
 and the comms.
 - Fix the halo exchange SIMT loop
+- Stencil gather
+- SIMD dirs in stencil
 * 4) ET enhancements
 - eval -> scalar ops in ET engine
@@ -35,9 +48,7 @@ GPU branch code item work list
 - Conserved current clean up.
 - multLinkProp eliminate
-- SIMD dirs in stencil
-7) Accelerate the cshift
 8) Merge develop and test HMC
@@ -50,6 +61,11 @@ GPU branch code item work list
 =============================================================================================
+- Clean up PRAGMAS, and SIMT_loop -- DONE
+thread_loop interface revisit.
+_foreach
+_for
 -- Figure what to do about "multLinkGpu" etc.. in FermionOperatorImpl. -- DONE
 -- Gparity is the awkward one -- DONE
 -- Solve non-Gparity first. -- DONE