TODO list update

2025-07-27 01:37:07 +01:00 · 2019-06-15 12:54:27 +01:00
parent cb336aa8f8
commit f710d7bd45
1 changed files with 22 additions and 6 deletions
--- a/28
+++ b/28
@@ -1,17 +1,27 @@
+- Lattice_arith - are the mult, mac etc.. still needed after ET engine?
+- LinalgUtils  ssp loop not offloaded
+- Mobius/Domain EOFA cache header implementation has thread_loop
+- ImprovedStaggered accelerate
+- Lattice_reduction - remnant thread_loops must offload. Audit thread_loop in main code for non-accelerated code  
+  Lattice_rng
+  Lattice_transfer.h

+- Stencil.h : Thread loops in exchange code. Need to offload these
+
+- Lebesgue order reintroduction. StencilView should have pointer
+
+- accelerate A2Autils

 GPU branch code item work list
 -----------------------------

+7) Accelerate the cshift
+
 * 0) Single GPU
 - 128 bit integer table load in GPU code.
 - coalescedRead <- threadIdx.x
 - Gianluca's changes to Cayley into gpu-port
 - GPU accelerate EOFA
- Clean up PRAGMAS, and SIMT_loop
-  thread_loop interface revisit.
-  for_n
-  for
 - Staggered kernels -> GPU coalesced loop
 - Staggered kernels inline for GPU -- DONE

@@ -23,9 +33,12 @@ GPU branch code item work list

 * 3) Comms/NVlink
 - OpenMP tasks to run comms threads. 
+- Remove explicit OpenMP in staggered. 
 - Single parallel region around both the Kernel call
  and the comms.
 - Fix the halo exchange SIMT loop
+- Stencil gather
+- SIMD dirs in stencil

 * 4) ET enhancements
 - eval -> scalar ops in ET engine
@@ -35,9 +48,7 @@ GPU branch code item work list

 - Conserved current clean up.
 - multLinkProp eliminate
- SIMD dirs in stencil
 
-7) Accelerate the cshift

 8) Merge develop and test HMC

@@ -50,6 +61,11 @@ GPU branch code item work list


 =============================================================================================
+- Clean up PRAGMAS, and SIMT_loop                                      -- DONE
+  thread_loop interface revisit.
+  _foreach
+  _for
+
 -- Figure what to do about "multLinkGpu" etc.. in FermionOperatorImpl. -- DONE
 -- Gparity is the awkward one                                          -- DONE
 -- Solve non-Gparity first.                                            -- DONE