mirror of https://github.com/paboyle/Grid.git synced 2024-11-09 23:45:36 +00:00

Update TODO after Gianluca merge

This commit is contained in:
Peter Boyle 2019-05-18 22:58:25 +01:00
parent ee6f96d85c
commit a9342c6ae5

TODO

@ -1,32 +1,60 @@
GPU branch code item work list
-----------------------------
TODO:
---------------
- Investigate why slower than September
- Common source GPU and CPU generic kernels???
- coalescedRead, coalescedWrite in expressions.
- Uniform coding between GPU kernels and CPU kernels attempt
- SIMD dirs in stencil
- Merge develop and test HMC
- GPU accelerate EOFA
- Make GPU offload reductions optionally deterministic
- Accelerate the cshift
- Accelerate non-dslash elements of Mobius; check accelerator_loop uniformly used in fermion operators
- Gamma tables on GPU
- Gamma tables on GPU; check this.
- Mobius kernel fusion.
- Reread WilsonKernels and check diffs
- thread_loop interface revisit.
- pragma once uniformly
- Audit changes
- Audit NAMESPACE CHANGES
- Verify HMC one flavour ratio; suspect dH too big; verify timestep with Guido.
- Staggered kernels inline for GPU
- Single GPU simd target (VGPU)
-----
Gianluca's changes
- Performance impact of construct in aligned allocator???
- Inner product compare to Summit inner product optimisation
- CayleyFermion5D.cc - flop count line 166 odd. Shouldn't depend on arch
- - Review Vector use
- CayleyFermion5D.h - DperpGPU unify coding style
---------
- Lebesgue reorder in all kernels
- merge2 where is it used. Audit routines, comment out and check compile.
- AVX512 still broken, lebesgue order missing ?
- Neon ??
DONE:
-----------------------------
- Committed my modifications
- Accelerate non-dslash elements of Mobius; check accelerator_loop uniformly used in fermion operators
- Merged Gianluca modifications
- Verify HMC one flavour ratio
- GPU offload reductions: using thrust::reduce?
- Deprecate JSON.
- pugixml difficult.