mirror of https://github.com/paboyle/Grid.git synced 2025-04-04 19:25:56 +01:00

Update todo list

Peter Boyle 2019-01-01 13:48:06 +00:00
parent 4a96c067ae
commit bf5685eb11

TODO

@@ -1,18 +1,45 @@
TODO:
---------------
GPU branch code item work list
-----------------------------
- Audit NAMESPACE CHANGES
- Audit HMC timestep / traj length size
- Verify HMC one flavour ratio; suspect dH too big
- pragma once uniformly
- GPU offload reductions; thrust initial ; inclusive_scan vs reduce?
- Audit changes
- thread_loop interface revisit.
- Accelerate the cshift
- Accelerate non-dslash elements of Mobius; check accelerator_loop uniformly used in fermion operators
- Gamma tables on GPU
- Mobius kernel fusion.
- Reread WilsonKernels and check diffs
- thread_loop interface revisit.
- pragma once uniformly
- Audit changes
- Audit NAMESPACE CHANGES
- Verify HMC one flavour ratio; suspect dH too big; verify timestep with Guido.
- Staggered kernels inline for GPU
- Single GPU simd target (VGPU)
- Lebesgue reorder in all kernels
- merge2 where is it used. Audit routines, comment out and check compile.
- AVX512 still broken, lebesgue order missing ?
- Neon ??
-----------------------------
Physics item work list:
2)- Consistent linear solver flop count/rate -- PARTIAL, time but no flop/s yet
4)- Multigrid Wilson and DWF, compare to other Multigrid implementations
5)- HDCR resume
-----------------------------
DONE
- Audit HMC timestep / traj length size
- GPU offload reductions; thrust initial ; inclusive_scan vs reduce?
- Pragmas.h - prune and remove strong_inline (?)
- GPU offload reductions; thrust initial ; inclusive_scan vs reduce?
- Remove old parallel_for macros, fix errors
- - Need (1) omp parallel for <-- thread_loop
- - (2) omp for
- - (3) omp for collapse(n)
@@ -37,27 +64,6 @@ GPU branch code item work list
and same "in_region".
- Remove old parallel_for macros, fix errors
- check accelerator_loop uniformly used in fermion operators
- Gamma tables on GPU
- Accelerate the cshift
- Accelerate non-dslash elements of Mobius
- Mobius kernel fusion.
- Staggered kernels inline for GPU
- Reread WilsonKernels and check diffs
- Single GPU simd target (VGPU)
- Lebesgue reorder in all kernels
- merge2 where used. Audit routines, comment out and check compile.
- Pragmas.h - prune and remove strong_inline (?)
-
-----------------------------
Physics item work list:
2)- Consistent linear solver flop count/rate -- PARTIAL, time but no flop/s yet
4)- Multigrid Wilson and DWF, compare to other Multigrid implementations
5)- HDCR resume
-----------------------------
Nov 2018
1)- BG/Q port and check ; Andrew says ok.