mirror of https://github.com/paboyle/Grid.git synced 2025-04-09 21:50:45 +01:00

Update todo list

Peter Boyle 2019-01-01 13:48:06 +00:00
parent 4a96c067ae
commit bf5685eb11

TODO

@@ -1,18 +1,45 @@
 TODO:
 ---------------
 GPU branch code item work list
 -----------------------------
-- Audit NAMESPACE CHANGES
-- Audit HMC timestep / traj length size
-- Verify HMC one flavour ratio; suspect dH too big
-- pragma once uniformly
 - GPU offload reductions; thrust initial ; inclusive_scan vs reduce?
-- Audit changes
-- thread_loop interface revisit.
+- Accelerate the cshift
+- Accelerate non-dslash elements of Mobius; check accelerator_loop uniformly used in fermion operators
+- Gamma tables on GPU
+- Mobius kernel fusion.
+- Reread WilsonKernels and check diffs
+- thread_loop interface revisit.
+- pragma once uniformly
+- Audit changes
+- Audit NAMESPACE CHANGES
+- Verify HMC one flavour ratio; suspect dH too big; verify timestep with Guido.
+- Staggered kernels inline for GPU
+- Single GPU simd target (VGPU)
+- Lebesgue reorder in all kernels
+- merge2 where is it used. Audit routines, comment out and check compile.
+- AVX512 still broken, lebesgue order missing ?
+- Neon ??
+-----------------------------
+Physics item work list:
+2)- Consistent linear solver flop count/rate -- PARTIAL, time but no flop/s yet
+4)- Multigrid Wilson and DWF, compare to other Multigrid implementations
+5)- HDCR resume
+-----------------------------
+DONE
+- Audit HMC timestep / traj length size
+- GPU offload reductions; thrust initial ; inclusive_scan vs reduce?
+- Pragmas.h - prune and remove strong_inline (?)
+- GPU offload reductions; thrust initial ; inclusive_scan vs reduce?
+- Remove old parallel_for macros, fix errors
 - - Need (1) omp parallel for <-- thread_loop
 - - (2) omp for
 - - (3) omp for collapse(n)
@@ -37,27 +64,6 @@ GPU branch code item work list
 and same "in_region".
-- Remove old parallel_for macros, fix errors
-- check accelerator_loop uniformly used in fermion operators
-- Gamma tables on GPU
-- Accelerate the cshift
-- Accelerate non-dslash elements of Mobius
-- Mobius kernel fusion.
-- Staggered kernels inline for GPU
-- Reread WilsonKernels and check diffs
-- Single GPU simd target (VGPU)
-- Lebesgue reorder in all kernels
-- merge2 where used. Audit routines, comment out and check compile.
-- Pragmas.h - prune and remove strong_inline (?)
--
------------------------------
-Physics item work list:
-2)- Consistent linear solver flop count/rate -- PARTIAL, time but no flop/s yet
-4)- Multigrid Wilson and DWF, compare to other Multigrid implementations
-5)- HDCR resume
------------------------------
 Nov 2018
 1)- BG/Q port and check ; Andrew says ok.