mirror of https://github.com/paboyle/Grid.git synced 2024-11-09 23:45:36 +00:00
Grid/benchmarks
Christopher Kelly 1db58a8acc Precision change improvements
Added a new, much faster implementation of precision change that optionally uses a precomputed, device-resident workspace of pointer offsets, so that all lattice copying occurs on the device and no host<->device transfer is required other than that of the pointer table. It also avoids the need to unpack and repack the fields using explicit lane copying. When the new precisionChange is called without a workspace, one is computed on the fly; this is still considerably faster than the original implementation.
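
The precomputed-workspace idea can be illustrated with a small standalone sketch. This is not Grid's implementation or API: `PrecisionChangeTable`, `layout_offset`, `build_table` and `apply_precision_change` are hypothetical names, the lane-interleaved layout is a toy assumption, and everything runs on the host. It only shows why building the offset table once and reusing it reduces each later conversion to a single gather-and-cast pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy layout (an assumption, not Grid's actual layout): nsites scalar sites are
// stored in nsites/lanes vector words, each word holding `lanes` sites.
// Returns the flat offset of `site` in that storage.
std::size_t layout_offset(std::size_t site, std::size_t nsites, std::size_t lanes) {
    assert(lanes > 0 && nsites % lanes == 0);
    const std::size_t outer = nsites / lanes;   // number of vector words
    return (site % outer) * lanes + site / outer;
}

// Hypothetical stand-in for the workspace: for every output element, the
// offset of the input element it must be filled from.
struct PrecisionChangeTable {
    std::vector<std::size_t> src_offset;
};

// Built once per (input layout, output layout) pair and then reused.
PrecisionChangeTable build_table(std::size_t nsites,
                                 std::size_t lanes_in,
                                 std::size_t lanes_out) {
    PrecisionChangeTable table;
    table.src_offset.resize(nsites);
    for (std::size_t s = 0; s < nsites; ++s) {
        table.src_offset[layout_offset(s, nsites, lanes_out)] =
            layout_offset(s, nsites, lanes_in);
    }
    return table;
}

// The conversion itself: one gather plus a cast per element, with no
// unpacking or repacking of lanes.
void apply_precision_change(const PrecisionChangeTable& table,
                            const std::vector<double>& in,
                            std::vector<float>& out) {
    assert(in.size() == table.src_offset.size());
    out.resize(table.src_offset.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = static_cast<float>(in[table.src_offset[i]]);
    }
}
```

In this sketch a solver that changes precision every iteration would call `build_table` once up front and `apply_precision_change` inside the loop, which is the pattern the commit describes for reliable update and mixed-precision multishift.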

In the special case of using double2 and when the Grids are the same, calls to the new precisionChange will automatically use precisionChangeFast, such that there is a single API call for all precision changes.

Reliable update and mixed-prec multishift have been modified to precompute precision change workspaces

Renamed the original precisionChange as precisionChangeOrig

Fixed incorrect pointer offset bug in copyLane

Added a test and a benchmark for precisionChange

Added a test for reliable update CG
2023-02-21 10:52:42 -05:00
Benchmark_comms_host_device.cc Warning free compile on Tursa. 2021-10-21 19:56:52 +01:00
Benchmark_comms.cc Benchmark_comms fix 2022-11-15 17:00:49 -05:00
Benchmark_dwf_fp32_partial.cc Partial dirichlet changes 2022-11-30 15:51:13 -05:00
Benchmark_dwf_fp32.cc Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-10-04 17:41:48 -04:00
Benchmark_dwf_sweep.cc Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-10-04 17:41:48 -04:00
Benchmark_dwf.cc Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-10-04 17:41:48 -04:00
Benchmark_gparity.cc Tracing replaces self timing hooks 2022-08-31 17:33:41 -04:00
Benchmark_IO_vs_dir.cc Build without LIME 2020-11-17 04:41:15 -08:00
Benchmark_IO.cc Warning free compile on Tursa. 2021-10-21 19:56:52 +01:00
Benchmark_IO.hpp Build without LIME 2020-11-17 04:41:15 -08:00
Benchmark_ITT.cc Tracing replaces self timing hooks 2022-08-31 17:33:41 -04:00
Benchmark_memory_asynch.cc Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. 2020-06-05 18:52:35 -04:00
Benchmark_memory_bandwidth.cc Warning free compile on Tursa. 2021-10-21 19:56:52 +01:00
Benchmark_meson_field.cc Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. 2020-06-05 18:52:35 -04:00
Benchmark_mooee.cc Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-10-04 17:41:48 -04:00
Benchmark_prec_change.cc Precision change improvements 2023-02-21 10:52:42 -05:00
Benchmark_schur.cc Allocator cache split into large/small pools 2020-05-10 05:24:26 -04:00
Benchmark_staggered.cc Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-10-04 17:41:48 -04:00
Benchmark_staggeredF.cc Single precision hardwire 2020-06-05 19:13:27 -04:00
Benchmark_su3_gpu.cc HIP runs sensible 2020-09-16 03:35:03 +01:00
Benchmark_su3.cc HIP runs sensible 2020-09-16 03:35:03 +01:00
Benchmark_wilson_sweep.cc Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-10-04 17:41:48 -04:00
Benchmark_wilson.cc Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-10-04 17:41:48 -04:00
Makefile.am Better check and benchmark driving 2017-05-05 19:54:38 +01:00
simple_simd_test.cc Makefile rule for simple_* objects 2016-11-19 01:33:13 +01:00
simple_su3_expr.cc Global edit for QCD namespace removal & NAMESPACE macros 2018-01-15 09:37:58 +00:00
simple_su3_test.cc Cosmetic 2018-03-24 19:27:14 -04:00