Grid/Grid/lattice at 747c1676587036f7556eb790672dc70be5bb3385 - Grid - DiRAC Tursa git server

portelli/Grid

mirror of https://github.com/paboyle/Grid.git synced 2026-06-18 18:03:44 +01:00

Peter Boyle 747c167658 sumD_gpu_direct: one thread per SIMD lane using extractLane

Replaces the one-thread-per-outer-site kernel, in which each thread called
Reduce() (a sequential Nsimd-wide loop), with one thread per SIMD lane calling
extractLane(), which is O(1) per thread. CUB now reduces over osites*Nsimd
elements. This avoids the serial lane reduction but leaves the per-lane sobjD
store stride as a known remaining concern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-18 16:21:50 -04:00
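
The change described in the commit message above can be illustrated with a small host-side sketch. This is not Grid's code: `SimdWord`, `extractLaneSketch`, `sumPerSite` and `sumPerLane` are hypothetical names, the lane-to-index mapping is an assumption, and `std::accumulate` stands in for the device-wide CUB reduction. It only shows how flattening to osites*Nsimd scalar elements removes the per-thread lane loop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a SIMD word: Nsimd lanes packed per outer site.
constexpr std::size_t Nsimd = 4;
using SimdWord = std::array<double, Nsimd>;

// Hypothetical analogue of extractLane(): a constant-time read of one lane.
inline double extractLaneSketch(std::size_t lane, const SimdWord &w) {
  assert(lane < Nsimd);
  return w[lane];
}

// Old strategy: one work item per outer site. Each work item runs a
// sequential Nsimd-wide loop before the global reduction sees its result.
double sumPerSite(const std::vector<SimdWord> &field) {
  std::vector<double> partial(field.size());
  for (std::size_t ss = 0; ss < field.size(); ++ss) {  // "one thread per site"
    double acc = 0.0;
    for (std::size_t lane = 0; lane < Nsimd; ++lane) acc += field[ss][lane];
    partial[ss] = acc;
  }
  // Global reduction over osites elements.
  return std::accumulate(partial.begin(), partial.end(), 0.0);
}

// New strategy: one work item per (site, lane) pair. Each work item does a
// single O(1) lane extraction; the global reduction covers osites*Nsimd values.
double sumPerLane(const std::vector<SimdWord> &field) {
  const std::size_t osites = field.size();
  std::vector<double> flat(osites * Nsimd);
  for (std::size_t idx = 0; idx < flat.size(); ++idx) {  // "one thread per lane"
    const std::size_t ss = idx / Nsimd;    // outer site (assumed mapping)
    const std::size_t lane = idx % Nsimd;  // lane within the site
    flat[idx] = extractLaneSketch(lane, field[ss]);
  }
  // Global reduction over osites*Nsimd elements.
  return std::accumulate(flat.begin(), flat.end(), 0.0);
}
```

Both functions return the same sum for the same field. In the flattened version each work item writes one scalar per lane, which mirrors the per-lane store stride that the commit message flags as a remaining concern.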

Lattice_arith.h

Fast axpy norm under CFLAG

2024-10-11 03:23:09 +00:00

Lattice_base.h

Assertion updates to macros (mostly) with backtrace.

2025-08-07 15:48:38 +00:00

Lattice_basis.h

Assertion updates to macros (mostly) with backtrace.

2025-08-07 15:48:38 +00:00

Lattice_comparison_utils.h

GPU reductions first cut; use thrust, non-reproducible. Inclusive scan can fix this if desired.

2019-01-01 13:53:37 +00:00

Lattice_comparison.h

Remove dead commented-out code

2020-08-31 23:40:29 -04:00

Lattice_conformable.h

Assertion updates to macros (mostly) with backtrace.

2025-08-07 15:48:38 +00:00

Lattice_coordinate.h

Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.

2020-06-05 18:52:35 -04:00

Lattice_crc.h

Merge branch 'develop' into feature/scidac-wp1

2024-03-06 14:55:21 -05:00

Lattice_ET.h

Assertion updates to macros (mostly) with backtrace.

2025-08-07 15:48:38 +00:00

Lattice_local.h

Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.

2020-06-05 18:52:35 -04:00

Lattice_matrix_reduction.h

Assertion updates to macros (mostly) with backtrace.

2025-08-07 15:48:38 +00:00

Lattice_peekpoke.h

Assertion updates to macros (mostly) with backtrace.

2025-08-07 15:48:38 +00:00

Lattice_real_imag.h

real and imag part not in ET

2020-08-31 23:56:26 -04:00

Lattice_reality.h

happy compile

2020-10-14 22:59:41 -04:00

Lattice_reduction_gpu_cub.h

sumD_gpu_direct: one thread per SIMD lane using extractLane

2026-05-18 16:21:50 -04:00

Lattice_reduction_gpu.h

Rewrite lattice GPU reduction to use CUB, hipCUB, and SYCL reduction

2026-05-15 13:41:56 -04:00

Lattice_reduction_sycl.h

Rewrite lattice GPU reduction to use CUB, hipCUB, and SYCL reduction

2026-05-15 13:41:56 -04:00

Lattice_reduction.h

Rewrite lattice GPU reduction to use CUB, hipCUB, and SYCL reduction

2026-05-15 13:41:56 -04:00

Lattice_rng.h

Assertion updates to macros (mostly) with backtrace.

2025-08-07 15:48:38 +00:00

Lattice_slicesum_core.h

No compile fix

2025-04-04 18:35:05 -04:00

Lattice_trace.h

Merge remote-tracking branch 'LupoA/develop' into LupoA-develop

2023-10-02 16:22:35 -04:00

Lattice_transfer.h

Missed one

2025-08-14 20:25:54 +00:00

Lattice_transpose.h

Merge branch 'develop' into sycl

2020-06-09 04:00:12 -04:00

Lattice_unary.h

Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.

2020-06-05 18:52:35 -04:00

Lattice_view.h

Updated to compile and run fast on CUDA

2025-08-10 00:00:13 +01:00

Lattice_where.h

Update thread issue

2021-03-12 14:55:07 +01:00

Lattice.h

Hack for flight logging CG inner products.

2024-03-05 23:59:57 +00:00

PaddedCell.h

Assertion updates to macros (mostly) with backtrace.

2025-08-07 15:48:38 +00:00