Antonin Portelli (portelli)
  • Joined on 2017-10-25
portelli synced commits to refs/heads/develop at portelli/Grid from mirror 2026-05-22 01:54:17 +01:00
b58a1508fa Perlmutter cuda version update
portelli synced commits to refs/heads/feature/reduction-reorganisation at portelli/Grid from mirror 2026-05-21 17:44:16 +01:00
12e3499b6d Updated rocm 7 compile for ORNL
9576011011 Changed setup for ROCM 7, nasty LD_LIBRARY_PATH issues were committing
155b34c1aa File list lost
982ffe9ebe Lattice_reduction_gpu: demote timing logs to Debug, disable by default
portelli synced commits to refs/heads/develop at portelli/Grid from mirror 2026-05-21 17:44:16 +01:00
4d527e81fa Remove hip specific files
7803580aa6 Lattice_reduction_gpu: demote timing logs to Debug, disable by default
32654db366 Test_planned_fft: fix PlannedFFT template parameter to use ::vector_object
cd340cfab3 tests: add Test_planned_fft exercising PlannedFFT<vobj>
f32866b2ff tests/fft: remove PlanDestroy calls (FFT handles plans per-call)
(42 commits in this sync; 5 shown)
portelli synced commits to refs/heads/feature/reduction-reorganisation at portelli/Grid from mirror 2026-05-21 01:24:16 +01:00
0251ecaeab Test_planned_fft: fix PlannedFFT template parameter to use ::vector_object
372a27d645 tests: add Test_planned_fft exercising PlannedFFT<vobj>
72b4a061f3 tests/fft: remove PlanDestroy calls (FFT handles plans per-call)
29198efabe FFT: add FFTbase, PlannedFFT; factor FFT_dim_execute free function
portelli synced commits to refs/heads/feature/reduction-reorganisation at portelli/Grid from mirror 2026-05-20 09:04:30 +01:00
50aa51f93a debug: add Test_hipfft_repro — reproducer for hipFFT PARSE_ERROR on ROCm 7
79ccc81a86 tests/debug: add G=4 to hipfft fail reproducer
3f0fdbb597 tests/debug: test hipMemset variant before cache is populated
ea57bd8f03 tests/debug: extend hipfft fail reproducer with hipMemset and sync variants
bdba5b8403 FFT: use host stack buffer in PlanCreate, not deviceVector
(7 commits in this sync; 5 shown)
portelli synced commits to refs/heads/feature/reduction-reorganisation at portelli/Grid from mirror 2026-05-20 00:54:30 +01:00
ad9d03fd85 tests/debug: extend hipfft reproducer with Grid-realistic howmany and exec tests
4de160ce20 tests/debug: add minimal hipfft plan-creation reproducer
fc8c8ce6e7 FFT HIP: use hipfftCreate+hipfftMakePlanMany instead of hipfftPlanMany
ddbb7f07c8 FFT: pass nullptr for inembed/onembed in hipfftPlanMany to avoid HIPFFT_PARSE_ERROR
1e29c59bcc FFT: cache plans per vobj type across calls
(6 commits in this sync; 5 shown)
portelli synced commits to refs/heads/develop at portelli/Grid from mirror 2026-05-20 00:54:30 +01:00
a5a04929fb Merge pull request #492 from giltirn/develop
77b8657fcc Fixes to support CUDA > 13. Specifically, the CUDA header is no longer accidentally included within Grid's namespace, and the breaking change to cub::Sum() -> ::cuda::std::plus<>{} in CUDA-13 has been worked around
portelli synced commits to refs/heads/feature/reduction-reorganisation at portelli/Grid from mirror 2026-05-19 16:44:30 +01:00
2fadd8bb62 Accelerator: raise default accelerator_threads from 2 to 16
60df2dd5d0 skills: add gpu-memory-performance.md
66b529b345 sumD_gpu_reduce_words: fuse pack+reduce into single packReduceKernel
1304172a93 Modified repack
portelli synced commits to refs/heads/feature/reduction-reorganisation at portelli/Grid from mirror 2026-05-19 08:34:32 +01:00
1315d4604d Enable GRID_REDUCTION_TIMING unconditionally
a31af31328 Lattice_reduction_gpu: add GRID_REDUCTION_TIMING instrumentation
26c3c7d8f9 sumD_gpu_large: radix-12 word-bundle reduction replacing radix-1
0650d7c7eb Lattice_reduction_sycl: fix double-precision accumulation in sumD_gpu_tensor
068f95ad2d Revert to hand-rolled reduction; drop Lattice_reduction_gpu_cub.h
(7 commits in this sync; 5 shown)
portelli synced commits to refs/heads/feature/reduction-reorganisation at portelli/Grid from mirror 2026-05-19 00:24:32 +01:00
747c167658 sumD_gpu_direct: one thread per SIMD lane using extractLane
fca2c5dba0 Lattice_reduction_gpu_cub: define GRID_REDUCTION_TIMING in header
e12bc7f07c Lattice_reduction_gpu_cub: add GRID_REDUCTION_TIMING instrumentation
dc6ae51cab Lattice_reduction_gpu_cub: replace WordBundle4 with iVector<iScalar<scalarD>,4>
baa70d8ec9 Test_reduction: add timing benchmark for new vs old reduction paths
(8 commits in this sync; 5 shown)
portelli synced commits to refs/heads/feature/reduction-reorganisation at portelli/Grid from mirror 2026-05-16 07:04:32 +01:00
003fec509c Fix Zero() used on thrust::complex in WordBundle4 initialisation
portelli synced new reference refs/heads/feature/reduction-reorganisation to portelli/Grid from mirror 2026-05-15 22:54:30 +01:00
portelli synced commits to refs/heads/develop at portelli/Grid from mirror 2026-05-15 22:54:30 +01:00
f8b2eacf99 File list issue (Ed Bennets pull request?)
6140ac6864 Hip Happy
c6c2834e03 Hip Happy
856545a1db Support ROCM 7.0.2
portelli synced commits to refs/heads/feature/reduction-reorganisation at portelli/Grid from mirror 2026-05-15 22:54:30 +01:00
portelli synced commits to refs/heads/develop at portelli/Grid from mirror 2026-05-07 02:34:30 +01:00
e2d607f6c7 Merge pull request #490 from jdmaia/hip-guard-acceleratorfor2dNB
66da4e0657 Including guard on accelerator_for2dNB against invalid kernel configurations if GRID_HIP
portelli synced commits to refs/heads/develop at portelli/Grid from mirror 2026-04-28 06:26:00 +01:00
b37390bb5a 4 node usqcd run
829dc8cceb 32 node
13cc2c39f5 FOM run
portelli synced commits to refs/heads/develop at portelli/Grid from mirror 2026-04-27 22:16:01 +01:00
66ea3b271c Merge branch 'develop' of https://github.com/paboyle/Grid into develop
d293b58a20 384 node baseline run
ce093b2bf3 rdtsc
e4404efe5a Perlmutter compile update
portelli synced commits to refs/heads/develop at portelli/Grid from mirror 2026-04-21 19:16:13 +01:00
5ce270f1de Adding Claude related files
af43b067a0 New CLAUDE controllable visualiser
portelli synced commits to refs/heads/develop at portelli/Grid from mirror 2026-04-03 02:26:09 +01:00
34b44d1fee New file for animation in MD time direction
portelli synced commits to refs/heads/KS_shifted at portelli/Grid from mirror 2026-03-24 04:56:10 +00:00
09aa843984 Changed batchedInnerProduct for portability
24752002fa Verbosity reduction batched inner product for reorthogonalization