c4d9aa1a21
Config command that makes GPT happier
2025-02-27 20:12:49 +00:00
6ae809ed40
Print not liked on GPT compile
2025-02-27 20:12:49 +00:00
Peter Boyle
311e2aab3f
Update Accelerator.h
2025-02-26 11:42:52 -05:00
438dfbdb83
Only throw if there is a pending list entry in CommsComplete
2025-02-25 16:57:27 +00:00
b2ce760cf4
Verbose issue with GPT
2025-02-25 16:55:23 +00:00
Muhammad Asif
b1ba209696
Latest upstream with np-su3 patch and modified Sp_WilsonFunfFermionGauge test to be small (#22)
Co-authored-by: Mashy Green <mashy@me.com>
merging no-su3 patch
2025-02-24 11:38:42 +00:00
Muhammad Asif
cb3e529b1e
Merge branch 'paboyle:develop' into develop
2025-02-24 11:29:09 +00:00
Mashy Green
717f647418
added the WilsonFlow patch from upstream PR #471
2025-02-24 08:41:31 +00:00
Mashy Green
98e7418187
Merge remote-tracking branch 'upstream/develop' into gauge_action_deriv
2025-02-24 08:33:05 +00:00
Mashy Green
fe05bf48b1
Improvements to WilsonGaugeAction deriv function (#16)
* patched version + modifications to deriv -> staple in qcd/gauge
* Cleaning up and aligning variable naming between action deriv versions
* Removing the regression test files that were also in this branch for a clean PR
* Reverting whitespace changes
* Fixing after reverting too much!
---------
Co-authored-by: Mashy Green <mashy@me.com>
2025-02-17 18:52:04 +00:00
Mashy Green
d2dd8f54e2
Fixing after reverting too much!
2025-02-17 17:32:27 +00:00
Mashy Green
7726ee4b16
Reverting whitespace changes
2025-02-17 17:16:28 +00:00
ba9bbe0221
Bounce MPI through host
2025-02-12 19:34:59 +00:00
4c3dd82d84
CSHIFT with bounce through host memory on MPI packets
2025-02-12 19:09:53 +00:00
44e911b5b7
Comment change
2025-02-12 17:37:55 +00:00
a7a16df9d0
GET rather than PUT has a kinder barrier sequence for NVLINK-type access, since
once the GET is done I can use the data without a barrier. Moves a barrier to a
nicer place, overlapped with DtoH DMA
2025-02-12 14:59:28 +00:00
382e0abefd
Was issuing a double fence -- the gather also fences
2025-02-12 14:57:28 +00:00
6fdefe5b90
Barrier sequencing if doing "GET" not "PUT" is different.
This is somewhat better timing for Barriers
2025-02-12 14:55:20 +00:00
4788dd8e2e
More states in packet progression for non-GPU-aware MPI
2025-02-12 14:53:57 +00:00
1cc5f221f3
GET not put ordering is better as I know when I've got all MY data
2025-02-12 14:53:05 +00:00
93251bfba0
GET not put for better ordering in the downstream dependent kernels -- I
know when I'm done, so we can move a barrier / handshake between ranks
intranode to a point off critical path
2025-02-12 14:50:21 +00:00
18b79508b8
New line better for pretty print
2025-02-12 14:49:48 +00:00
4de5ed1613
Remove vector view. The std::vector will not inform Memory manager of
deletion and so a stale entry could be left. It is not and should not be
used.
2025-02-12 14:48:46 +00:00
0baaddbe98
Pipeline mode commit on Aurora. 5+ TF/s on 16^3x32 per tile at 384
nodes.
More concurrency/fine grained scheduling is possible.
2025-02-04 19:27:26 +00:00
Mashy Green
355ec76257
Merge pull request #18 from UCL-ARC/bugfix/nvtx
Bugfix/nvtx
2025-02-03 11:05:42 +00:00
b50fb34e71
Perf on Aurora
2025-02-01 18:39:34 +00:00
de84d730ff
Fastest run config on Aurora to date
2025-02-01 18:08:40 +00:00
Peter Boyle
c74d11e3d7
PVdagM MG
2025-02-01 11:04:13 -05:00
Christoph Lehner
84cab5e6e7
no comms and log cleanup
2025-02-01 16:37:21 +01:00
c4fc972fec
Merge branch 'feature/deprecate-uvm' into develop
2025-01-31 16:32:36 +00:00
8cf809e231
Best results on Aurora so far
2025-01-31 16:14:45 +00:00
94019a922e
Significantly better performance on Aurora without using pipeline mode
2025-01-30 16:36:46 +00:00
Mashy Green
4f17c8d081
Merge branch 'paboyle:develop' into bugfix/nvtx
2025-01-29 13:10:12 +00:00
Mashy Green
aaab753982
Reverting to older version of nvtx for Tursa support
2025-01-29 12:57:38 +00:00
d6b2727f86
Pipeline mode getting better -- 2 nodes @ 10TF/s per node on Aurora
2025-01-29 09:22:21 +00:00
74a4f43946
Optional host buffer bounce for non-CUDA-aware MPI
2025-01-28 15:22:46 +00:00
1caf8b0f86
Rename
2025-01-28 15:22:37 +00:00
Chulwoo Jung
570b72a47b
Bugfix. Sorry!
2025-01-21 15:37:39 -05:00
Chulwoo Jung
a5798a89ed
Merge branch 'develop' into specflow
2025-01-21 12:13:24 -05:00
Peter Boyle
3f3661a86f
Heading towards PVdagM multigrid
2025-01-17 14:33:35 +00:00
Chulwoo Jung
f7e2f9a401
Checking in spectral flow and DWF/Mobius kernel eigenvalue measurement
2025-01-16 20:47:33 +00:00
Chulwoo Jung
2848a9b558
DWF kernel Lanczos working(?)
2025-01-16 01:29:56 +00:00
Mashy Green
d4868991af
Fixed wrong lib for NVTX in configure.ac and updated to nvtx3
2025-01-10 14:53:19 +00:00
Mashy Green
e99d42404e
Removing the regression test files that were also in this branch for a clean PR
2024-12-16 16:31:22 +00:00
Mashy Green
3ba019c747
Cleaning up and aligning variable naming between action deriv versions
2024-12-03 15:23:00 +00:00
Mashy Green
47429218bb
patched version + modifications to deriv -> staple in qcd/gauge
2024-11-27 16:29:22 +00:00
8fe429346f
Dslash testing for reproduce
2024-11-11 23:11:11 +00:00
Peter Boyle
5a4f9bf2e3
Force the ROCM version
2024-10-29 18:12:31 -04:00
Peter Boyle
b91fc1b6b4
Merge branch 'feature/boosted' into feature/deprecate-uvm
Fixed boosted free field test
2024-10-28 16:53:09 -04:00
Peter Boyle
eafc150034
Test fft asserts
2024-10-23 16:46:26 -04:00