Chulwoo Jung
dc6a38f177
Minor cleanup
2022-11-30 17:13:12 -05:00
Chulwoo Jung
82c1ecf60f
Block lanczos added
2022-11-30 16:08:40 -05:00
Peter Boyle
584a3ee45c
Merge pull request #412 from giltirn/patch/adaptive-wflow
Patch/adaptive wflow
2022-10-04 17:23:19 -04:00
Peter Boyle
eec0c9eb7d
Merge pull request #411 from giltirn/patch/dirichlet-fixes
Various fixes / changes
2022-10-04 17:22:01 -04:00
Christopher Kelly
66d001ec9e
Refactored the Wilson flow class. Previously the class implemented both iterative and adaptive smearing, but only the iterative method was accessible through the Smearing base class. The Smearing implementation also forced iterative smearing parameters to be passed through the constructor but adaptive smearing parameters through the function call. Now a WilsonFlowBase class implements the common functionality, and separate WilsonFlow (iterative) and WilsonFlowAdaptive (adaptive) classes both implement the Smearing virtual functions.
Modified the adaptive Wilson flow step-size update to implement the original Ramos definition of the distance. Previously it used the norm of a difference, which scales with the volume and so would choose steps that were too coarse or too fine depending on the volume. This is based on Chulwoo's code.
Added a test comparing adaptive (with tunable tolerance) to iterative Wilson flow smearing on a random gauge configuration.
2022-10-03 10:59:38 -04:00
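The volume dependence mentioned in commit 66d001ec9e can be illustrated with a minimal Python sketch. This is not the Grid implementation: it assumes a toy model in which every site carries an error of the same size, and it assumes the volume-independent distance takes the form of a maximum over sites. The names `global_norm_distance`, `max_site_distance` and `toy_errors` are hypothetical.

```python
import math

def global_norm_distance(site_errors):
    """L2 norm over the whole lattice; grows like sqrt(volume)."""
    return math.sqrt(sum(e * e for e in site_errors))

def max_site_distance(site_errors):
    """Maximum over sites; does not depend on the number of sites."""
    return max(abs(e) for e in site_errors)

def toy_errors(volume, size=1e-3):
    # Hypothetical toy model: identical error magnitude on every site.
    return [size] * volume

small = toy_errors(16)        # 16 sites
large = toy_errors(16 * 16)   # 256 sites, same per-site error
```

With the same per-site error, the global norm on the larger lattice is about four times that on the smaller one, while the per-site maximum is unchanged. A step-size controller that compares the distance to a fixed tolerance would therefore behave differently on the two volumes only when the global norm is used.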
Christopher Kelly
19da647e3c
Added support for non-periodic gauge field implementations in the random gauge shift performed at the start of the HMC trajectory
(This required exposing the gauge implementation to the HMC class through the Integrator class)
Made the random shift optional (default on) through a parameter in HMCparameters
Modified ConjugateBC::CshiftLink so that it supports any shift in -L < shift < L rather than just ±1
Added a tester for the BC-respecting Cshift
Fixed a missing system header include in the SSE4 intrinsics wrapper
Fixed sumD_cpu for single-precision types, which performed an incorrect conversion to a single-precision data type at the end and failed to compile on some systems
2022-09-09 12:47:09 -04:00
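The generalised shift described in commit 19da647e3c can be sketched in one dimension as follows. This is a toy illustration in Python, not Grid's `ConjugateBC::CshiftLink`: the function name `cshift_conjugate_bc`, the convention `out[x] = field[x + shift]`, and the use of plain complex numbers in place of gauge links are all assumptions made for the example.

```python
def cshift_conjugate_bc(field, shift):
    """Shift a 1D complex field so that out[x] = field[x + shift].

    Values fetched from across the boundary are complex-conjugated,
    mimicking charge-conjugation boundary conditions.
    """
    L = len(field)
    if not -L < shift < L:
        raise ValueError("shift must satisfy -L < shift < L")
    out = []
    for x in range(L):
        src = x + shift
        if 0 <= src < L:
            out.append(field[src])                   # interior: plain copy
        else:
            out.append(field[src % L].conjugate())   # crossed the boundary once
    return out
```

Because the shift is restricted to -L < shift < L, each source site crosses the boundary at most once, so a single conjugation is sufficient. In this toy model, applying a shift of 1 twice gives the same result as a single shift of 2.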
Peter Boyle
e7d9b75fdd
Warning fixes
2022-08-31 19:01:14 -04:00
Peter Boyle
3d0e3ec363
Tracing
2022-08-31 18:31:46 -04:00
Peter Boyle
3c1c51f9aa
Merge branch 'feature/dirichlet-gparity' into feature/dirichlet
2022-08-31 18:25:34 -04:00
Peter Boyle
8cc3c522c3
Merge pull request #409 from giltirn/feature/dirichlet-gparity-stage
Import round 5
2022-08-31 18:22:50 -04:00
Peter Boyle
5c87342108
Used in g-2 sign off
2022-08-31 17:35:32 -04:00
Peter Boyle
66177bfbe2
Used in g-2 sign off
2022-08-31 17:35:07 -04:00
Peter Boyle
5205e68963
RocTX, NVTX, text based self profiling
2022-08-31 17:34:09 -04:00
Peter Boyle
cd5cf6d614
Tracing replaces self timing hooks
2022-08-31 17:33:41 -04:00
Peter Boyle
5abb19eab0
Remove self timing
2022-08-31 17:32:49 -04:00
Peter Boyle
06d7b88c78
Force reporting improved
2022-08-31 17:32:21 -04:00
Peter Boyle
cf72799735
Better action naming
2022-08-31 17:24:11 -04:00
Peter Boyle
cdb8fcc269
Width=4 support. This is too broad; hit it on physical point run.
Need to change strategy, I think.
2022-08-31 17:21:33 -04:00
Peter Boyle
b4f4130901
Defer SMP node links until after interior. Allows for DMA overlapping compute
2022-08-31 17:20:21 -04:00
Peter Boyle
bb049847d5
Tracing replaces self timing
2022-08-31 17:19:02 -04:00
Peter Boyle
fd33c835dd
Feynman rule fix and tracing replaces self timing
2022-08-31 17:18:17 -04:00
Peter Boyle
21371a7e5b
Tracing replaces self timing
2022-08-31 17:16:05 -04:00
Peter Boyle
abfaa00d3e
Tracing replaces self timing
2022-08-31 17:15:24 -04:00
Peter Boyle
efee33c55d
Tracing replaces self timing
2022-08-31 17:14:57 -04:00
Peter Boyle
db0fe6ddbb
Tracing replaces self timing
2022-08-31 17:14:14 -04:00
Peter Boyle
8a9e647120
Tracing replaces self timing
2022-08-31 17:13:44 -04:00
Peter Boyle
e6dcb821ad
Tracing replaces self timing
2022-08-31 17:12:31 -04:00
Peter Boyle
9bff188f02
Tracing replaces self timing
2022-08-31 17:12:05 -04:00
Peter Boyle
111b30ca1d
Tracing replaces self timing
2022-08-31 17:11:48 -04:00
Peter Boyle
24182ca8bf
HIP allows conserved currents.
Tracing replaces self timing
2022-08-31 17:11:18 -04:00
Peter Boyle
ee2d7369b3
Tracing replaces self timing
2022-08-31 17:10:45 -04:00
Peter Boyle
7c686d29c9
Tracing replaces self timing
2022-08-31 17:10:17 -04:00
Peter Boyle
e8a0a1e75d
Tracing replaces self timing hooks
2022-08-31 17:09:47 -04:00
Peter Boyle
730be89abf
Remove timing hooks, as tracing replaces them
2022-08-31 17:08:44 -04:00
Peter Boyle
f991ad7d5c
Remove timing hooks, as tracing replaces them
2022-08-31 17:08:18 -04:00
Peter Boyle
b3f33f82f7
Decrease self timing hooks, use nvtx / roctx type tracing hooks instead
2022-08-31 17:06:47 -04:00
Peter Boyle
a34a6e059f
Logging improvement. Sinitial will be used to improve RHMC terms
2022-08-31 17:06:08 -04:00
Peter Boyle
1333319941
Tracing
2022-08-31 17:00:25 -04:00
Peter Boyle
9295ed8d20
Print full memory range
2022-08-31 16:59:51 -04:00
Peter Boyle
19cc7653fb
Tracing
2022-08-31 16:57:51 -04:00
Peter Boyle
5752538661
Tracing
2022-08-31 16:57:32 -04:00
Peter Boyle
ca40a1b00b
Tracing
2022-08-31 16:54:55 -04:00
Peter Boyle
659fac9dfb
Tracing hook
2022-08-31 16:54:25 -04:00
Peter Boyle
4dc3d6fce0
Buy into NVIDIA/ROCm etc. tracing.
2022-08-31 16:53:19 -04:00
Peter Boyle
95b640cb6b
10TF/s on 32^3 x 64 on single node
2022-08-04 15:43:52 -04:00
Peter Boyle
2cb5bedc15
Copy stream HIP improvements
2022-08-04 15:24:03 -04:00
Peter Boyle
806b02bddf
Simplify dead code
2022-08-04 15:23:13 -04:00
Peter Boyle
de40395773
More timing. Think I should start to use nvtx and roctx?
2022-08-04 13:37:16 -04:00
Peter Boyle
7ba4788715
Fix
2022-08-04 13:36:44 -04:00
Peter Boyle
06d9ce1a02
Synch ranks on node here for GPU - GPU memcopy
2022-08-04 13:35:56 -04:00