Peter Boyle
551a5f8dc8
RRII gpu option
2022-10-11 14:44:55 -04:00
Peter Boyle
c82b164f6b
Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet
2022-10-04 17:41:48 -04:00
Peter Boyle
584a3ee45c
Merge pull request #412 from giltirn/patch/adaptive-wflow
...
Patch/adaptive wflow
2022-10-04 17:23:19 -04:00
Peter Boyle
eec0c9eb7d
Merge pull request #411 from giltirn/patch/dirichlet-fixes
...
Various fixes / changes
2022-10-04 17:22:01 -04:00
Peter Boyle
477ebf24f4
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2022-10-04 11:19:43 -07:00
Peter Boyle
0d5639f707
Run script update
2022-10-04 11:13:41 -07:00
Peter Boyle
413312f9a9
Benchmark the halo construction.
...
The byte counts are out and should be doubled for SIMD directions
2022-10-04 11:12:59 -07:00
Peter Boyle
03508448f8
Remove verbose
2022-10-04 11:12:15 -07:00
Peter Boyle
e1e5c75023
Stencil gather improvements - SVM was running slow and was used for a pointer array that didn't need to be in SVM
2022-10-04 11:11:10 -07:00
Peter Boyle
9296299b61
Better commenting
2022-10-04 11:10:34 -07:00
Christopher Kelly
66d001ec9e
Refactored Wilson flow class; previously the class implemented both iterative and adaptive smearing, but only the iterative method was accessible through the Smearing base class. The implementation of Smearing also forced a clunky need to pass iterative smearing parameters through the constructor but adaptive smearing parameters through the function call. Now there is a WilsonFlowBase class that implements common functionality, and separate WilsonFlow (iterative) and WilsonFlowAdaptive (adaptive) classes, both of which implement Smearing virtual functions.
...
Modified the Wilson flow adaptive smearing step size update to implement the original Ramos definition of the distance, where previously it used the norm of a difference which scales with the volume and so would choose too coarse or too fine steps depending on the volume. This is based on Chulwoo's code.
Added a test comparing adaptive (with tuneable tolerance) to iterative Wilson flow smearing on a random gauge configuration.
2022-10-03 10:59:38 -04:00
Peter Boyle
fad2f969d9
Summit up to date
2022-09-27 10:58:43 -04:00
Peter Boyle
48165c1dc1
Ticked off a few items
2022-09-27 10:58:00 -04:00
Peter Boyle
25df2d2c3b
Various precision options
2022-09-27 10:57:12 -04:00
Peter Boyle
af9ecb8b41
Current tests compiling
2022-09-27 10:56:55 -04:00
Peter Boyle
234324599e
Double2
2022-09-27 10:56:10 -04:00
Peter Boyle
97448a93dc
Double2 compiles and dslash runs
2022-09-27 10:55:25 -04:00
Peter Boyle
70c83ec3be
More instantiations
2022-09-27 10:54:23 -04:00
Peter Boyle
8f4e2ee545
Double2
2022-09-27 10:53:46 -04:00
Peter Boyle
e8bfbf2f7c
D2 operators
2022-09-27 10:37:45 -04:00
Peter Boyle
9e81b42981
D2 fields
2022-09-27 10:37:19 -04:00
Peter Boyle
6c9eef9726
D2 fields
2022-09-27 10:36:54 -04:00
Peter Boyle
7ffbc3e98e
Double2 improved. Really don't like 'convertType' - localise to a GPT header
2022-09-27 10:35:31 -04:00
Peter Boyle
68e4d833dd
Run through wrapper script
2022-09-23 16:49:29 -04:00
Peter Boyle
a2cefaa53a
Faster
2022-09-23 16:49:14 -04:00
Peter Boyle
a0d682687e
Better logging of Fdt for force gradient
2022-09-23 16:22:53 -04:00
Peter Boyle
eb552c3ecd
dt info
2022-09-23 16:22:28 -04:00
Peter Boyle
97cce103d7
Tolerances control
2022-09-23 16:21:49 -04:00
Peter Boyle
87ac7104f8
Prettier
2022-09-23 16:20:46 -04:00
Peter Boyle
e4c117aabf
Compile fix, multishift mixed prec support
2022-09-23 16:19:27 -04:00
Peter Boyle
5b128a6f9f
MixedPrec Multishift with better precision scheme for GPU
2022-09-23 16:18:47 -04:00
Christopher Kelly
19da647e3c
Added support for non-periodic gauge field implementations in the random gauge shift performed at the start of the HMC trajectory
...
(The above required exposing the gauge implementation to the HMC class through the Integrator class)
Made the random shift optional (default on) through a parameter in HMCparameters
Modified ConjugateBC::CshiftLink such that it supports any shift in -L < shift < L rather than just +-1
Added a tester for the BC-respecting Cshift
Fixed a missing system header include in SSE4 intrinsics wrapper
Fixed sumD_cpu for single-prec types, which performed an incorrect conversion to a single-prec data type at the end that fails to compile on some systems
2022-09-09 12:47:09 -04:00
Peter Boyle
1713de35c0
Improved config flags
2022-09-05 21:50:02 -04:00
Peter Boyle
1177b8f661
Merge branch 'develop' into feature/dirichlet
2022-08-31 19:05:57 -04:00
Peter Boyle
442bfb3d42
Merge branch 'develop' into feature/dirichlet
2022-08-31 19:04:19 -04:00
Peter Boyle
e7d9b75fdd
Warning fixes
2022-08-31 19:01:14 -04:00
Peter Boyle
3d0e3ec363
Tracing
2022-08-31 18:31:46 -04:00
Peter Boyle
3c1c51f9aa
Merge branch 'feature/dirichlet-gparity' into feature/dirichlet
2022-08-31 18:25:34 -04:00
Peter Boyle
8cc3c522c3
Merge pull request #409 from giltirn/feature/dirichlet-gparity-stage
...
Import round 5
2022-08-31 18:22:50 -04:00
Peter Boyle
913fbca74a
Merge pull request #410 from gkanwar/photon_and_sha_patches
...
Photon.h and SHA256 patches
2022-08-31 18:01:45 -04:00
Peter Boyle
5c87342108
Used in g-2 sign off
2022-08-31 17:35:32 -04:00
Peter Boyle
66177bfbe2
Used in g-2 sign off
2022-08-31 17:35:07 -04:00
Peter Boyle
5205e68963
RocTX, NVTX, text based self profiling
2022-08-31 17:34:09 -04:00
Peter Boyle
cd5cf6d614
Tracing replaces self timing hooks
2022-08-31 17:33:41 -04:00
Peter Boyle
5abb19eab0
Remove self timing
2022-08-31 17:32:49 -04:00
Peter Boyle
06d7b88c78
Force reporting improved
2022-08-31 17:32:21 -04:00
Peter Boyle
cf72799735
Better action naming
2022-08-31 17:24:11 -04:00
Peter Boyle
cdb8fcc269
Width=4 support. This is too broad; hit it on physical point run.
...
Need to change strategy, I think.
2022-08-31 17:21:33 -04:00
Peter Boyle
b4f4130901
Defer SMP node links until after interior. Allows for DMA overlapping compute
2022-08-31 17:20:21 -04:00
Peter Boyle
bb049847d5
Tracing replaces self timing
2022-08-31 17:19:02 -04:00