Daniel Richtmann
0b6fd20c54
Enable memory coalescing in clover term generation
2022-02-01 23:09:06 +01:00
Daniel Richtmann
e83423fee6
Refactor clover to align with other files and prepare for upcoming changes
2022-02-01 23:09:06 +01:00
RJHudspith
0bd83cdbda
Fixes for Nc!=3 Nersc IO, Gauge and Gauge_NCxNC compatible with GLU. Trace normalisation changed in places removing explicit threes. Guards against non-su3 tests and tests failing when LIME is not compiled.
2021-11-28 21:51:03 +01:00
Peter Boyle
195ab2888d
Merge branch 'develop' into develop
2021-10-27 20:26:57 -04:00
Peter Boyle
ba7e371b90
Warning free compile on Tursa.
...
Hopefully got all reqd virtual dtors
2021-10-21 19:56:52 +01:00
Ed Bennett
f824d99059
update documentation for GenericHMCRunner
2021-10-18 09:50:16 +01:00
a976fa6746
expose gauge group in GImpl and generic Nc fix
2021-10-05 14:19:47 +01:00
Luchang Jin
4b24800132
AVX512 drop mixed precision as well
2021-09-15 16:29:47 -04:00
Christoph Lehner
3d0f88e702
A64FX drop mixed precision as well
2021-09-15 18:38:32 +02:00
Peter Boyle
86e33c8ab2
Significant GPU perf speed up finished
2021-09-14 16:14:23 +01:00
Peter Boyle
a7b943b33e
Remove half prec comms
2021-09-14 05:05:33 +01:00
Peter Boyle
7440cde92f
No half prec comms; coalesced access on GPU
2021-09-14 05:04:56 +01:00
Peter Boyle
0fc662bb24
Dirac cuda 11.4 happy ; force host for functions accessing mult table
...
ET runs these on host BEFORE lodging result in AST for kernel
2021-09-14 05:00:44 +01:00
Peter Boyle
4c88104a73
Fix compile warns
2021-09-11 23:08:05 +01:00
Peter Boyle
73b944c152
Drop half prec comms for now.
2021-09-11 23:07:18 +01:00
Peter Boyle
d1b0b7f5c6
Half prec comms dropping
2021-09-11 23:05:40 +01:00
Peter Boyle
381d8797d0
Drop half prec comms for now
2021-09-11 23:05:02 +01:00
Andrew Yong
770680669d
Whitespace removal.
2021-08-04 09:21:59 +01:00
Andrew Yong
0cdfc5cf22
Merge remote-tracking branch 'upstream/develop' into develop
2021-07-30 14:40:55 +01:00
u61464
8cfc7342cd
staggered hand unroll read coalesce
2021-05-05 14:17:18 -07:00
cf2923d5dd
Jamie's fix
2021-04-27 16:53:37 +01:00
009ccd581e
bugfix 3D stout smearing
2021-04-26 10:36:33 +01:00
54c6b1376d
Quick fix of conserved current implementation in CayleyFermion5D. Now function treats current insertion with appropriate periodic boundary conditions in the mu=3 direction.
2021-04-21 16:56:46 +01:00
f3f11b586f
Tadpole sign now in front of forward hopping term to be consistent with previous implementation and analytic form.
2021-04-17 12:44:27 +01:00
8083e3f7e8
Sign factor for tadpole implementation corrected.
2021-04-15 11:14:31 +01:00
895244ecc3
Merge with upstream; implemented conserved tadpole for Shamir action.
2021-04-06 13:46:33 +01:00
addeb621a7
Implemented tadpole operator for Shamir action.
2021-04-06 13:45:37 +01:00
Peter Boyle
bb89a82a07
Staggered coalesced read
2021-03-29 20:01:15 +02:00
Peter Boyle
9c2b37218a
sRNG parameter added
2021-03-18 06:24:11 -04:00
Peter Boyle
51f506553c
Read out the local ID once, and store
2021-03-12 15:33:04 +01:00
u61464
0e21adb3f6
Gives 200GF/s on SyCL/DG1 8^4, doesn't uglify develop for other platforms too badly.
...
Easy to revert to cleaner, more stylistic C++ code. There's a SYCL_HACK macro I will clean up later once dpcpp
evolves a central nervous system.
2021-03-10 05:40:51 -08:00
Peter Boyle
a9604367c1
Merge pull request #336 from lehner/feature/gpt
...
Make ShmDims configurable; adjust GRID_MAX_SIMD to allow for 128 byte width on GPUs
2021-03-05 13:17:19 -05:00
7a19432e0b
whitespace
2021-03-05 10:57:09 +00:00
9b15704290
tested and consistent
2021-03-05 10:42:32 +00:00
3b06e4655e
Merge branch 'develop' into feature/XiToSigma
2021-03-04 20:06:16 +00:00
d4b4de8f42
changes
2021-03-04 20:01:24 +00:00
Peter Boyle
c90beee774
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2021-03-03 23:50:29 +01:00
Peter Boyle
1eea9d73b9
Pass serial RNG around
2021-03-03 23:50:01 +01:00
u61464
679d1d22f7
Sycl happier
2021-03-03 11:21:43 -08:00
Peter Boyle
442336bd96
Hand unrolled to use optimised code paths on GPU for coalesced reads in Wilson case.
...
Other cases to do. This now includes comms code path.
2021-03-02 14:50:51 +01:00
Christoph Lehner
9c9566b9c9
Merge pull request #23 from paboyle/develop
...
Sync
2021-03-01 12:33:51 +01:00
Christopher Kelly
c791cb2214
Merge branch 'develop' into feature/link-update-mask
2021-02-23 11:51:54 -05:00
Christopher Kelly
d5ab571a89
Added the ability to apply a custom "filter" to the conjugate momentum in the Integrator classes, applied both after refresh and after applying the forces
...
Added a conjugate momentum "filter" that applies a phase to each site. With sites set to 1.0 or 0.0 this acts as a mask and enables, for example, the freezing of inactive gauge links in DDHMC
Added tests/forces/Test_momentum_filter demonstrating the use of the filter to freeze boundary links
2021-02-23 11:49:56 -05:00
0ed800f6e4
merge develop
2021-02-23 14:54:46 +00:00
Peter Boyle
0a32183825
Merge pull request #335 from felixerben/gpu/baryons
...
Gpu/baryons
2021-02-23 09:30:16 -05:00
Daniel Richtmann
e3d019bc2f
Enable performance counting in WilsonFermion like in others
2021-02-22 15:25:40 +01:00
7ae030f585
changed back A2AUtils warning
2021-02-18 13:24:50 +00:00
86b58d5aff
changed if and accelerator_for - no runtime errors any more
2021-02-18 12:04:32 +00:00
Peter Boyle
eda9ab487b
MADWF 5d source option for hadrons - look at Grid of source
...
Abort on GPU error
2021-02-08 10:47:22 -05:00
9b9a53f870
...
2021-02-02 13:06:43 +00:00