8083e3f7e8
Sign factor for tadpole implementation corrected.
2021-04-15 11:14:31 +01:00
895244ecc3
Merge with upstream; implemented conserved tadpole for Shamir action.
2021-04-06 13:46:33 +01:00
addeb621a7
Implemented tadpole operator for Shamir action.
2021-04-06 13:45:37 +01:00
Peter Boyle
bb89a82a07
Staggered coalesced read
2021-03-29 20:01:15 +02:00
Peter Boyle
9c2b37218a
sRNG parameter added
2021-03-18 06:24:11 -04:00
Peter Boyle
51f506553c
Read out the local ID once, and store
2021-03-12 15:33:04 +01:00
u61464
0e21adb3f6
Gives 200GF/s on SyCL/DG1 8^4, doesn't uglify develop for other platforms too badly.
Easy to revert to cleaner, more stylistic C++ code. There's a SYCL_HACK macro I will clean up later once dpcpp
evolves a central nervous system.
2021-03-10 05:40:51 -08:00
Peter Boyle
a9604367c1
Merge pull request #336 from lehner/feature/gpt
Make ShmDims configurable; adjust GRID_MAX_SIMD to allow for 128 byte width on GPUs
2021-03-05 13:17:19 -05:00
7a19432e0b
whitespace
2021-03-05 10:57:09 +00:00
9b15704290
tested and consistent
2021-03-05 10:42:32 +00:00
3b06e4655e
Merge branch 'develop' into feature/XiToSigma
2021-03-04 20:06:16 +00:00
d4b4de8f42
changes
2021-03-04 20:01:24 +00:00
Peter Boyle
c90beee774
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2021-03-03 23:50:29 +01:00
Peter Boyle
1eea9d73b9
Pass serial RNG around
2021-03-03 23:50:01 +01:00
u61464
679d1d22f7
Sycl happier
2021-03-03 11:21:43 -08:00
Peter Boyle
442336bd96
Hand unrolled to use optimised code paths on GPU for coalesced reads in Wilson case.
Other cases to do. This now includes comms code path.
2021-03-02 14:50:51 +01:00
Christoph Lehner
9c9566b9c9
Merge pull request #23 from paboyle/develop
Sync
2021-03-01 12:33:51 +01:00
Christopher Kelly
c791cb2214
Merge branch 'develop' into feature/link-update-mask
2021-02-23 11:51:54 -05:00
Christopher Kelly
d5ab571a89
Added the ability to apply a custom "filter" to the conjugate momentum in the Integrator classes; it is applied both after the refresh and after applying the forces
Added a conjugate momentum "filter" that applies a phase to each site. With sites set to 1.0 or 0.0 this acts as a mask and enables, for example, freezing inactive gauge links in DDHMC
Added tests/forces/Test_momentum_filter demonstrating the use of the filter to freeze boundary links
2021-02-23 11:49:56 -05:00
0ed800f6e4
merge develop
2021-02-23 14:54:46 +00:00
Peter Boyle
0a32183825
Merge pull request #335 from felixerben/gpu/baryons
Gpu/baryons
2021-02-23 09:30:16 -05:00
Daniel Richtmann
e3d019bc2f
Enable performance counting in WilsonFermion like in others
2021-02-22 15:25:40 +01:00
7ae030f585
changed back A2AUtils warning
2021-02-18 13:24:50 +00:00
86b58d5aff
changed if and accelerator_for - no runtime errors any more
2021-02-18 12:04:32 +00:00
Peter Boyle
eda9ab487b
MADWF 5d source option for hadrons - look at Grid of source
Abort on GPU error
2021-02-08 10:47:22 -05:00
9b9a53f870
...
2021-02-02 13:06:43 +00:00
a673b6a54d
prettify
2021-01-28 14:15:09 +00:00
1bf2e4d187
Merge branch 'develop' into gpu/baryons
2021-01-27 21:17:37 +00:00
81d88d9f4d
fixes
2021-01-27 21:09:51 +00:00
Peter Boyle
69f1f04f74
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2021-01-21 21:39:59 -05:00
Peter Boyle
ff1fa98808
Fix for GPU conserved current
2021-01-21 21:38:23 -05:00
df16202865
weird bug in 2pt function...
2021-01-19 19:25:27 +00:00
3ff7c2c02a
Merge branch 'develop' into gpu/baryons
2021-01-19 12:34:13 +00:00
8bfa0e74f8
final version, tested on CPU and GPU
2021-01-19 12:27:57 +00:00
9b73a937e7
bugfix
2021-01-18 18:57:05 +00:00
Peter Boyle
56111bb823
Merge branch 'develop' into feature/conjugate-bc-dirs
2021-01-14 21:01:22 -05:00
Peter Boyle
99445673f6
Gparity fix, and plaquette IO
2021-01-14 21:00:36 -05:00
Peter Boyle
d8fa903b02
G5 on coarse spaces
2021-01-14 20:47:28 -05:00
Peter Boyle
eaff0f3aeb
Gamma5 on coarse spaces
2021-01-14 20:46:58 -05:00
Peter Boyle
e8e20c01b2
Coarsened vector test
2021-01-14 20:46:21 -05:00
fa12b9a329
bugfix
2021-01-13 10:04:17 +00:00
45fc7ded3a
test for sum
2021-01-12 09:10:37 +00:00
74de2d9742
whitespace changes
2021-01-08 18:28:36 +00:00
e759367d42
tested and working
2021-01-08 18:04:50 +00:00
Christoph Lehner
299d0de066
Merge pull request #21 from paboyle/develop
...
Sync
2020-12-22 20:59:15 +01:00
Nils Meyer
45d49d8648
clean up
2020-12-19 03:35:18 +01:00
Nils Meyer
3f9ae6e7e7
Merge branch 'develop' into feature/a64fx-3
2020-12-19 02:37:11 +01:00
Nils Meyer
4dd9e39e0d
up to +36% performance gain for dslash/dwf on QPACE 4 using GCC 10.1.1
2020-12-19 00:54:31 +01:00
f36d6f3923
compiles on GPU. 3pt still wrong!!!!
2020-12-17 17:04:08 +00:00
Michael Marshall
873519e960
Enable existing conserved current code for CUDA (compiles OK for CUDA 10.1). Add option to Test_cayley_mres to load a configuration
2020-12-14 16:06:10 +00:00