6240e02619
added assertion to avoid potential infinite loop
2020-04-27 18:50:53 +01:00
f4033ad8cb
baryon speedup by a factor 2
2020-04-27 17:46:14 +01:00
c2c3cad20d
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2020-04-23 04:35:42 -04:00
edec9ee2e2
Conserved current rewrite done. Zmobius working
2020-04-23 04:34:01 -04:00
39b448affb
Merge remote-tracking branch 'origin/develop' into feature/a64fx-2
2020-04-22 17:34:12 +02:00
181709bba4
Merge branch 'develop' into feature/zmobius_paramcompute
2020-04-20 09:12:34 -04:00
64b72fc17f
testing gcc 10.0.1: build errors in Exchange1 using -DA64FX and in Lattice_base.h building Dslash only
2020-04-19 01:25:40 +02:00
6fdce60492
revised BodyA64FX; 990 GiB/s Wilson, 687 GiB/s DW using intrinsics (armclang 20.0)
2020-04-16 22:43:32 +02:00
0475c46ecb
Merge pull request #256 from djm2131/feature/BiCGSTAB
Import BiCGSTAB solvers and tests
2020-04-16 11:45:15 -04:00
327da332bb
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/gpt
2020-04-16 11:30:17 -04:00
6504a098cc
999 GiB/s Wilson; 694 GiB/s DW (DP)
2020-04-15 15:06:52 +02:00
c12a67030a
980 GiB/s Wilson; 680 GiB/s DW (DP)
2020-04-15 10:55:06 +02:00
581392f2f2
now with pf, best results so far using intrinsics+pf
2020-04-12 22:06:14 +02:00
113f277b6a
enable dslash asm using -DA64FXASM, additionally -DDSLASHINTRIN for intrinsics impl
2020-04-11 04:55:01 +02:00
974586bedc
Dslash finally works; cleaned up; uses MOVPRFX in assembly
2020-04-10 22:26:40 +02:00
8e81a811d0
Merge branch 'feature/hdcr' into develop
2020-04-10 11:14:49 -04:00
160f78c1e4
changed debug output to variable direct 3
2020-04-10 12:23:07 +02:00
7e4e1bbbc2
changed debug output to variable direct 2
2020-04-10 12:22:04 +02:00
e699b7e9f9
changed debug output to variable direct
2020-04-10 12:18:30 +02:00
a28bc0de90
debug register address test in WilsonHand
2020-04-10 12:07:45 +02:00
14d0fe4d6c
added predication in WilsonHand
2020-04-10 12:04:00 +02:00
0ad2e0815c
debug output in WilsonHand
2020-04-10 11:56:29 +02:00
dc9c8340bb
switched to DSLASHINTRIN for A64FX Dslash intrinsics
2020-04-09 23:30:23 +02:00
19eef97503
specialized A64FX Dslash kernels
2020-04-09 23:25:25 +02:00
5cdbb7e71e
fixed A64FX Dslash; compiles, but does not specialize -> assertion
2020-04-09 21:23:39 +02:00
86c9c4da8b
changes
2020-04-09 16:40:06 +02:00
bd310932f7
changes
2020-04-09 16:32:31 +02:00
77fa586f6c
introduced A64FX Wilson kernels
2020-04-09 13:30:06 +02:00
2c22db841a
Added momentum scaling to scalar HMC theories in order to follow UKQCD/CPS conventions
2020-04-02 17:38:47 +01:00
b6cbdd2aa3
Merge pull request #1 from DanielRichtmann/feature/read-openqcd
Feature/read openqcd
2020-03-26 17:39:04 +01:00
a2188ea875
remove debugging printf from WilsonKernelsImplementation
2020-03-26 09:12:36 -04:00
989af65807
Check in parallel reader for openqcd configs
2020-03-24 11:20:54 +01:00
c9b737a4e7
make trace,adj,transpose unary operators
2020-03-16 17:58:30 -04:00
037bb6ea73
Check in reader for openqcd configs
This reader is suboptimal in the sense that it opens the entire config on every MPI rank.
2020-03-16 14:28:02 +01:00
7c061e20c9
All directions of Dirac operator for fast coarsening
2020-01-27 12:40:13 -05:00
e5d1c09665
Faster DhopDirAll for little dirac operator coarsening
2020-01-27 12:38:54 -05:00
8016a465ae
Remove extraneous variable
2020-01-27 12:35:37 -05:00
d8b9742092
DhopDirAll for faster matrix elements of little Dirac operator
2020-01-27 12:34:54 -05:00
96671bbb24
Added ability to pass callback to MADWF that is called every inner iteration and allows user to, for example, adjust the inner solver tolerance depending on residual
Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximations with optional restriction to even/odd polynomials
Added implementation of computation of ZMobius parameters
Added Test_zMADWF_prec to test ZMobius in MADWF
2020-01-17 12:45:30 -08:00
e583035614
Change to interface to minimise comms in evaluating coarse space operator
2020-01-06 11:43:59 -05:00
3c3d6a94f3
Optimising the force term a bit
2020-01-04 03:16:23 -05:00
039eb7b2eb
Make the force term and coarsening multigrid more optimised
2020-01-04 03:12:17 -05:00
f7373e97a4
Missing conjugate in MooeeInvDag
2019-12-16 10:05:50 +01:00
848079e8ba
Merge pull request #235 from grid-test-organisation/feature/5d-improvement
MooeeInv and M5D optimisations + enable threading with nvcc
2019-12-10 21:45:03 -05:00
4180a4a8a7
Import BiCGSTAB solvers and tests
2019-12-10 17:20:35 -05:00
6446671a9c
Merge pull request #241 from nils-asmussen/fix/remQCDns_ignore_ws
Undo whitespace changes in fix/removeQCDremnants to allow comparing relevant changes
2019-12-09 18:02:21 +00:00
9b6b0caa55
Junk commit fix
2019-12-09 03:01:58 -05:00
2a48617ac5
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2019-12-09 03:00:00 -05:00
3d2fe80780
Temporary size depends on checkerboard/uncheckerboard. The Mdir cares
2019-12-09 02:58:24 -05:00
f7698b93ca
corrected comments about quark line directions
2019-12-06 09:46:52 +00:00