Christopher Kelly
181709bba4
Merge branch 'develop' into feature/zmobius_paramcompute
2020-04-20 09:12:34 -04:00
nils meyer
64b72fc17f
testing gcc 10.0.1: build errors in Exchange1 using -DA64FX and in Lattice_base.h building Dslash only
2020-04-19 01:25:40 +02:00
nils meyer
6fdce60492
revised BodyA64FX; 990 GiB/s Wilson, 687 GiB/s DW using intrinsics (armclang 20.0)
2020-04-16 22:43:32 +02:00
Christoph Lehner
327da332bb
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/gpt
2020-04-16 11:30:17 -04:00
nils meyer
6504a098cc
999 GiB/s Wilson; 694 GiB/s DW (DP)
2020-04-15 15:06:52 +02:00
nils meyer
c12a67030a
980 GiB/s Wilson; 680 GiB/s DW (DP)
2020-04-15 10:55:06 +02:00
nils meyer
581392f2f2
now with pf, best results so far using intrinsics+pf
2020-04-12 22:06:14 +02:00
nils meyer
113f277b6a
enable dslash asm using -DA64FXASM, additionally -DDSLASHINTRIN for intrinsics impl
2020-04-11 04:55:01 +02:00
nils meyer
974586bedc
Dslash finally works; cleaned up; uses MOVPRFX in assembly
2020-04-10 22:26:40 +02:00
Peter Boyle
8e81a811d0
Merge branch 'feature/hdcr' into develop
2020-04-10 11:14:49 -04:00
nmeyer-ur
160f78c1e4
changed debug output to variable direct 3
2020-04-10 12:23:07 +02:00
nmeyer-ur
7e4e1bbbc2
changed debug output to variable direct 2
2020-04-10 12:22:04 +02:00
nmeyer-ur
e699b7e9f9
changed debug output to variable direct
2020-04-10 12:18:30 +02:00
nmeyer-ur
a28bc0de90
debug register address test in WilsonHand
2020-04-10 12:07:45 +02:00
nmeyer-ur
14d0fe4d6c
added predication in WilsonHand
2020-04-10 12:04:00 +02:00
nmeyer-ur
0ad2e0815c
debug output in WilsonHand
2020-04-10 11:56:29 +02:00
nils meyer
dc9c8340bb
switched to DSLASHINTRIN for A64FX Dslash intrinsics
2020-04-09 23:30:23 +02:00
nils meyer
19eef97503
specialized A64FX Dslash kernels
2020-04-09 23:25:25 +02:00
nils meyer
5cdbb7e71e
fixed A64FX Dslash; compiles, but does not specialize -> assertion
2020-04-09 21:23:39 +02:00
nmeyer-ur
86c9c4da8b
changes
2020-04-09 16:40:06 +02:00
nmeyer-ur
bd310932f7
changes
2020-04-09 16:32:31 +02:00
nmeyer-ur
77fa586f6c
introduced A64FX Wilson kernels
2020-04-09 13:30:06 +02:00
2c22db841a
Added momentum scaling to scalar HMC theories in order to follow UKQCD/CPS conventions
2020-04-02 17:38:47 +01:00
Christoph Lehner
a2188ea875
remove debugging printf from WilsonKernelsImplementation
2020-03-26 09:12:36 -04:00
Christoph Lehner
c9b737a4e7
make trace,adj,transpose unary operators
2020-03-16 17:58:30 -04:00
Peter Boyle
7c061e20c9
All directions of Dirac operator for fast coarsening
2020-01-27 12:40:13 -05:00
Peter Boyle
e5d1c09665
Faster DhopDirAll for little dirac operator coarsening
2020-01-27 12:38:54 -05:00
Peter Boyle
8016a465ae
Remove extraneous variable
2020-01-27 12:35:37 -05:00
Peter Boyle
d8b9742092
DhopDirAll for faster matrix elements of little Dirac operator
2020-01-27 12:34:54 -05:00
Christopher Kelly
96671bbb24
Added ability to pass callback to MADWF that is called every inner iteration and allows user to, for example, adjust the inner solver tolerance depending on residual
Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximation with optional restriction to even/odd polynomials
Added implementation of computation of ZMobius parameters
Added Test_zMADWF_prec to test ZMobius in MADWF
2020-01-17 12:45:30 -08:00
Peter Boyle
e583035614
Change to interface to minimise comms in evaluating coarse space operator
2020-01-06 11:43:59 -05:00
Peter Boyle
3c3d6a94f3
Optimising the force term a bit
2020-01-04 03:16:23 -05:00
Peter Boyle
039eb7b2eb
Make the force term and coarsening multigrid more optimised
2020-01-04 03:12:17 -05:00
gfilaci
f7373e97a4
Missing conjugate in MooeeInvDag
2019-12-16 10:05:50 +01:00
Peter Boyle
848079e8ba
Merge pull request #235 from grid-test-organisation/feature/5d-improvement
MooeeInv and M5D optimisations + enable threading with nvcc
2019-12-10 21:45:03 -05:00
Peter Boyle
9b6b0caa55
Junk commit fix
2019-12-09 03:01:58 -05:00
Peter Boyle
2a48617ac5
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2019-12-09 03:00:00 -05:00
Peter Boyle
3d2fe80780
Temporary size depends on checkerboard/uncheckerboard. The Mdir cares
2019-12-09 02:58:24 -05:00
Michael Marshall
803329af99
Merge branch 'develop' into feature/distil
* develop:
Fix after GPU merge: Phase in Free Propagator
z2-momentum phase module
# Conflicts:
# Hadrons/Modules/MSource/MomentumPhase.hpp
2019-10-07 13:09:52 +01:00
5f22810f55
Fix after GPU merge: Phase in Free Propagator
2019-10-02 14:49:35 +01:00
Michael Marshall
2e963d1a78
Fix location of Grid.h and remove reference to QCD namespace
2019-09-16 15:34:47 +01:00
gfilaci
a7fa86dc29
MooeeInv improvement for DW EOFA + comments
2019-09-05 12:05:21 +01:00
gfilaci
fdd9b14e82
speed up MooeeInvDag for DWF EOFA
2019-09-02 14:49:51 +01:00
gfilaci
e66669d300
fast MooeeInv for EOFA
2019-09-02 14:26:13 +01:00
gfilaci
0efaf3c4fa
access M5D coeffs through pointers
2019-09-02 11:33:00 +01:00
gfilaci
3ef519aaa4
fast MooeeInv
2019-09-02 11:18:14 +01:00
Peter Boyle
e279b2be29
Merge develop
2019-08-14 23:01:59 +01:00
Peter Boyle
48e6efc7c9
Merge branch 'develop' into feature/gpu-port
Conflicts:
Grid/qcd/action/fermion/WilsonKernelsAsm.cc
Grid/qcd/action/fermion/implementation/ImprovedStaggeredFermionImplementation.h
Grid/qcd/action/fermion/implementation/StaggeredKernelsAsm.h
benchmarks/Benchmark_comms.cc
2019-08-14 18:56:54 +01:00
Peter Boyle
53e3ab4131
Fix force term
2019-08-11 11:06:13 +01:00
Peter Boyle
8c6016f717
Merge pull request #219 from mmphys/feature/include
Housekeeping. #include <Grid.h> ---> #include <Grid/Grid.h>
2019-07-29 23:08:01 +01:00