mirror of https://github.com/paboyle/Grid.git synced 2025-04-07 12:45:55 +01:00

552 Commits

Author SHA1 Message Date
nils meyer
c12a67030a 980 GiB/s Wilson; 680 GiB/s DW (DP) 2020-04-15 10:55:06 +02:00
nils meyer
581392f2f2 now with pf, best results so far using intrinsics+pf 2020-04-12 22:06:14 +02:00
nils meyer
113f277b6a enable dslash asm using -DA64FXASM, additionaly -DDSLASHINTRIN for intrinsics impl 2020-04-11 04:55:01 +02:00
nils meyer
974586bedc Dslash finally works; cleaned up; uses MOVPRFX in assembly 2020-04-10 22:26:40 +02:00
Peter Boyle
8e81a811d0 Merge branch 'feature/hdcr' into develop 2020-04-10 11:14:49 -04:00
nmeyer-ur
160f78c1e4 changed debug output to variable direct 3 2020-04-10 12:23:07 +02:00
nmeyer-ur
7e4e1bbbc2 changed debug output to variable direct 2 2020-04-10 12:22:04 +02:00
nmeyer-ur
e699b7e9f9 changed debug output to variable direct 2020-04-10 12:18:30 +02:00
nmeyer-ur
a28bc0de90 debug register address test in WilsonHand 2020-04-10 12:07:45 +02:00
nmeyer-ur
14d0fe4d6c added predication in WilsonHand 2020-04-10 12:04:00 +02:00
nmeyer-ur
0ad2e0815c debug output in WilsonHand 2020-04-10 11:56:29 +02:00
nils meyer
dc9c8340bb switched to DSLASHINTRIN for A64FX Dslash intrinsics 2020-04-09 23:30:23 +02:00
nils meyer
19eef97503 specialized A64FX Dslash kernels 2020-04-09 23:25:25 +02:00
nils meyer
5cdbb7e71e fixed A64FX Dslash; compiles, but does not specialize -> assertion 2020-04-09 21:23:39 +02:00
nmeyer-ur
86c9c4da8b changes 2020-04-09 16:40:06 +02:00
nmeyer-ur
bd310932f7 changes 2020-04-09 16:32:31 +02:00
nmeyer-ur
77fa586f6c introduced A64FX Wilson kernels 2020-04-09 13:30:06 +02:00
2c22db841a Added momentum scaling to scalar HMC theories in order to follow UKQCD/CPS conventions 2020-04-02 17:38:47 +01:00
Christoph Lehner
b6cbdd2aa3
Merge pull request #1 from DanielRichtmann/feature/read-openqcd
Feature/read openqcd
2020-03-26 17:39:04 +01:00
Christoph Lehner
a2188ea875 remove debugging printf from WilsonKernelsImplementation 2020-03-26 09:12:36 -04:00
Daniel Richtmann
989af65807
Check in parallel reader for openqcd configs 2020-03-24 11:20:54 +01:00
Christoph Lehner
c9b737a4e7 make trace,adj,transpose unary operators 2020-03-16 17:58:30 -04:00
Daniel Richtmann
037bb6ea73
Check in reader for openqcd configs
This reader is suboptimal in the sense that it opens the entire config on every MPI rank.
2020-03-16 14:28:02 +01:00
Peter Boyle
7c061e20c9 All directions of dirac operator for fastt coarsening 2020-01-27 12:40:13 -05:00
Peter Boyle
e5d1c09665 Faster DhopDirAll for little dirac operator coarsening 2020-01-27 12:38:54 -05:00
Peter Boyle
8016a465ae Remove extraneous variable 2020-01-27 12:35:37 -05:00
Peter Boyle
d8b9742092 DhopDirAll for faster matrix elements of little Dirac operator 2020-01-27 12:34:54 -05:00
Christopher Kelly
96671bbb24 Added ability to pass callback to MADWF that is called every inner iteration and allows user to, for example, adjust the inner solver tolerance depending on residual
Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximation with optional restriction to even/odd polynomials
Added implementation of computation of ZMobius parameters
Added Test_zMADWF_prec to test ZMobius in MADWF
2020-01-17 12:45:30 -08:00
Peter Boyle
e583035614 Change to interface to minise comms in evaluating coarse space operator 2020-01-06 11:43:59 -05:00
Peter Boyle
3c3d6a94f3 OPtimising the force term a bit 2020-01-04 03:16:23 -05:00
Peter Boyle
039eb7b2eb Make the force term and coarsening multigrid more optimised 2020-01-04 03:12:17 -05:00
gfilaci
f7373e97a4 Missing conjugate in MooeeInvDag 2019-12-16 10:05:50 +01:00
Peter Boyle
848079e8ba
Merge pull request #235 from grid-test-organisation/feature/5d-improvement
MooeeInv and M5D optimisations + enable threading with nvcc
2019-12-10 21:45:03 -05:00
David Murphy
4180a4a8a7 Import BiCGSTAB solvers and tests 2019-12-10 17:20:35 -05:00
6446671a9c
Merge pull request #241 from nils-asmussen/fix/remQCDns_ignore_ws
Undo whitespace changes in fix/removeQCDremnants to allow comparing relevant changes
2019-12-09 18:02:21 +00:00
Peter Boyle
9b6b0caa55 Junk commit fix 2019-12-09 03:01:58 -05:00
Peter Boyle
2a48617ac5 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2019-12-09 03:00:00 -05:00
Peter Boyle
3d2fe80780 Temporary size depends on checkerboard/uncheckerboard. The Mdir cares 2019-12-09 02:58:24 -05:00
ferben
f7698b93ca corrected comments about quark line directions 2019-12-06 09:46:52 +00:00
ferben
a54157e682 more definitions changed 2019-12-05 17:08:09 +00:00
ferben
b766038810 new syntax after merge 2019-12-04 18:08:00 +00:00
ferben
cd9fd80a5d merged in develop 2019-12-04 17:12:46 +00:00
ferben
e940f4db7e removed unused parameter parity 2019-12-03 12:01:31 +00:00
Michael Marshall
7983ff2fdd Merge branch 'develop' into feature/distil
* develop:
  Change to reporting
  NVCC timer support
  Fix nocompilee under NVCC
  --enable-summit flag
  IBM summit optimisation. Synchronise in node is still btweeen 2 halves of AC922, so could be a little faster
  Sliced propagator contraction was not producing any results because buf.size()=0
  several typos in hadrons
2019-11-30 16:47:03 +00:00
Michael Marshall
2db814f2b7 Resolve conflicts in BaryonUtils (just use latest from develop) 2019-11-29 18:19:35 +00:00
799ff0c96e speed-up 2019-11-26 15:28:47 +00:00
5fd5c25114 now two seperate functions for Eye and NonEye 2019-11-26 13:44:55 +00:00
Peter Boyle
feb1ff3494 Fix nocompilee under NVCC 2019-11-21 20:03:39 +00:00
ferben
421a4395af Sigma to Nucleon contractions 2019-11-21 17:25:37 +00:00
Michael Marshall
22c654182a Fixes for GPU compile 2019-11-04 17:24:34 +00:00