1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-15 02:05:37 +00:00
Commit Graph

6426 Commits

Author SHA1 Message Date
nmeyer-ur
09f0963d1f changes in configure.ac ; to be verified 2020-04-23 11:27:03 +02:00
nils meyer
6f44e3c192 reverted changes in configure.ac ; included SVE configure readme 2020-04-23 11:18:50 +02:00
Peter Boyle
c2c3cad20d Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2020-04-23 04:35:42 -04:00
Peter Boyle
edec9ee2e2 Conserved current rewrite done. Zmobius working 2020-04-23 04:34:01 -04:00
Peter Boyle
ed70cce542 Test for 5D DWF obserevables 2020-04-23 04:29:45 -04:00
Michael Marshall
4701201b5f grid-config: Expose CXXLD (for GPU build) and update help 2020-04-22 18:42:30 +01:00
nils meyer
5893888f87 removed default no-strict-aliasing for gcc-10.0.1 exclusively 2020-04-22 19:29:55 +02:00
nmeyer-ur
39b448affb Merge remote-tracking branch 'origin/develop' into feature/a64fx-2 2020-04-22 17:34:12 +02:00
nils meyer
e54a8f05a9 Exchange1 with generic version for now, should use svtbl2 in final version 2020-04-20 22:45:27 +02:00
Peter Boyle
0782b76ed4
Merge pull request #274 from paboyle/feature/zmobius_paramcompute
ZMobius parameter computation
2020-04-20 14:39:29 -04:00
Christopher Kelly
0896f2cead Added missing include guards in bigfloat_double.h 2020-04-20 10:30:38 -04:00
Christopher Kelly
181709bba4 Merge branch 'develop' into feature/zmobius_paramcompute 2020-04-20 09:12:34 -04:00
nils meyer
64b72fc17f testing gcc 10.0.1: build errors in Exchange1 using -DA64FX and in Lattice_base.h building Dslash only 2020-04-19 01:25:40 +02:00
Christoph Lehner
091d5c605e towards more precise blocking 2020-04-17 04:25:28 -04:00
nils meyer
6fdce60492 revised BodyA64FX; 990 GiB/s Wilson, 687 GiB/s DW using intrinsics (armclang 20.0) 2020-04-16 22:43:32 +02:00
Peter Boyle
90229cfb0f
Merge pull request #270 from milc-qcd/feature/CGinfo
feature/CGinfo
2020-04-16 11:46:08 -04:00
Peter Boyle
0475c46ecb
Merge pull request #256 from djm2131/feature/BiCGSTAB
Import BiCGSTAB solvers and tests
2020-04-16 11:45:15 -04:00
Peter Boyle
3cca10e617
Merge pull request #276 from nils-asmussen/fix/regression_nt
fix regression in tests/core/Test_qed.cc
2020-04-16 11:42:39 -04:00
Christoph Lehner
327da332bb Merge branch 'develop' of https://github.com/paboyle/Grid into feature/gpt 2020-04-16 11:30:17 -04:00
nils meyer
852db4626a re-introduced HOTFIX cause Grid binaries give wrong results otherwise; checked in good gridverter.py 2020-04-15 18:22:19 +02:00
43dc2814dd fix regression in core/Test_qed.cc 2020-04-15 16:10:15 +01:00
nils meyer
6504a098cc 999 GiB/s Wilson; 694 GiB/s DW (DP) 2020-04-15 15:06:52 +02:00
nils meyer
79a385faca disabled armclang hotfix cause armclang 20.0 performance gets a little 2020-04-15 11:46:55 +02:00
nils meyer
c12a67030a 980 GiB/s Wilson; 680 GiB/s DW (DP) 2020-04-15 10:55:06 +02:00
nils meyer
581392f2f2 now with pf, best results so far using intrinsics+pf 2020-04-12 22:06:14 +02:00
nils meyer
113f277b6a enable dslash asm using -DA64FXASM, additionaly -DDSLASHINTRIN for intrinsics impl 2020-04-11 04:55:01 +02:00
Peter Boyle
f3a8d039a2 Merge branch 'feature/hdcr' into develop 2020-04-10 22:01:52 -04:00
nils meyer
974586bedc Dslash finally works; cleaned up; uses MOVPRFX in assembly 2020-04-10 22:26:40 +02:00
4e864e56c9 develop pull 2020-04-10 17:19:18 +01:00
Peter Boyle
014dbfa464 Compile fix with OpDirAll 2020-04-10 11:57:09 -04:00
Peter Boyle
3b0e07882f Adding another form of polynomial 2020-04-10 11:28:33 -04:00
Peter Boyle
8e81a811d0 Merge branch 'feature/hdcr' into develop 2020-04-10 11:14:49 -04:00
Peter Boyle
aa13118127 Missing conjugate already fixed in develop 2020-04-10 11:11:24 -04:00
Peter Boyle
6cdb09c884 Faster copy region 2020-04-10 11:10:52 -04:00
Peter Boyle
a65bc64f10 Accelerator peek poke 2020-04-10 11:09:59 -04:00
Peter Boyle
11dec4883c Don't throw assert 2020-04-10 11:09:11 -04:00
Peter Boyle
afa458c812 Extra solvers 2020-04-10 11:08:19 -04:00
Peter Boyle
dc50190b8f Faster GPU basis rotation
May need to later include Regensburg optimised CPU variant
2020-04-10 11:06:04 -04:00
nmeyer-ur
160f78c1e4 changed debug output to variable direct 3 2020-04-10 12:23:07 +02:00
nmeyer-ur
7e4e1bbbc2 changed debug output to variable direct 2 2020-04-10 12:22:04 +02:00
nmeyer-ur
e699b7e9f9 changed debug output to variable direct 2020-04-10 12:18:30 +02:00
nmeyer-ur
a28bc0de90 debug register address test in WilsonHand 2020-04-10 12:07:45 +02:00
nmeyer-ur
14d0fe4d6c added predication in WilsonHand 2020-04-10 12:04:00 +02:00
nmeyer-ur
0ad2e0815c debug output in WilsonHand 2020-04-10 11:56:29 +02:00
nils meyer
1c8ca05e16 Merge branch 'feature/a64fx-2' of https://github.com/nmeyer-ur/Grid into feature/a64fx-2 2020-04-09 23:32:19 +02:00
nils meyer
dc9c8340bb switched to DSLASHINTRIN for A64FX Dslash intrinsics 2020-04-09 23:30:23 +02:00
nils meyer
19eef97503 specialized A64FX Dslash kernels 2020-04-09 23:25:25 +02:00
nmeyer-ur
635246ce50 corrected typo 2020-04-09 21:42:50 +02:00
nils meyer
5cdbb7e71e fixed A64FX Dslash; compiles, but does not specialize -> assertion 2020-04-09 21:23:39 +02:00
nmeyer-ur
8123590a1b changes 2020-04-09 16:45:47 +02:00