1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-04-07 12:45:55 +01:00

141 Commits

Author SHA1 Message Date
nils meyer
852db4626a re-introduced HOTFIX cause Grid binaries give wrong results otherwise; checked in good gridverter.py 2020-04-15 18:22:19 +02:00
nils meyer
6504a098cc 999 GiB/s Wilson; 694 GiB/s DW (DP) 2020-04-15 15:06:52 +02:00
nils meyer
79a385faca disabled armclang hotfix cause armclang 20.0 performance gets a little 2020-04-15 11:46:55 +02:00
nils meyer
c12a67030a 980 GiB/s Wilson; 680 GiB/s DW (DP) 2020-04-15 10:55:06 +02:00
nils meyer
581392f2f2 now with pf, best results so far using intrinsics+pf 2020-04-12 22:06:14 +02:00
nils meyer
113f277b6a enable dslash asm using -DA64FXASM, additionaly -DDSLASHINTRIN for intrinsics impl 2020-04-11 04:55:01 +02:00
nils meyer
974586bedc Dslash finally works; cleaned up; uses MOVPRFX in assembly 2020-04-10 22:26:40 +02:00
nils meyer
5cdbb7e71e fixed A64FX Dslash; compiles, but does not specialize -> assertion 2020-04-09 21:23:39 +02:00
nmeyer-ur
8123590a1b changes 2020-04-09 16:45:47 +02:00
nmeyer-ur
cd1efee866 changes 2020-04-09 16:35:13 +02:00
nmeyer-ur
bd310932f7 changes 2020-04-09 16:32:31 +02:00
nmeyer-ur
e252c1aca3 addressing 2020-04-09 15:03:12 +02:00
nmeyer-ur
b140c6a4f9 addressing 2020-04-09 15:01:15 +02:00
nmeyer-ur
326de36467 revised sU addressing scheme 2020-04-09 14:44:25 +02:00
nmeyer-ur
9f224a1647 fixed typo in single 2020-04-09 14:30:21 +02:00
nmeyer-ur
bb46ba9b5f fixed array size in single 2020-04-09 14:28:45 +02:00
nmeyer-ur
dd5a22b36b revised declarations 2020-04-09 14:21:27 +02:00
nmeyer-ur
1ea85b9972 Disabled build message 2020-04-09 13:47:21 +02:00
nmeyer-ur
8fb63f1c25 added A64FX Wilson kernels single precision 2020-04-09 13:41:04 +02:00
nmeyer-ur
77fa586f6c introduced A64FX Wilson kernels 2020-04-09 13:30:06 +02:00
nmeyer-ur
15238e8d5e reduce acle works, clean up 2020-04-03 20:40:44 +02:00
nmeyer-ur
b27e31957a reduce acle revised 2020-04-03 19:46:15 +02:00
nmeyer-ur
46927771e3 reduce acle still needs overhaul 2020-04-03 19:30:48 +02:00
nmeyer-ur
d8cea77707 define simd width in header 2020-04-03 19:22:25 +02:00
nmeyer-ur
5f8a76d490 clean up, reduction in acle 2020-04-03 19:18:24 +02:00
nmeyer-ur
28d49a3b60 build problem resolved 2020-04-03 16:52:48 +02:00
nmeyer-ur
b4c624ece6 added A64FX support 2020-04-03 15:43:23 +02:00
Peter Boyle
55cdb17691 Integer divide for blocking 2020-01-27 12:27:45 -05:00
Peter Boyle
34108296cd Merge branch 'develop' into feature/gpu-port
Conflicts:
	Grid/simd/Grid_avx512.h
2019-07-20 17:05:35 +01:00
Peter Boyle
76c704b84b Intrinsics for CLANG are now fixed in v6 2019-07-20 16:52:24 +01:00
Peter Boyle
a23dc295ac Remove compiler errors and warnings 2019-07-18 14:47:02 +01:00
Peter Boyle
fa9cd50c5b Merge branch 'develop' into feature/gpu-port 2019-07-16 11:55:17 +01:00
Peter Boyle
703dc20377 Compile tests fix 2019-06-16 13:59:29 +01:00
Peter Boyle
c7dbf4c87e Scalar support for GPU threads 2019-06-15 08:25:43 +01:00
gfilaci
8b6541fb60 Fix gpu MultRealPart and MaddRealPart bug 2019-05-02 10:58:17 +01:00
91cffef883 Updates after review with Peter. 2019-03-07 14:30:35 +00:00
b7db99967a Recommendations for Traits classes 2019-02-28 20:06:59 +00:00
Peter Boyle
e73b909a48 Make tests running past nvcc. Different NVCC versions proving tricky to keep happy. This is 9.2 2019-01-02 12:05:30 +00:00
Peter Boyle
9d866d062a GPU support improvements 2019-01-01 15:05:03 +00:00
Peter Boyle
b57a4d32aa Merge branch 'develop' into feature/gpu-port 2018-12-13 05:11:34 +00:00
fb7d021b9d Hadrons: moving Hadrons to root directory, build system improvements 2018-08-28 15:00:40 +01:00