nils meyer
|
64b72fc17f
|
testing gcc 10.0.1: build errors in Exchange1 using -DA64FX and in Lattice_base.h building Dslash only
|
2020-04-19 01:25:40 +02:00 |
|
nils meyer
|
6fdce60492
|
revised BodyA64FX; 990 GiB/s Wilson, 687 GiB/s DW using intrinsics (armclang 20.0)
|
2020-04-16 22:43:32 +02:00 |
|
nils meyer
|
6504a098cc
|
999 GiB/s Wilson; 694 GiB/s DW (DP)
|
2020-04-15 15:06:52 +02:00 |
|
nils meyer
|
c12a67030a
|
980 GiB/s Wilson; 680 GiB/s DW (DP)
|
2020-04-15 10:55:06 +02:00 |
|
nils meyer
|
581392f2f2
|
now with pf, best results so far using intrinsics+pf
|
2020-04-12 22:06:14 +02:00 |
|
nils meyer
|
113f277b6a
|
enable dslash asm using -DA64FXASM, additionally -DDSLASHINTRIN for intrinsics impl
|
2020-04-11 04:55:01 +02:00 |
|
nils meyer
|
974586bedc
|
Dslash finally works; cleaned up; uses MOVPRFX in assembly
|
2020-04-10 22:26:40 +02:00 |
|
nils meyer
|
5cdbb7e71e
|
fixed A64FX Dslash; compiles, but does not specialize -> assertion
|
2020-04-09 21:23:39 +02:00 |
|
nmeyer-ur
|
8123590a1b
|
changes
|
2020-04-09 16:45:47 +02:00 |
|
nmeyer-ur
|
cd1efee866
|
changes
|
2020-04-09 16:35:13 +02:00 |
|
nmeyer-ur
|
bd310932f7
|
changes
|
2020-04-09 16:32:31 +02:00 |
|
nmeyer-ur
|
e252c1aca3
|
addressing
|
2020-04-09 15:03:12 +02:00 |
|
nmeyer-ur
|
b140c6a4f9
|
addressing
|
2020-04-09 15:01:15 +02:00 |
|
nmeyer-ur
|
326de36467
|
revised sU addressing scheme
|
2020-04-09 14:44:25 +02:00 |
|
nmeyer-ur
|
9f224a1647
|
fixed typo in single
|
2020-04-09 14:30:21 +02:00 |
|
nmeyer-ur
|
bb46ba9b5f
|
fixed array size in single
|
2020-04-09 14:28:45 +02:00 |
|
nmeyer-ur
|
dd5a22b36b
|
revised declarations
|
2020-04-09 14:21:27 +02:00 |
|
nmeyer-ur
|
8fb63f1c25
|
added A64FX Wilson kernels single precision
|
2020-04-09 13:41:04 +02:00 |
|