|
64b72fc17f
|
testing gcc 10.0.1: build errors in Exchange1 using -DA64FX and in Lattice_base.h building Dslash only
|
2020-04-19 01:25:40 +02:00 |
|
|
6fdce60492
|
revised BodyA64FX; 990 GiB/s Wilson, 687 GiB/s DW using intrinsics (armclang 20.0)
|
2020-04-16 22:43:32 +02:00 |
|
|
6504a098cc
|
999 GiB/s Wilson; 694 GiB/s DW (DP)
|
2020-04-15 15:06:52 +02:00 |
|
|
c12a67030a
|
980 GiB/s Wilson; 680 GiB/s DW (DP)
|
2020-04-15 10:55:06 +02:00 |
|
|
581392f2f2
|
now with pf, best results so far using intrinsics+pf
|
2020-04-12 22:06:14 +02:00 |
|
|
113f277b6a
|
enable dslash asm using -DA64FXASM, additionally -DDSLASHINTRIN for intrinsics impl
|
2020-04-11 04:55:01 +02:00 |
|
|
974586bedc
|
Dslash finally works; cleaned up; uses MOVPRFX in assembly
|
2020-04-10 22:26:40 +02:00 |
|
|
5cdbb7e71e
|
fixed A64FX Dslash; compiles, but does not specialize -> assertion
|
2020-04-09 21:23:39 +02:00 |
|
|
cd1efee866
|
changes
|
2020-04-09 16:35:13 +02:00 |
|
|
bd310932f7
|
changes
|
2020-04-09 16:32:31 +02:00 |
|
|
e252c1aca3
|
addressing
|
2020-04-09 15:03:12 +02:00 |
|
|
b140c6a4f9
|
addressing
|
2020-04-09 15:01:15 +02:00 |
|
|
326de36467
|
revised sU addressing scheme
|
2020-04-09 14:44:25 +02:00 |
|
|
dd5a22b36b
|
revised declarations
|
2020-04-09 14:21:27 +02:00 |
|
|
77fa586f6c
|
introduced A64FX Wilson kernels
|
2020-04-09 13:30:06 +02:00 |
|