bc1f5be265
Merge branch 'dev-IRBL-ypj' of https://github.com/yongchull/Grid into merge
2018-03-08 18:02:06 -05:00
53a9260a94
Patch to compile with AVX512 for Skylake Xeon processors using GCC 7.2.0. Besides bug fixes in the source code, an option 'SKL' is added to configure.ac to set the Skylake-specific AVX512 instruction flags when using GCC. The code compiles with --enable-simd=SKL using GCC 7.2.0, but Test_simd fails. AVX512 support for the complex double type with non-Intel compilers causes this failure.
2018-01-27 10:00:38 -05:00
3cb8cb7282
'typename' is added so the code compiles with AVX512 using GCC 7.2.0; a missing semicolon in Grid_avx512.h is fixed. An option 'SKL' is added to the configure script for Skylake-specific AVX512 operations. The code compiles with --enable-simd=SKL using GCC 7.2.0, but Test_simd fails. AVX512 support for the complex double type with non-Intel compilers causes this failure; it needs a review.
2017-12-23 14:54:07 -05:00
fc4ab9ccd5
Working half precision comms
2017-04-20 11:20:26 +01:00
9fd23faadf
Pretty layout
2017-03-30 13:44:45 +09:00
4e7ab3166f
Refactoring header layout
2017-02-22 18:09:33 +00:00
3ae92fa2e6
Global changes to parallel_for structure.
Move the comms flags to more sensible names
2017-02-21 05:24:27 -05:00
3e6945cd65
Fixing AVX Z-mobius
2016-12-18 02:05:11 +00:00
87be03006a
AVX 512 code broke other compiles; fixing
2016-12-18 01:45:09 +00:00
fa6acccf55
Zmobius asm
2016-12-18 00:56:19 +00:00
fe187e9ed3
Compiles and passes under ZMobius with assembler
2016-12-10 00:47:48 +00:00
0091b50f49
Zmobius working -- not asm yet
2016-12-09 22:51:32 +00:00
fb8d4b2357
Lots of debug on performance Mobius
2016-12-08 17:28:28 +00:00
e27c6b217c
Updating
2016-12-01 12:42:53 +00:00
6adf35da54
Faster Mobius
2016-12-01 11:39:04 +00:00
bd0430b34f
Serialisation in malloc fixed
2016-11-29 22:27:55 +00:00
90e70790f3
Feature for z-Mobius prep
2016-08-15 22:31:29 +01:00
980ff18956
Solving the instantiation no-compile issue
2016-07-15 17:19:44 +01:00
adbc7c1188
Adding files for multiple implementations (cache opt) and Ls vectorisation
of the 5D Cayley-form chiral fermions for the 5D matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.
The serial dependence of the LDU inversion for Mobius and 4D even-odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.
This should give similar throughput at a higher flop count (the extra flops
are non-compulsory), while enabling use of the KNL cache-friendly kernels
throughout the code.
Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for
AVX512 with single precision.
2016-07-14 22:59:21 +01:00