Yong-Chull Jang
|
53a9260a94
|
Patch to compile with AVX512 for SkyLake Xeon processors using GCC 7.2.0. Besides bug fixes in the source code, an option 'SKL' is added to configure.ac for SkyLake-specific AVX512 instruction flags when using GCC. The code can be compiled with --enable-simd=SKL using GCC 7.2.0, but Test_simd fails. AVX512 support for the complex double type with non-Intel compilers causes this error.
|
2018-01-27 10:00:38 -05:00 |
|
Lanny91
|
56abbdf4c2
|
AVX512 integer reduce fix (for non-intel compiler)
|
2017-06-23 11:09:14 +02:00 |
|
Lanny91
|
a833f88c32
|
Added missing SIMD integer reduction implementation for AVX, AVX-512, SSE4, IMCI
|
2017-06-16 15:58:47 +01:00 |
|
paboyle
|
3ca41458a3
|
Fix to no USE_FP16 case
|
2017-04-14 14:20:54 +01:00 |
|
Peter Boyle
|
951be75292
|
Half precision conversion working on AVX512 now too
|
2017-04-13 17:35:11 +01:00 |
|
Peter Boyle
|
b9113ed310
|
Patches for knl
|
2017-04-13 12:02:12 -04:00 |
|
paboyle
|
1d502e4ed6
|
FP16 optional compile time
|
2017-04-13 11:55:24 +01:00 |
|
paboyle
|
68392ddb5b
|
Exchange in generic
Precision change in AVX, SSE, AVX512, Generic. QPX still to do.
|
2017-04-13 08:38:12 +01:00 |
|
paboyle
|
cb6b81ae82
|
Half precision conversion
|
2017-04-12 19:32:37 +01:00 |
|
paboyle
|
661fc4d3d1
|
Debug AVX512 exchange code paths
|
2017-02-20 17:48:36 -05:00 |
|
paboyle
|
bd600702cf
|
Vectorise the XYZT face gathering better.
Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent
with efficiency.
|
2017-02-15 11:11:04 +00:00 |
|
Peter Boyle
|
69ae817d1c
|
Updates for supporting Mobius better
|
2016-12-08 16:43:28 +00:00 |
|
Peter Boyle
|
e27c6b217c
|
Updating
|
2016-12-01 12:42:53 +00:00 |
|
|
97cddda49e
|
Merge branch 'feature/gen-simd' into feature/doxygen
# Conflicts:
# Makefile.am
# configure.ac
|
2016-11-19 13:11:13 +01:00 |
|
azusayamaguchi
|
f7b60004f3
|
Merge branch 'develop' into release/v0.6.0
|
2016-11-04 16:08:07 +00:00 |
|
|
d5e95bc350
|
Merge branch 'release/v0.6.0' into feature/feynman-rules
|
2016-10-31 18:36:21 +00:00 |
|
Guido Cossu
|
e1042aef77
|
First version of the double prec for testing purposes
It does not compile the single and double versions at the same time
|
2016-10-28 17:20:04 +01:00 |
|
paboyle
|
aa6a839c60
|
avx512 build fix; detect clang/gcc intrinsics vs. ICPC
|
2016-10-28 09:13:09 +01:00 |
|
|
ca21003f01
|
Merge branch 'feature/fft-opt' into feature/feynman-rules
# Conflicts:
# lib/FFT.h
# lib/qcd/action/fermion/WilsonFermion5D.h
# tests/core/Test_fft.cc
|
2016-10-26 18:44:47 +01:00 |
|
azusayamaguchi
|
460d0753a1
|
Merge branch 'develop' into feature/mpi3
Conflicts:
lib/simd/Grid_avx512.h
|
2016-10-25 01:08:51 +01:00 |
|
azusayamaguchi
|
75ebd3a0d1
|
Typo fixes and rotate for CLANG
|
2016-10-21 22:34:29 +01:00 |
|
|
bd6a228af6
|
Merge commit '20a091c3eddfdb67a82ece6413740a93650a2f98' into feature/feynman-rules
|
2016-10-21 13:10:30 +01:00 |
|
azusayamaguchi
|
20a091c3ed
|
Intel vs. Clang intrinsics differences absorbed
|
2016-10-21 09:08:36 +01:00 |
|
|
997fd882ff
|
Merge branch 'develop' into feature/feynman-rules
# Conflicts:
# lib/Threads.h
# lib/qcd/action/fermion/WilsonFermion.cc
# lib/qcd/action/fermion/WilsonFermion.h
# lib/qcd/utils/SUn.h
# lib/simd/Grid_avx.h
# lib/simd/Intel512common.h
|
2016-10-19 18:35:18 +01:00 |
|
paboyle
|
811ca45473
|
GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support
|
2016-10-17 16:23:21 +01:00 |
|
paboyle
|
836e929565
|
Divide handling improved
|
2016-09-26 09:42:22 +01:00 |
|
|
b1cfb4d661
|
first try at a nicer Doxygen implementation
|
2016-08-05 15:29:18 +01:00 |
|
paboyle
|
587f80cd93
|
Updated to compile and pass under Intel SDE
|
2016-04-19 15:13:54 -07:00 |
|
paboyle
|
e5657510b0
|
Rotate support for Ls simd-ized
|
2016-04-19 22:24:18 +01:00 |
|
paboyle
|
ad80f61fba
|
AVX512 shaken out
|
2016-03-28 00:38:05 -06:00 |
|
paboyle
|
644fd6d32e
|
Build avx512 clean
|
2016-03-25 09:35:33 -07:00 |
|
paboyle
|
aae8bf31a7
|
Global edit adding copyright and license info to every source file.
|
2016-01-02 14:51:32 +00:00 |
|
paboyle
|
a38762159c
|
Inline assembly hooks for AVX 512. Better way in some ways than BAGEL to generate assembly.
Updated Grid_avx512.h
|
2015-11-04 03:09:06 -08:00 |
|
Peter Boyle
|
64d64d1ab6
|
Updating to modify non-inlining permute routines and hopefully get better reg use and
enhance performance.
|
2015-09-25 08:55:04 -07:00 |
|
Peter Boyle
|
5ef42add2d
|
Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly
and drop swizzles in AVX512. Don't know why these compiled.
|
2015-09-23 05:23:45 -07:00 |
|
Peter Boyle
|
4deffd1ccb
|
No compile fix
|
2015-07-02 02:03:09 +01:00 |
|
neo
|
48bf4878c1
|
Experimental support for ARM
|
2015-06-09 15:46:21 +09:00 |
|
Peter Boyle
|
62a7ca462f
|
Works now with Clang-avx, Clang-sse and ICPC-avx, ICPC-sse
|
2015-05-28 11:35:43 +01:00 |
|