paboyle
661fc4d3d1
Debug AVX512 exchange code paths
2017-02-20 17:48:36 -05:00
paboyle
f246fe3304
Improvements to avx for invertible to avoid latent bug
2017-02-16 23:52:44 +00:00
paboyle
bd600702cf
Vectorise the XYZT face gathering better.
...
Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent
with efficiency.
2017-02-15 11:11:04 +00:00
paboyle
0883d6a7ce
Overlap comms compute support; make reg naming consistent with bgq aasm
2017-02-07 00:59:32 -05:00
paboyle
4bbdfb434c
Overlap comms compute modifications
2017-02-07 00:57:01 -05:00
Peter Boyle
c3b6d573b9
Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm
2016-12-30 22:42:17 +00:00
Peter Boyle
1e179c903d
Worried about integer; suspect where statements are broken
2016-12-27 17:46:38 +00:00
Peter Boyle
1caa3fbc2d
LOCK UNLOCK only
2016-12-27 11:24:45 +00:00
Peter Boyle
eabf316ed9
BGQ performance ASM
2016-12-22 21:56:08 +00:00
Peter Boyle
7dc36628a1
QPX finishing
2016-12-22 17:50:48 +00:00
paboyle
3f2d53a994
BGQ assembler beginning
2016-12-20 10:21:26 +00:00
paboyle
4b220972ac
Warning fix
2016-12-18 02:14:17 +00:00
paboyle
629f43e36c
Return statement needed
2016-12-18 02:09:37 +00:00
paboyle
a3172b3455
Precision error
2016-12-18 02:07:45 +00:00
paboyle
f17436fec2
Bad commit fixed
2016-12-18 01:27:34 +00:00
Peter Boyle
69ae817d1c
Updates for supporting Mobius better
2016-12-08 16:43:28 +00:00
Peter Boyle
e27c6b217c
Updating
2016-12-01 12:42:53 +00:00
paboyle
6adf35da54
Faster Mobius
2016-12-01 11:39:04 +00:00
Lanny91
b18950f776
Added simd real divide test with QPX divide fixes
2016-11-25 13:21:33 +00:00
Lanny91
0acbf77bc6
Add QPX Div structure
2016-11-24 13:24:12 +00:00
a2cffb0304
AVXFMA target fixed
2016-11-21 17:47:18 +01:00
97cddda49e
Merge branch 'feature/gen-simd' into feature/doxygen
...
# Conflicts:
# Makefile.am
# configure.ac
2016-11-19 13:11:13 +01:00
b873504b90
fully generic SIMD
2016-11-19 01:32:39 +01:00
042ae5b87c
generic 256bits SIMD
2016-11-15 12:16:15 +00:00
azusayamaguchi
f7b60004f3
Merge branch 'develop' into release/v0.6.0
2016-11-04 16:08:07 +00:00
d5e95bc350
Merge branch 'release/v0.6.0' into feature/feynman-rules
2016-10-31 18:36:21 +00:00
Guido Cossu
e1042aef77
First version of the doube prec for testing purposes
...
It does not compile single and double version at the same time
2016-10-28 17:20:04 +01:00
paboyle
aa6a839c60
avx512 build fix; detect clang/gcc intrinsics vs. ICPC
2016-10-28 09:13:09 +01:00
ca21003f01
Merge branch 'feature/fft-opt' into feature/feynman-rules
...
# Conflicts:
# lib/FFT.h
# lib/qcd/action/fermion/WilsonFermion5D.h
# tests/core/Test_fft.cc
2016-10-26 18:44:47 +01:00
azusayamaguchi
460d0753a1
Merge branch 'develop' into feature/mpi3
...
Conflicts:
lib/simd/Grid_avx512.h
2016-10-25 01:08:51 +01:00
azusayamaguchi
75ebd3a0d1
Typo fixes and rotate for CLANG
2016-10-21 22:34:29 +01:00
bd6a228af6
Merge commit '20a091c3eddfdb67a82ece6413740a93650a2f98' into feature/feynman-rules
2016-10-21 13:10:30 +01:00
azusayamaguchi
20a091c3ed
Intel vs. Clang intrinsics differences absorbed
2016-10-21 09:08:36 +01:00
997fd882ff
Merge branch 'develop' into feature/feynman-rules
...
# Conflicts:
# lib/Threads.h
# lib/qcd/action/fermion/WilsonFermion.cc
# lib/qcd/action/fermion/WilsonFermion.h
# lib/qcd/utils/SUn.h
# lib/simd/Grid_avx.h
# lib/simd/Intel512common.h
2016-10-19 18:35:18 +01:00
paboyle
811ca45473
GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support
2016-10-17 16:23:21 +01:00
azusayamaguchi
81f2aeaece
KNL streaming stores, and KNL performance coutners
2016-10-12 11:45:22 +01:00
paboyle
6f408256bc
FMA4 option moved on the align
2016-10-11 10:03:01 +01:00
paboyle
8d11681aac
verbose remove
2016-10-10 23:50:42 +01:00
paboyle
3d5c9a1ee9
No compile fix on clang++ 3.9
2016-10-10 23:50:13 +01:00
Guido Cossu
611b5d74ba
Fix for AVX+FMA3 compilation
2016-10-10 15:26:17 +01:00
cb02b7088f
Merge branch 'develop' into feature/doxygen
...
# Conflicts:
# configure.ac
2016-10-09 13:35:44 +01:00
paboyle
87acd06990
Use streaming stores
2016-09-26 10:11:34 +01:00
paboyle
836e929565
Divide handling improved
2016-09-26 09:42:22 +01:00
Antonin Portelli
0724f7af75
QPX single precision implementation
2016-09-19 18:09:12 +01:00
4d11a6f5f2
first commit for QPX intrinsics
2016-08-23 14:41:44 +01:00
paboyle
17097a93ec
FFTW test ran over 4 mpi processes.
2016-08-17 01:33:55 +01:00
b1cfb4d661
first try at a nicer Doxygen implementation
2016-08-05 15:29:18 +01:00
93d29bb699
build system improvements after discussion with Peter
2016-08-04 16:19:59 +01:00
e9f30cab2c
first working version for the new build system
2016-07-30 17:53:18 +01:00
paboyle
4908b77d46
Fixed conflicts. PLEASE avoid making wholesale cosmetic only changes, this created
...
a HUGE amount of difficult to resolve and understand conflicts .
Wholesale formatting, reordering functions etc... in a central file like Tensor_class
or Grid_vector_types while others are also editing without making substantial functionality
changes creates pain.
2016-07-15 20:59:07 +01:00