paboyle
5e2cd0d07c
Format
2018-01-12 23:18:22 +00:00
paboyle
62fcee72c5
Format, NAMESPACE
2018-01-12 23:16:37 +00:00
paboyle
0a6168eef0
Format emacs style
2018-01-12 23:11:22 +00:00
paboyle
63865e4232
format
2018-01-12 23:10:48 +00:00
paboyle
c64deedf74
Format
2018-01-12 23:09:35 +00:00
paboyle
3281559ec3
Format
2018-01-12 23:09:01 +00:00
paboyle
6a2eca2ec2
NAMESPACE
2018-01-12 23:00:03 +00:00
paboyle
d8ff895e74
NAMESPACE and format
2018-01-12 18:27:22 +00:00
paboyle
00c49d4c17
Format
2018-01-12 18:25:39 +00:00
paboyle
ec89714cce
NAMESPACE
2018-01-12 18:24:16 +00:00
paboyle
6ab744c720
NAMESPACE and formatting
2018-01-12 18:11:04 +00:00
paboyle
bbb657da5c
NAMESPACE and formatting
2018-01-12 18:10:11 +00:00
paboyle
fbc2380cb8
NAMESPACE & format
2018-01-12 18:05:36 +00:00
paboyle
08682c5461
NAMESPACE and format to my liking
2018-01-12 18:03:57 +00:00
paboyle
13bce2a6bf
NAMESPACE
2018-01-12 17:58:53 +00:00
paboyle
70e689900b
NAMESPACE
2018-01-12 17:58:13 +00:00
Peter Boyle
bfb68e6f02
Merge pull request #130 from giltirn/gparity-handunroll
Gparity handunroll
2017-09-21 10:11:00 +01:00
Nils Meyer
4e907fef2c
Merge remote-tracking branch 'grid/develop' into feature/arm-neon
2017-08-29 17:47:36 +02:00
Christopher Kelly
f365a83fae
In G-parity unrolled kernel, replaced calls to permute and exchange with run-time-evaluated permute type with explicit calls to appropriate underlying functions
2017-08-25 14:24:11 -04:00
Nils Meyer
7a53dc3715
Added integer reduce functionality
2017-07-24 11:12:59 +02:00
Guido Cossu
8859a151cc
Small corrections to the NEON port
2017-06-29 11:30:29 +01:00
Guido Cossu
688a39cfd9
Merge pull request #114 from nmeyer-ur/feature/arm-neon
ARM neon intrinsics support
Guido: checked and approved
2017-06-29 09:57:17 +01:00
Nils Meyer
0933aeefd4
corrected Grid_neon.h
2017-06-28 20:22:22 +02:00
Nils Meyer
a9c816a268
moved file to correct folder
2017-06-27 21:39:15 +02:00
Nils Meyer
bf729766dd
removed collision with QPX implementation
2017-06-27 20:32:24 +02:00
Lanny91
56abbdf4c2
AVX512 integer reduce fix (for non-intel compiler)
2017-06-23 11:09:14 +02:00
Lanny91
af71c63f4c
AVX2 fix
2017-06-23 11:03:12 +02:00
Lanny91
0440d4ce66
Merge branch 'develop' of https://github.com/paboyle/Grid into hotfix/bgq
2017-06-22 17:09:42 +02:00
Azusa Yamaguchi
abc4de0fd2
No compile make tests fix
2017-06-19 22:03:03 +01:00
Lanny91
a833f88c32
Added missing SIMD integer reduction implementation for AVX, AVX-512, SSE4, IMCI
2017-06-16 15:58:47 +01:00
Lanny91
07b2c1b253
Placeholder precision change functions to allow Grid to compile with QPX (warning: no actual functionality)
2017-06-16 15:04:26 +01:00
Lanny91
735cbdb983
QPX Integer reduction (+ integer reduction test)
2017-06-14 10:55:10 +01:00
Lanny91
2ad54c5a02
QPX exchange support
2017-06-14 10:53:39 +01:00
Nils Meyer
3d04dc33c6
ARM neon intrinsics support
2017-06-13 13:26:59 +02:00
paboyle
62cf9cf638
Cleaner code
2017-05-30 23:38:02 +01:00
Guido Cossu
15e801af3f
Fixing a compilation error for generic SIMD
2017-05-19 16:39:36 +01:00
paboyle
3267683e22
Union workaround for g++
2017-05-17 11:26:18 +01:00
paboyle
c1c7566089
GCC bug work around in 5.0 through 6.2 inclusive.
2017-05-06 15:20:25 +01:00
Guido Cossu
3344788fa1
Merge branch 'develop' into feature/hmc_generalise
2017-05-01 12:13:56 +01:00
paboyle
56277a11c8
Build a list of what's on the surface
2017-04-24 17:06:15 +01:00
paboyle
736bf3c866
Major rework of stencil. Half precision and MPI3 now working.
2017-04-22 11:33:50 +01:00
paboyle
b9bbe5d188
L1p config bg/q
2017-04-22 11:33:09 +01:00
paboyle
3844bcf800
If no f16c instructions supported must use software half precision conversion.
This will also become useful on BG/Q, so will move out from SSE4 into a general area.
Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed
against the intrinsics implementation yet.
2017-04-20 15:30:52 +01:00
paboyle
4a340aa5ca
Massive compressor rework to support reduced precision comms
2017-04-20 09:28:27 +01:00
paboyle
3b7de792d5
Type comparison in the traits work
2017-04-18 13:28:04 +01:00
paboyle
8e161152e4
MultiRHS solver improvements with slice operations moved into lattice and sped up.
Block solver requires a lot of performance work.
2017-04-18 10:51:55 +01:00
paboyle
7ede696126
Non compile of tests fixed
2017-04-16 23:40:00 +01:00
paboyle
441a52ee5d
First cut at higher precision reduction
2017-04-15 10:57:21 +01:00
paboyle
3ca41458a3
Fix to no USE_FP16 case
2017-04-14 14:20:54 +01:00
Peter Boyle
951be75292
Half precision conversion working on AVX512 now too
2017-04-13 17:35:11 +01:00