1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-10-24 17:54:47 +01:00
Commit Graph

182 Commits

Author SHA1 Message Date
paboyle
56277a11c8 Build a list of whats on the surface 2017-04-24 17:06:15 +01:00
paboyle
736bf3c866 Major rework of stencil. Half precision and MPI3 now working. 2017-04-22 11:33:50 +01:00
paboyle
b9bbe5d188 L1p config bg/q 2017-04-22 11:33:09 +01:00
paboyle
3844bcf800 If no f16c instructions supported must use software half precision conversion.
This will also become useful on BG/Q, so will move out from SSE4 into a general area.
Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed
against the intrinsics implementation yet.
2017-04-20 15:30:52 +01:00
paboyle
4a340aa5ca Massive compressor rework to support reduced precision comms 2017-04-20 09:28:27 +01:00
paboyle
3b7de792d5 Type comparison in the traits work 2017-04-18 13:28:04 +01:00
paboyle
8e161152e4 MultiRHS solver improvements with slice operations moved into lattice and sped up.
Block solver requires a lot of performance work.
2017-04-18 10:51:55 +01:00
paboyle
7ede696126 Non compile of tests fixed 2017-04-16 23:40:00 +01:00
paboyle
441a52ee5d First cut at higher precision reduction 2017-04-15 10:57:21 +01:00
paboyle
3ca41458a3 Fix to no USE_FP16 case 2017-04-14 14:20:54 +01:00
Peter Boyle
951be75292 Half precision conversion working on AVX512 now too 2017-04-13 17:35:11 +01:00
Peter Boyle
b9113ed310 Patches for knl 2017-04-13 12:02:12 -04:00
paboyle
db5ea001a3 Update to use Xcode 8.3 since -mfp16 causes SIGILL 2017-04-13 12:22:40 +01:00
paboyle
1d502e4ed6 FP16 optional compile time 2017-04-13 11:55:24 +01:00
paboyle
73cdf0fffe Drop f16c from SSE because of a macos compile error on travis 2017-04-13 11:23:41 +01:00
paboyle
94eb829d08 Align cast fixed for __mm128i gcc complained 2017-04-13 08:40:44 +01:00
paboyle
68392ddb5b Exchange in generic
Precision change in AVX, SSE, AVX512, Generic. QPX still to do.
2017-04-13 08:38:12 +01:00
paboyle
cb6b81ae82 Half precision conversion 2017-04-12 19:32:37 +01:00
paboyle
4ed10a3d06 Merge branch 'develop' into feature/bgq-asm 2017-03-13 11:10:10 +00:00
Lanny91
7fe797daf8 SIMD vector length sanity checks 2017-02-23 16:49:44 +00:00
Lanny91
486a01294a Corrected QPX SIMD width 2017-02-23 16:47:56 +00:00
paboyle
4e7ab3166f Refactoring header layout 2017-02-22 18:09:33 +00:00
Lanny91
c80948411b Added tRotate function and MaddRealPart struct for generic SIMD, bugfix in MultRealPart and minor cosmetic changes. 2017-02-22 14:57:10 +00:00
Lanny91
95625a7bd1 Use Grid Integer type 2017-02-22 13:09:32 +00:00
Lanny91
0796696733 Emulated integer vector type for QPX and generic SIMD instruction sets. 2017-02-22 12:01:36 +00:00
paboyle
661fc4d3d1 Debug AVX512 exchange code paths 2017-02-20 17:48:36 -05:00
paboyle
f246fe3304 Improvements to avx for invertible to avoid latent bug 2017-02-16 23:52:44 +00:00
paboyle
bd600702cf Vectorise the XYZT face gathering better.
Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent
with efficiency.
2017-02-15 11:11:04 +00:00
paboyle
0883d6a7ce Overlap comms compute support; make reg naming consistent with bgq aasm 2017-02-07 00:59:32 -05:00
paboyle
4bbdfb434c Overlap comms compute modifications 2017-02-07 00:57:01 -05:00
Peter Boyle
c3b6d573b9 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2016-12-30 22:42:17 +00:00
Peter Boyle
1e179c903d Worried about integer; suspect where statements are broken 2016-12-27 17:46:38 +00:00
Peter Boyle
1caa3fbc2d LOCK UNLOCK only 2016-12-27 11:24:45 +00:00
Peter Boyle
eabf316ed9 BGQ performance ASM 2016-12-22 21:56:08 +00:00
Peter Boyle
7dc36628a1 QPX finishing 2016-12-22 17:50:48 +00:00
paboyle
3f2d53a994 BGQ assembler beginning 2016-12-20 10:21:26 +00:00
paboyle
4b220972ac Warning fix 2016-12-18 02:14:17 +00:00
paboyle
629f43e36c Return statement needed 2016-12-18 02:09:37 +00:00
paboyle
a3172b3455 Precision error 2016-12-18 02:07:45 +00:00
paboyle
f17436fec2 Bad commit fixed 2016-12-18 01:27:34 +00:00
Peter Boyle
69ae817d1c Updates for supporting Mobius better 2016-12-08 16:43:28 +00:00
Peter Boyle
e27c6b217c Updating 2016-12-01 12:42:53 +00:00
paboyle
6adf35da54 Faster Mobius 2016-12-01 11:39:04 +00:00
Lanny91
b18950f776 Added simd real divide test with QPX divide fixes 2016-11-25 13:21:33 +00:00
Lanny91
0acbf77bc6 Add QPX Div structure 2016-11-24 13:24:12 +00:00
a2cffb0304 AVXFMA target fixed 2016-11-21 17:47:18 +01:00
97cddda49e Merge branch 'feature/gen-simd' into feature/doxygen
# Conflicts:
#	Makefile.am
#	configure.ac
2016-11-19 13:11:13 +01:00
b873504b90 fully generic SIMD 2016-11-19 01:32:39 +01:00
042ae5b87c generic 256bits SIMD 2016-11-15 12:16:15 +00:00
azusayamaguchi
f7b60004f3 Merge branch 'develop' into release/v0.6.0 2016-11-04 16:08:07 +00:00