mirror of https://github.com/paboyle/Grid.git synced 2026-04-20 02:31:01 +01:00
Commit Graph

1672 Commits

Author SHA1 Message Date
paboyle 4a340aa5ca Massive compressor rework to support reduced precision comms 2017-04-20 09:28:27 +01:00
paboyle 3b7de792d5 Type comparison in the traits work 2017-04-18 13:28:04 +01:00
paboyle 557c3fa109 Pretty change 2017-04-18 13:27:38 +01:00
paboyle 8e161152e4 MultiRHS solver improvements with slice operations moved into lattice and sped up. Block solver requires a lot of performance work. 2017-04-18 10:51:55 +01:00
paboyle 3141ebac10 MultiRHS working, starting to optimise. Block doesn't and I thought it already was; puzzled. 2017-04-17 10:50:19 +01:00
paboyle 7ede696126 Non compile of tests fixed 2017-04-16 23:40:00 +01:00
paboyle bf516c3b81 higher precision reduction variables in norm and inner product 2017-04-15 12:27:28 +01:00
paboyle 441a52ee5d First cut at higher precision reduction 2017-04-15 10:57:21 +01:00
paboyle a8db024c92 Cleaning up the dense matrix and lanczos sector 2017-04-15 08:54:11 +01:00
paboyle 3ca41458a3 Fix to no USE_FP16 case 2017-04-14 14:20:54 +01:00
Peter Boyle 951be75292 Half precision conversion working on AVX512 now too 2017-04-13 17:35:11 +01:00
Peter Boyle b9113ed310 Patches for knl 2017-04-13 12:02:12 -04:00
paboyle 42fb49d3fd Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-04-13 14:12:47 +01:00
paboyle db5ea001a3 Update to use Xcode 8.3 since -mfp16 causes SIGILL 2017-04-13 12:22:40 +01:00
paboyle 1d502e4ed6 FP16 optional compile time 2017-04-13 11:55:24 +01:00
paboyle 73cdf0fffe Drop f16c from SSE because of a macos compile error on travis 2017-04-13 11:23:41 +01:00
paboyle 1c25773319 Trap illegal instructions 2017-04-13 10:51:40 +01:00
paboyle 94eb829d08 Align cast fixed for __mm128i gcc complained 2017-04-13 08:40:44 +01:00
paboyle 68392ddb5b Exchange in generic. Precision change in AVX, SSE, AVX512, Generic. QPX still to do. 2017-04-13 08:38:12 +01:00
paboyle cb6b81ae82 Half precision conversion 2017-04-12 19:32:37 +01:00
portelli 8ef4300412 spurious .dirstamp files removed 2017-04-10 17:00:22 +01:00
portelli 98a24ebf31 The macro “magics” is very intensive for the preprocessor in the measurement code which has numerous serialisable classes. Reducing the number of serialisable fields to 64 (instead of 1024) helps a lot, this is enough for now and can be extended trivially if needed in the future. 2017-04-10 16:58:54 +01:00
paboyle b12dc89d26 Commenting and clean up 2017-04-10 20:38:20 +09:00
paboyle d80d802f9d MultiRHS solver test 2017-04-10 00:12:12 +09:00
paboyle 3d99b09dba Start of blockCG 2017-04-09 23:42:10 +09:00
paboyle db5f6d3ae3 Verbose fix 2017-04-09 23:41:30 +09:00
paboyle 683550f116 Const args improvement 2017-04-09 23:41:04 +09:00
paboyle 86aaa35294 Christoph needs SchurDiagTwoKappa which is mobius specific. 2017-04-07 11:07:40 +09:00
paboyle 5592f7b8c1 Creation mode better implementation 2017-04-05 02:35:34 +09:00
paboyle 35da4ece0b UID fix 2017-04-05 02:18:15 +09:00
paboyle 83f6fab8fa Big/Small crush test, and fast SITMO rng init, faster but not ideal MT and Ranlux init. 2017-04-02 12:10:51 +09:00
paboyle 9dc7ca4c3b Sitmo fast init 2017-04-02 00:28:22 +09:00
paboyle 935d82f5b1 sanity checks 2017-04-02 00:27:28 +09:00
paboyle 9cbcdd65d7 No random device seed 2017-04-02 00:26:57 +09:00
paboyle 7e5faa0f34 Multiple RNGs 2017-04-02 00:25:44 +09:00
paboyle 1c4bc7ed38 Debugged staggered conventions 2017-03-31 14:41:48 +09:00
paboyle 93ea5d9468 Pretty code 2017-03-30 15:00:03 +09:00
paboyle 9fd23faadf Pretty layout 2017-03-30 13:44:45 +09:00
paboyle 10e4fa0dc8 Template instantiation improvements 2017-03-30 13:44:25 +09:00
paboyle c4aca1dde4 Conjugate coefficients on adjoint 2017-03-30 13:44:05 +09:00
paboyle b9e8ea3aaa conjugate coefficient on the dagger 2017-03-30 13:43:13 +09:00
paboyle 077aa728b9 Fix the ZMobius (I think) 2017-03-30 13:42:09 +09:00
paboyle a8d83d886e Macro controls 2017-03-30 13:31:34 +09:00
paboyle 7fd46eeec4 Trailing whitespace removal 2017-03-30 13:31:10 +09:00
paboyle 2b115929dc Small AVX512 asm ifdef patch 2017-03-29 18:51:23 +09:00
paboyle 417ec56cca Release candidate 2017-03-29 05:45:33 -04:00
paboyle 756bc25008 Verbose header print by default 2017-03-29 04:44:17 -04:00
paboyle 35695ba57a Bug fix in MPI3 2017-03-29 04:43:55 -04:00
paboyle d805867e02 Better init 2017-03-28 13:25:05 -04:00
paboyle 98f9318279 Build on AVX2 and MPI passing with clang++ 2017-03-28 23:16:04 +09:00