paboyle
|
8e161152e4
|
MultiRHS solver improvements with slice operations moved into lattice and sped up.
Block solver requires a lot of performance work.
|
2017-04-18 10:51:55 +01:00 |
|
paboyle
|
3141ebac10
|
MultiRHS working, starting to optimise. Block doesn't work, though I thought it already did; puzzled.
|
2017-04-17 10:50:19 +01:00 |
|
paboyle
|
7ede696126
|
Fixed compile failure in tests
|
2017-04-16 23:40:00 +01:00 |
|
paboyle
|
bf516c3b81
|
higher precision reduction variables in norm and inner product
|
2017-04-15 12:27:28 +01:00 |
|
paboyle
|
441a52ee5d
|
First cut at higher precision reduction
|
2017-04-15 10:57:21 +01:00 |
|
paboyle
|
a8db024c92
|
Cleaning up the dense matrix and lanczos sector
|
2017-04-15 08:54:11 +01:00 |
|
paboyle
|
3ca41458a3
|
Fix to no USE_FP16 case
|
2017-04-14 14:20:54 +01:00 |
|
Peter Boyle
|
951be75292
|
Half precision conversion working on AVX512 now too
|
2017-04-13 17:35:11 +01:00 |
|
Peter Boyle
|
b9113ed310
|
Patches for knl
|
2017-04-13 12:02:12 -04:00 |
|
paboyle
|
42fb49d3fd
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2017-04-13 14:12:47 +01:00 |
|
paboyle
|
db5ea001a3
|
Update to use Xcode 8.3 since -mfp16 causes SIGILL
|
2017-04-13 12:22:40 +01:00 |
|
paboyle
|
1d502e4ed6
|
FP16 optional compile time
|
2017-04-13 11:55:24 +01:00 |
|
paboyle
|
73cdf0fffe
|
Drop f16c from SSE because of a macos compile error on travis
|
2017-04-13 11:23:41 +01:00 |
|
paboyle
|
1c25773319
|
Trap illegal instructions
|
2017-04-13 10:51:40 +01:00 |
|
paboyle
|
94eb829d08
|
Align cast fixed for __m128i; gcc complained
|
2017-04-13 08:40:44 +01:00 |
|
paboyle
|
68392ddb5b
|
Exchange in generic
Precision change in AVX, SSE, AVX512, Generic. QPX still to do.
|
2017-04-13 08:38:12 +01:00 |
|
paboyle
|
cb6b81ae82
|
Half precision conversion
|
2017-04-12 19:32:37 +01:00 |
|
|
8ef4300412
|
spurious .dirstamp files removed
|
2017-04-10 17:00:22 +01:00 |
|
|
98a24ebf31
|
The macro “magic” is very intensive for the preprocessor in the measurement code, which has numerous serialisable classes. Reducing the maximum number of serialisable fields to 64 (instead of 1024) helps a lot; this is enough for now and can be extended trivially if needed in the future.
|
2017-04-10 16:58:54 +01:00 |
|
paboyle
|
b12dc89d26
|
Commenting and clean up
|
2017-04-10 20:38:20 +09:00 |
|
paboyle
|
d80d802f9d
|
MultiRHS solver test
|
2017-04-10 00:12:12 +09:00 |
|
paboyle
|
3d99b09dba
|
Start of blockCG
|
2017-04-09 23:42:10 +09:00 |
|
paboyle
|
db5f6d3ae3
|
Verbose fix
|
2017-04-09 23:41:30 +09:00 |
|
paboyle
|
683550f116
|
Const args improvement
|
2017-04-09 23:41:04 +09:00 |
|
paboyle
|
86aaa35294
|
Christoph needs SchurDiagTwoKappa which is mobius specific.
|
2017-04-07 11:07:40 +09:00 |
|
paboyle
|
5592f7b8c1
|
Creation mode better implementation
|
2017-04-05 02:35:34 +09:00 |
|
paboyle
|
35da4ece0b
|
UID fix
|
2017-04-05 02:18:15 +09:00 |
|
paboyle
|
83f6fab8fa
|
Big/Small crush test, and fast SITMO rng init, faster but not ideal
MT and Ranlux init.
|
2017-04-02 12:10:51 +09:00 |
|
paboyle
|
9dc7ca4c3b
|
Sitmo fast init
|
2017-04-02 00:28:22 +09:00 |
|
paboyle
|
935d82f5b1
|
sanity checks
|
2017-04-02 00:27:28 +09:00 |
|
paboyle
|
9cbcdd65d7
|
No random device seed
|
2017-04-02 00:26:57 +09:00 |
|
paboyle
|
7e5faa0f34
|
Multiple RNGs
|
2017-04-02 00:25:44 +09:00 |
|
paboyle
|
1c4bc7ed38
|
Debugged staggered conventions
|
2017-03-31 14:41:48 +09:00 |
|
paboyle
|
93ea5d9468
|
Pretty code
|
2017-03-30 15:00:03 +09:00 |
|
paboyle
|
9fd23faadf
|
Pretty layout
|
2017-03-30 13:44:45 +09:00 |
|
paboyle
|
10e4fa0dc8
|
Template instantiation improvements
|
2017-03-30 13:44:25 +09:00 |
|
paboyle
|
c4aca1dde4
|
Conjugate coefficients on adjoint
|
2017-03-30 13:44:05 +09:00 |
|
paboyle
|
b9e8ea3aaa
|
conjugate coefficient on the dagger
|
2017-03-30 13:43:13 +09:00 |
|
paboyle
|
077aa728b9
|
Fix the ZMobius (I think)
|
2017-03-30 13:42:09 +09:00 |
|
paboyle
|
a8d83d886e
|
Macro controls
|
2017-03-30 13:31:34 +09:00 |
|
paboyle
|
7fd46eeec4
|
Trailing whitespace removal
|
2017-03-30 13:31:10 +09:00 |
|
paboyle
|
2b115929dc
|
Small AVX512 asm ifdef patch
|
2017-03-29 18:51:23 +09:00 |
|
paboyle
|
417ec56cca
|
Release candidate
|
2017-03-29 05:45:33 -04:00 |
|
paboyle
|
756bc25008
|
Verbose header print by default
|
2017-03-29 04:44:17 -04:00 |
|
paboyle
|
35695ba57a
|
Bug fix in MPI3
|
2017-03-29 04:43:55 -04:00 |
|
paboyle
|
d805867e02
|
Better init
|
2017-03-28 13:25:05 -04:00 |
|
paboyle
|
98f9318279
|
Build on AVX2 and MPI passing with clang++
|
2017-03-28 23:16:04 +09:00 |
|
paboyle
|
4b17e8eba8
|
Merge branch 'develop' into feature/bgq-asm
Conflicts:
lib/qcd/action/fermion/Fermion.h
lib/qcd/action/fermion/WilsonFermion.cc
lib/util/Init.cc
tests/Test_cayley_even_odd_vec.cc
|
2017-03-28 04:49:30 -04:00 |
|
paboyle
|
75112a632a
|
IO improvements to fail on IO error
|
2017-03-28 02:28:04 -04:00 |
|
paboyle
|
18bde08d1b
|
Merge branch 'feature/staggering' into develop
|
2017-03-28 15:25:55 +09:00 |
|