mirror of https://github.com/paboyle/Grid.git synced 2024-11-15 02:05:37 +00:00
Commit Graph

1862 Commits

Author SHA1 Message Date
Guido Cossu
453cf2a1c6 Moving the topological charge outside the HMC related routines 2017-05-02 14:40:12 +01:00
Guido Cossu
de7bbfa5f9 Adding ParameterFile option for the HMC 2017-05-02 12:16:16 +01:00
Chulwoo Jung
867fe93018 First Rotate reorg done. 2017-05-02 01:26:22 -04:00
Chulwoo Jung
09651c3326 Checking in before rearranging Lanczos 2017-05-02 00:47:18 -04:00
Chulwoo Jung
f87f2a3f8b Merge branch 'develop' of https://github.com/paboyle/Grid into feature/Lanczos 2017-05-01 12:00:47 -04:00
Guido Cossu
74f451715f Fix for Mac compilation on the size_t uint64_t types 2017-05-01 15:12:07 +01:00
Guido Cossu
4063238943 Adding HMC test file example for Mobius + smearing 2017-05-01 13:44:00 +01:00
Guido Cossu
3344788fa1 Merge branch 'develop' into feature/hmc_generalise 2017-05-01 12:13:56 +01:00
Peter Boyle
99220f6531 Fixes and better timing 2017-04-26 17:24:11 -04:00
Peter Boyle
f8797e1e3e bug fix. works now and great face performance 2017-04-26 03:14:02 -04:00
Peter Boyle
fd1eb7de13 Clean implementation of the exterior faces listing only those points on the boundary 2017-04-26 02:34:52 -04:00
Peter Boyle
2ce898efa3 Pretty code 2017-04-26 02:34:25 -04:00
paboyle
ab66bac4e6 Think I'm getting on top of the reduced cost exterior precomputed list of links 2017-04-25 08:50:26 +01:00
paboyle
56277a11c8 Build a list of whats on the surface 2017-04-24 17:06:15 +01:00
Peter Boyle
5b55867a7a Slightly cheaper Ext assembly 2017-04-24 05:36:11 -04:00
Peter Boyle
3accb1ef89 Debugged assembly split phase with interior suppression 2017-04-23 19:30:19 -04:00
Peter Boyle
e3d0e31525 Debugged assembly split phase with interior suppression 2017-04-23 19:29:27 -04:00
Peter Boyle
5812eb8a8c Partially fixed. But the comms-overlap does not work yet. 2017-04-22 18:50:25 -04:00
paboyle
ac58565d0a Dangerous rewrite of the assembly. If I make a mistake the debug will be painful. 2017-04-22 19:31:04 +01:00
paboyle
3703b718aa Mark up a table if a given site only receives from itself; including MPI3 splitting info. 2017-04-22 19:28:37 +01:00
paboyle
b722889234 Try a better load balancing loop 2017-04-22 19:27:41 +01:00
paboyle
abba44a837 Hand unrolled for overlapped comms 2017-04-22 17:45:17 +01:00
paboyle
f301be94ce Fixed 2017-04-22 17:42:31 +01:00
Peter Boyle
1d1b225497 Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node). 2017-04-22 09:05:28 -04:00
Peter Boyle
53a785a3dd Fixing the KNL compile 2017-04-22 08:11:51 -04:00
paboyle
736bf3c866 Major rework of stencil. Half precision and MPI3 now working. 2017-04-22 11:33:50 +01:00
paboyle
b9bbe5d188 L1p config bg/q 2017-04-22 11:33:09 +01:00
paboyle
3844bcf800 If no f16c instructions supported must use software half precision conversion.
This will also become useful on BG/Q, so will move out from SSE4 into a general area.
Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed
against the intrinsics implementation yet.
2017-04-20 15:30:52 +01:00
paboyle
e1a2319d01 Simple compressor moved out of cshift into stencil 2017-04-20 13:18:15 +01:00
paboyle
180c732b4c Move compressors out of Cshift.
Slice iterators would help
2017-04-20 13:17:55 +01:00
paboyle
d2312e9874 Drop compressor entirely from Cshift to only Stencil. 2017-04-20 13:16:55 +01:00
paboyle
fc4ab9ccd5 Working half precision comms 2017-04-20 11:20:26 +01:00
paboyle
4a340aa5ca Massive compressor rework to support reduced precision comms 2017-04-20 09:28:27 +01:00
paboyle
3b7de792d5 Type comparison in the traits work 2017-04-18 13:28:04 +01:00
paboyle
557c3fa109 Pretty change 2017-04-18 13:27:38 +01:00
paboyle
8e161152e4 MultiRHS solver improvements with slice operations moved into lattice and sped up.
Block solver requires a lot of performance work.
2017-04-18 10:51:55 +01:00
paboyle
3141ebac10 MultiRHS working, starting to optimise. Block doesn't and I thought it already was; puzzled. 2017-04-17 10:50:19 +01:00
paboyle
7ede696126 Non compile of tests fixed 2017-04-16 23:40:00 +01:00
Chulwoo Jung
a07556dd5f Added back the convergence test from evecs of tridiagonal matrix. Bugfixes 2017-04-15 09:32:15 -04:00
paboyle
bf516c3b81 higher precision reduction variables in norm and inner product 2017-04-15 12:27:28 +01:00
paboyle
441a52ee5d First cut at higher precision reduction 2017-04-15 10:57:21 +01:00
paboyle
a8db024c92 Cleaning up the dense matrix and lanczos sector 2017-04-15 08:54:11 +01:00
paboyle
3ca41458a3 Fix to no USE_FP16 case 2017-04-14 14:20:54 +01:00
Peter Boyle
951be75292 Half precision conversion working on AVX512 now too 2017-04-13 17:35:11 +01:00
Peter Boyle
b9113ed310 Patches for knl 2017-04-13 12:02:12 -04:00
paboyle
42fb49d3fd Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-04-13 14:12:47 +01:00
paboyle
db5ea001a3 Update to use Xcode 8.3 since -mfp16 causes SIGILL 2017-04-13 12:22:40 +01:00
paboyle
1d502e4ed6 FP16 optional compile time 2017-04-13 11:55:24 +01:00
paboyle
73cdf0fffe Drop f16c from SSE because of a macos compile error on travis 2017-04-13 11:23:41 +01:00
paboyle
1c25773319 Trap illegal instructions 2017-04-13 10:51:40 +01:00
paboyle
94eb829d08 Align cast fixed for __mm128i gcc complained 2017-04-13 08:40:44 +01:00
paboyle
68392ddb5b Exchange in generic
Precision change in AVX, SSE, AVX512, Generic. QPX still to do.
2017-04-13 08:38:12 +01:00
paboyle
cb6b81ae82 Half precision conversion 2017-04-12 19:32:37 +01:00
53e76b41d2 Merge branch 'develop' into feature/hadrons 2017-04-10 17:00:53 +01:00
8ef4300412 spurious .dirstamp files removed 2017-04-10 17:00:22 +01:00
98a24ebf31 The macro “magics” is very intensive for the preprocessor in the measurement code which has numerous serialisable classes. Reducing the number of serialisable fields to 64 (instead of 1024) helps a lot, this is enough for now and can be extended trivially if needed in the future. 2017-04-10 16:58:54 +01:00
paboyle
b12dc89d26 Commenting and clean up 2017-04-10 20:38:20 +09:00
paboyle
d80d802f9d MultiRHS solver test 2017-04-10 00:12:12 +09:00
paboyle
3d99b09dba Start of blockCG 2017-04-09 23:42:10 +09:00
paboyle
db5f6d3ae3 Verbose fix 2017-04-09 23:41:30 +09:00
paboyle
683550f116 Const args improvement 2017-04-09 23:41:04 +09:00
Chulwoo Jung
f80a847aef Merge branch 'develop' into bugfix/dminus 2017-04-06 23:49:10 -04:00
Chulwoo Jung
93cb5d4e97 Working version of Lanczos without the extra copy. 2017-04-06 23:35:30 -04:00
Chulwoo Jung
9e48b7dfda MEM_SAVE in Lanczos seems to be working, but not pretty 2017-04-06 22:21:56 -04:00
paboyle
86aaa35294 Christoph needs SchurDiagTwoKappa which is mobius specific. 2017-04-07 11:07:40 +09:00
Guido Cossu
8c540333d5 Merge branch 'develop' into feature/hmc_generalise 2017-04-05 14:41:04 +01:00
Chulwoo Jung
d0c2c9c71f Merge branch 'develop' of https://github.com/paboyle/Grid into bugfix/dminus 2017-04-04 15:20:17 -04:00
Chulwoo Jung
c8cafa77ca Checking in the latest Lanczos 2017-04-04 15:18:12 -04:00
paboyle
5592f7b8c1 Creation mode better implementation 2017-04-05 02:35:34 +09:00
paboyle
35da4ece0b UID fix 2017-04-05 02:18:15 +09:00
ff4e54ef80 Merge branch 'develop' into feature/hadrons 2017-04-03 18:56:21 +01:00
paboyle
83f6fab8fa Big/Small crush test, and fast SITMO rng init, faster but not ideal
MT and Ranlux init.
2017-04-02 12:10:51 +09:00
paboyle
9dc7ca4c3b Sitmo fast init 2017-04-02 00:28:22 +09:00
paboyle
935d82f5b1 sanity checks 2017-04-02 00:27:28 +09:00
paboyle
9cbcdd65d7 No random device seed 2017-04-02 00:26:57 +09:00
paboyle
7e5faa0f34 Multiple RNGs 2017-04-02 00:25:44 +09:00
paboyle
1c4bc7ed38 Debugged staggered conventions 2017-03-31 14:41:48 +09:00
Chulwoo Jung
a3bcad3804 Added preconditioned SYM2 solver (SchurRedBlackDiagTwoSolve) 2017-03-30 20:33:27 -04:00
Chulwoo Jung
5a5b66292b Merge branch 'develop' of https://github.com/paboyle/Grid into bugfix/dminus 2017-03-30 10:44:02 -04:00
paboyle
93ea5d9468 Pretty code 2017-03-30 15:00:03 +09:00
paboyle
9fd23faadf Pretty layout 2017-03-30 13:44:45 +09:00
paboyle
10e4fa0dc8 Template instantiation improvements 2017-03-30 13:44:25 +09:00
paboyle
c4aca1dde4 Conjugate coefficients on adjoint 2017-03-30 13:44:05 +09:00
paboyle
b9e8ea3aaa conjugate coefficient on the dagger 2017-03-30 13:43:13 +09:00
paboyle
077aa728b9 Fix the ZMobius (I think) 2017-03-30 13:42:09 +09:00
paboyle
a8d83d886e Macro controls 2017-03-30 13:31:34 +09:00
paboyle
7fd46eeec4 Trailing whitespace removal 2017-03-30 13:31:10 +09:00
paboyle
2b115929dc Small AVX512 asm ifdef patch 2017-03-29 18:51:23 +09:00
paboyle
417ec56cca Release candidate 2017-03-29 05:45:33 -04:00
paboyle
756bc25008 Verbose header print by default 2017-03-29 04:44:17 -04:00
paboyle
35695ba57a Bug fix in MPI3 2017-03-29 04:43:55 -04:00
paboyle
d805867e02 Better init 2017-03-28 13:25:05 -04:00
paboyle
98f9318279 Build on AVX2 and MPI passing with clang++ 2017-03-28 23:16:04 +09:00
paboyle
4b17e8eba8 Merge branch 'develop' into feature/bgq-asm
Conflicts:
	lib/qcd/action/fermion/Fermion.h
	lib/qcd/action/fermion/WilsonFermion.cc
	lib/util/Init.cc
	tests/Test_cayley_even_odd_vec.cc
2017-03-28 04:49:30 -04:00
Chulwoo Jung
e63be32ad2 zmobius Meooe5D fixed? 2017-03-28 03:48:50 -04:00
paboyle
75112a632a IO improvements to fail on IO error 2017-03-28 02:28:04 -04:00
paboyle
18bde08d1b Merge branch 'feature/staggering' into develop 2017-03-28 15:25:55 +09:00
Chulwoo Jung
33d59c8869 Adding Zmobius prec test 2017-03-27 21:40:27 -04:00
Chulwoo Jung
a833fd8dbf Merge branch 'develop' of https://github.com/paboyle/Grid into bugfix/dminus 2017-03-27 21:37:26 -04:00
Guido Cossu
4c1ea8677e Small cosmetic changes and vscode gitignore 2017-03-23 14:09:35 +09:00