paboyle
dee68fc728
IO working multiple nodes again. Strategy of all nodes writing metadata is unsafe.
...
Only one rank should do this. must identify this rank. Means pass communicator to the
Objects.
2017-07-02 23:33:48 +01:00
paboyle
57002924bc
NERSC shakeout of this
2017-07-02 14:58:30 -07:00
Peter Boyle
a0be3f7330
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-06-30 10:53:50 +01:00
Peter Boyle
b5a6e4f1fd
Best option for Xeon cache blocking set
2017-06-30 10:53:22 +01:00
Peter Boyle
7a788db3dc
Guard first touch
2017-06-30 10:49:08 +01:00
Peter Boyle
f20eceb6cd
First touch once per page in a threaded loop
2017-06-30 10:48:27 +01:00
Peter Boyle
38325ebbc6
Interleave code path; not enabled
2017-06-30 10:23:51 +01:00
Peter Boyle
ac1f1838bc
KNL only
2017-06-30 10:15:32 +01:00
Guido Cossu
8859a151cc
Small corrections to the NEON port
2017-06-29 11:30:29 +01:00
Guido Cossu
688a39cfd9
Merge pull request #114 from nmeyer-ur/feature/arm-neon
...
ARM neon intrinsics support
Guido: checked and approved
2017-06-29 09:57:17 +01:00
Nils Meyer
0933aeefd4
corrected Grid_neon.h
2017-06-28 20:22:22 +02:00
07de925127
minor scalar action fixes
2017-06-28 12:45:44 +01:00
Nils Meyer
a9c816a268
moved file to correct folder
2017-06-27 21:39:15 +02:00
Nils Meyer
bf729766dd
removed collision with QPX implementation
2017-06-27 20:32:24 +02:00
0b707b861c
Merge branch 'develop' into feature/scalar-hmc-update
2017-06-27 14:40:05 +01:00
15e87a4607
HDF5 IO fix
2017-06-27 14:39:27 +01:00
7d7220cbd7
scalar: lambda/4! convention
2017-06-27 14:38:45 +01:00
paboyle
54e94360ad
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
2017-06-24 23:10:24 +01:00
0af740dc15
minor scalar HMC code improvement
2017-06-24 23:04:05 +01:00
d2e8372df3
SU(N) algebra fix (was not working)
2017-06-24 23:03:39 +01:00
paboyle
869b99ec1e
Threaded calls to multiple communicators
2017-06-24 10:55:54 +01:00
paboyle
349d75e483
Precision fix
2017-06-23 02:57:59 -07:00
Lanny91
56abbdf4c2
AVX512 integer reduce fix (for non-intel compiler)
2017-06-23 11:09:14 +02:00
Lanny91
af71c63f4c
AVX2 fix
2017-06-23 11:03:12 +02:00
paboyle
1feddf4ba6
const fixes
2017-06-22 19:32:41 +01:00
paboyle
e504260f3d
Able to run a test job splitting into multiple MPI subdomains.
2017-06-22 18:53:11 +01:00
Lanny91
0440d4ce66
Merge branch 'develop' of https://github.com/paboyle/Grid into hotfix/bgq
2017-06-22 17:09:42 +02:00
paboyle
5e4bea8f20
Benchmark DWF works
2017-06-22 08:38:54 +01:00
paboyle
6ebf9f15b7
Splitting communicators first cut
2017-06-22 08:14:34 +01:00
paboyle
b9104f3072
Block CG
2017-06-21 21:08:03 +01:00
b22eab8c8b
Merge commit 'a7d56523abee6c9030fdd9303c79954897b1086f' into feature/hadrons
2017-06-21 18:32:48 +01:00
paboyle
e8b95bd35b
Clean up finished. Could shrink Lanczos to around 400 lines at a push
2017-06-21 02:50:09 +01:00
paboyle
7e35286860
Simplified lanczos, added Eigen diagonalisation.
...
Curious if we can deprecate dependencly on BLAS.
Will see when we get 48^3 running on our BG/Q port
2017-06-21 02:26:03 +01:00
paboyle
0486ff8e79
Improved the lancos
2017-06-20 18:46:01 +01:00
1e8a2e1621
various compatibility fixes after merge
2017-06-20 17:24:55 +01:00
7587df831a
Merge branch 'develop' into feature/hadrons
...
# Conflicts:
# lib/qcd/action/scalar/ScalarImpl.h
2017-06-20 15:50:39 +01:00
Azusa Yamaguchi
e9cc21900f
Block solver complete for staggered. Now stable on mass 0.003 and
...
gives 8x (!) speed up on Haswell laptop vs. standard CG for 8 RHS solves.
166 iterations vs. 537 iterations so algorithmic gain + 2x in flop rate gain.
Better than a slap in the face with a wet kipper.
2017-06-20 12:37:41 +01:00
Azusa Yamaguchi
0a8faac271
Fix make tests compile
2017-06-19 22:54:18 +01:00
Azusa Yamaguchi
abc4de0fd2
No compile make tests fix
2017-06-19 22:03:03 +01:00
284ee194b1
JSON update
2017-06-19 14:38:15 +01:00
Azusa Yamaguchi
cfe3cd76d1
Block solver improvements
2017-06-19 14:04:21 +01:00
Azusa Yamaguchi
3fa5e3109f
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-06-19 14:01:44 +01:00
paboyle
8b7049f737
Improved detectino of usqcdInfo for plaq/linktr
2017-06-19 08:46:07 +01:00
paboyle
c85024683e
Merge branch 'feature/parallelio' into develop
2017-06-19 01:39:48 +01:00
paboyle
1300b0b04b
Update to enable multiple records per file more consistent with SciDAC.
...
open, close, write records...
2017-06-19 01:01:48 +01:00
paboyle
1d18d95d4f
Class name return
2017-06-18 00:13:03 +01:00
paboyle
ae39ec85a3
ComplexField defined
2017-06-18 00:12:48 +01:00
paboyle
b96daf53a0
Query tensor structures
2017-06-18 00:12:15 +01:00
paboyle
46879e1658
Complex defined in Impl even for gauge.
2017-06-18 00:11:45 +01:00
paboyle
ae4de94798
SciDAC I/O support
2017-06-18 00:11:23 +01:00