Peter Boyle
2d3737a133
O3, KNL
2017-06-30 10:15:59 +01:00
Peter Boyle
ac1f1838bc
KNL only
2017-06-30 10:15:32 +01:00
Guido Cossu
09d09d0fe5
Update README.md
2017-06-29 11:48:11 +01:00
Guido Cossu
bf630a6821
README file update
2017-06-29 11:42:25 +01:00
Guido Cossu
8859a151cc
Small corrections to the NEON port
2017-06-29 11:30:29 +01:00
Guido Cossu
688a39cfd9
Merge pull request #114 from nmeyer-ur/feature/arm-neon
...
ARM neon intrinsics support
Guido: checked and approved
2017-06-29 09:57:17 +01:00
paboyle
6f5a5cd9b3
Improved threaded comms benchmark
2017-06-28 23:27:02 +01:00
Nils Meyer
0933aeefd4
corrected Grid_neon.h
2017-06-28 20:22:22 +02:00
Peter Boyle
322f61acee
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-06-28 15:30:35 +01:00
Peter Boyle
08e04b9676
Better benchmarks
2017-06-28 15:30:06 +01:00
feaa2ac947
Merge branch 'feature/scalar-hmc-update' into develop
2017-06-28 12:46:18 +01:00
07de925127
minor scalar action fixes
2017-06-28 12:45:44 +01:00
Nils Meyer
a9c816a268
moved file to correct folder
2017-06-27 21:39:15 +02:00
Nils Meyer
e43a8b6b8a
removed comments
2017-06-27 20:58:48 +02:00
Nils Meyer
bf729766dd
removed collision with QPX implementation
2017-06-27 20:32:24 +02:00
Guido Cossu
dafb351d38
Merge pull request #120 from paboyle/feature/scalar-hmc-update
...
Scalar HMC update.
I agree with the changes.
2017-06-27 16:23:14 +01:00
0b707b861c
Merge branch 'develop' into feature/scalar-hmc-update
2017-06-27 14:40:05 +01:00
15e87a4607
HDF5 IO fix
2017-06-27 14:39:27 +01:00
7d7220cbd7
scalar: lambda/4! convention
2017-06-27 14:38:45 +01:00
paboyle
54e94360ad
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
2017-06-24 23:10:24 +01:00
0af740dc15
minor scalar HMC code improvement
2017-06-24 23:04:05 +01:00
d2e8372df3
SU(N) algebra fix (was not working)
2017-06-24 23:03:39 +01:00
paboyle
869b99ec1e
Threaded calls to multiple communicators
2017-06-24 10:55:54 +01:00
4372d04ad4
Merge pull request #118 from Lanny91/hotfix/bgq
...
Hotfix/bgq
2017-06-23 16:59:27 +01:00
Lanny91
56abbdf4c2
AVX512 integer reduce fix (for non-intel compiler)
2017-06-23 11:09:14 +02:00
Lanny91
af71c63f4c
AVX2 fix
2017-06-23 11:03:12 +02:00
Lanny91
0440d4ce66
Merge branch 'develop' of https://github.com/paboyle/Grid into hotfix/bgq
2017-06-22 17:09:42 +02:00
b22eab8c8b
Merge commit 'a7d56523abee6c9030fdd9303c79954897b1086f' into feature/hadrons
2017-06-21 18:32:48 +01:00
paboyle
a7d56523ab
Merge branch 'feature/lanczos-simplify' into develop
2017-06-21 14:03:20 +01:00
paboyle
9e56c65730
Updated TODO list
2017-06-21 14:02:58 +01:00
paboyle
ef4f2b8c41
todo update
2017-06-21 09:22:20 +01:00
paboyle
e8b95bd35b
Clean up finished. Could shrink Lanczos to around 400 lines at a push
2017-06-21 02:50:09 +01:00
paboyle
7e35286860
Simplified lanczos, added Eigen diagonalisation.
...
Curious if we can deprecate dependency on BLAS.
Will see when we get 48^3 running on our BG/Q port
2017-06-21 02:26:03 +01:00
paboyle
0486ff8e79
Improved the Lanczos
2017-06-20 18:46:01 +01:00
1e8a2e1621
various compatibility fixes after merge
2017-06-20 17:24:55 +01:00
7587df831a
Merge branch 'develop' into feature/hadrons
...
# Conflicts:
# lib/qcd/action/scalar/ScalarImpl.h
2017-06-20 15:50:39 +01:00
Azusa Yamaguchi
e9cc21900f
Block solver complete for staggered. Now stable on mass 0.003 and
gives 8x (!) speed up on Haswell laptop vs. standard CG for 8 RHS solves.
166 iterations vs. 537 iterations so algorithmic gain + 2x in flop rate gain.
Better than a slap in the face with a wet kipper.
2017-06-20 12:37:41 +01:00
Azusa Yamaguchi
0a8faac271
Fix make tests compile
2017-06-19 22:54:18 +01:00
Azusa Yamaguchi
abc4de0fd2
No compile make tests fix
2017-06-19 22:03:03 +01:00
b672717096
Test_serialiation update for JSON
2017-06-19 14:38:39 +01:00
284ee194b1
JSON update
2017-06-19 14:38:15 +01:00
Azusa Yamaguchi
cfe3cd76d1
Block solver improvements
2017-06-19 14:04:21 +01:00
Azusa Yamaguchi
3fa5e3109f
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-06-19 14:01:44 +01:00
paboyle
8b7049f737
Improved detection of usqcdInfo for plaq/linktr
2017-06-19 08:46:07 +01:00
paboyle
c85024683e
Merge branch 'feature/parallelio' into develop
2017-06-19 01:39:48 +01:00
paboyle
1300b0b04b
Update to enable multiple records per file more consistent with SciDAC.
...
open, close, write records...
2017-06-19 01:01:48 +01:00
paboyle
e6d984b484
ILDG tests
2017-06-18 00:13:22 +01:00
paboyle
1d18d95d4f
Class name return
2017-06-18 00:13:03 +01:00
paboyle
ae39ec85a3
ComplexField defined
2017-06-18 00:12:48 +01:00
paboyle
b96daf53a0
Query tensor structures
2017-06-18 00:12:15 +01:00