Peter Boyle
b73bd151bb
Switch off counters by default
2017-06-30 10:16:35 +01:00
Peter Boyle
694b305cab
Update to reporting
2017-06-30 10:16:13 +01:00
Peter Boyle
2d3737a133
O3, KNL
2017-06-30 10:15:59 +01:00
Peter Boyle
ac1f1838bc
KNL only
2017-06-30 10:15:32 +01:00
Guido Cossu
09d09d0fe5
Update README.md
2017-06-29 11:48:11 +01:00
Guido Cossu
bf630a6821
README file update
2017-06-29 11:42:25 +01:00
Guido Cossu
8859a151cc
Small corrections to the NEON port
2017-06-29 11:30:29 +01:00
Guido Cossu
688a39cfd9
Merge pull request #114 from nmeyer-ur/feature/arm-neon
...
ARM neon intrinsics support
Guido: checked and approved
2017-06-29 09:57:17 +01:00
paboyle
6f5a5cd9b3
Improved threaded comms benchmark
2017-06-28 23:27:02 +01:00
Nils Meyer
0933aeefd4
corrected Grid_neon.h
2017-06-28 20:22:22 +02:00
Peter Boyle
322f61acee
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-06-28 15:30:35 +01:00
Peter Boyle
08e04b9676
Better benchmarks
2017-06-28 15:30:06 +01:00
feaa2ac947
Merge branch 'feature/scalar-hmc-update' into develop
2017-06-28 12:46:18 +01:00
07de925127
minor scalar action fixes
2017-06-28 12:45:44 +01:00
Nils Meyer
a9c816a268
moved file to correct folder
2017-06-27 21:39:15 +02:00
Nils Meyer
e43a8b6b8a
removed comments
2017-06-27 20:58:48 +02:00
Nils Meyer
bf729766dd
removed collision with QPX implementation
2017-06-27 20:32:24 +02:00
Guido Cossu
dafb351d38
Merge pull request #120 from paboyle/feature/scalar-hmc-update
...
Scalar HMC update.
I agree with the changes.
2017-06-27 16:23:14 +01:00
0b707b861c
Merge branch 'develop' into feature/scalar-hmc-update
2017-06-27 14:40:05 +01:00
15e87a4607
HDF5 IO fix
2017-06-27 14:39:27 +01:00
7d7220cbd7
scalar: lambda/4! convention
2017-06-27 14:38:45 +01:00
paboyle
54e94360ad
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
2017-06-24 23:10:24 +01:00
0af740dc15
minor scalar HMC code improvement
2017-06-24 23:04:05 +01:00
d2e8372df3
SU(N) algebra fix (was not working)
2017-06-24 23:03:39 +01:00
paboyle
869b99ec1e
Threaded calls to multiple communicators
2017-06-24 10:55:54 +01:00
paboyle
4a29ab0d0a
Merge branch 'feature/dwf-multirhs' of https://github.com/paboyle/Grid into feature/dwf-multirhs
2017-06-23 23:10:43 +01:00
paboyle
0165bcb58e
Added an update to TODO list
2017-06-23 23:10:24 +01:00
4372d04ad4
Merge pull request #118 from Lanny91/hotfix/bgq
...
Hotfix/bgq
2017-06-23 16:59:27 +01:00
paboyle
349d75e483
Precision fix
2017-06-23 02:57:59 -07:00
Lanny91
56abbdf4c2
AVX512 integer reduce fix (for non-intel compiler)
2017-06-23 11:09:14 +02:00
Lanny91
af71c63f4c
AVX2 fix
2017-06-23 11:03:12 +02:00
paboyle
e51475703a
Ticking off lots on the TODO list
2017-06-23 09:42:21 +01:00
paboyle
1feddf4ba6
const fixes
2017-06-22 19:32:41 +01:00
paboyle
600d7ddc2e
Proof of concept : Multi RHS solver, running independent solves on different ranks
2017-06-22 18:54:34 +01:00
paboyle
e504260f3d
Able to run a test job splitting into multiple MPI subdomains.
2017-06-22 18:53:11 +01:00
Lanny91
0440d4ce66
Merge branch 'develop' of https://github.com/paboyle/Grid into hotfix/bgq
2017-06-22 17:09:42 +02:00
paboyle
5e4bea8f20
Benchmark DWF works
2017-06-22 08:38:54 +01:00
paboyle
6ebf9f15b7
Splitting communicators first cut
2017-06-22 08:14:34 +01:00
paboyle
1d7aa673a4
Include BlockCG by default
2017-06-21 21:08:53 +01:00
paboyle
b9104f3072
Block CG
2017-06-21 21:08:03 +01:00
b22eab8c8b
Merge commit 'a7d56523abee6c9030fdd9303c79954897b1086f' into feature/hadrons
2017-06-21 18:32:48 +01:00
paboyle
a7d56523ab
Merge branch 'feature/lanczos-simplify' into develop
2017-06-21 14:03:20 +01:00
paboyle
9e56c65730
Updated TODO list
2017-06-21 14:02:58 +01:00
paboyle
ef4f2b8c41
todo update
2017-06-21 09:22:20 +01:00
paboyle
e8b95bd35b
Clean up finished. Could shrink Lanczos to around 400 lines at a push
2017-06-21 02:50:09 +01:00
paboyle
7e35286860
Simplified lanczos, added Eigen diagonalisation.
...
Curious if we can deprecate dependencly on BLAS.
Will see when we get 48^3 running on our BG/Q port
2017-06-21 02:26:03 +01:00
paboyle
0486ff8e79
Improved the lancos
2017-06-20 18:46:01 +01:00
1e8a2e1621
various compatibility fixes after merge
2017-06-20 17:24:55 +01:00
7587df831a
Merge branch 'develop' into feature/hadrons
...
# Conflicts:
# lib/qcd/action/scalar/ScalarImpl.h
2017-06-20 15:50:39 +01:00
Azusa Yamaguchi
e9cc21900f
Block solver complete for staggered. Now stable on mass 0.003 and
...
gives 8x (!) speed up on Haswell laptop vs. standard CG for 8 RHS solves.
166 iterations vs. 537 iterations so algorithmic gain + 2x in flop rate gain.
Better than a slap in the face with a wet kipper.
2017-06-20 12:37:41 +01:00