Peter Boyle
bfb68e6f02
Merge pull request #130 from giltirn/gparity-handunroll
...
Gparity handunroll
2017-09-21 10:11:00 +01:00
Christopher Kelly
59bd1fe21b
Fix for 'perm' and 'local' not being set for hand-unrolled external-site Dslash, which caused incorrect behavior of G-parity kernel
2017-08-29 13:07:37 -07:00
Christopher Kelly
74af885d4e
Removed some no-longer-needed associated with G-parity hand unrolled kernel
2017-08-29 09:50:37 -04:00
paboyle
80c5bce5bb
Merge branch 'develop' into feature/multi-communicator
2017-08-25 20:21:26 +01:00
paboyle
f68b5de9c8
No compile fix on Clang
2017-08-25 19:35:21 +01:00
Christopher Kelly
f365a83fae
In G-parity unrolled kernel, replaced calls to permute and exchange with run-time-evaluated permute type with explicit calls to appropriate underlying functions
2017-08-25 14:24:11 -04:00
Peter Boyle
c289699d9a
updated from cambridge mpi3 shakeout
2017-08-25 11:41:01 +01:00
Peter Boyle
c3b1263e75
Benchmark prep
2017-08-25 09:25:54 +01:00
Christopher Kelly
34a9aeb331
Reduced number of if-statement evaluations in G-parity unrolled kernel
2017-08-24 13:53:50 -07:00
paboyle
5fa386ddc9
FFT test compile fixed
2017-08-24 10:17:52 +01:00
Christopher Kelly
ce5df177ee
Removed superfluous implementation of G-parity twist for hand-unrolled kernel from GparityWilsonImpl
2017-08-23 15:05:22 -04:00
Christopher Kelly
a0bb8e5b46
Added hand-unrolled kernel implementations of all the other dslash precision / comms precision combinations with G-parity
2017-08-23 14:44:40 -04:00
Christopher Kelly
46f88e6d72
G-parity hand-unrolled intrinsics twist now uses one less permute and one less temporary
2017-08-23 13:21:10 -04:00
Christopher Kelly
b61835c1a5
Added inplace version of intrinsic G-parity twist to hand-unrolled kernel
2017-08-23 12:33:48 -04:00
Azusa Yamaguchi
d9cd4f0273
Staggered multinode block cg debugged. Missing global sum.
...
Code stalls and resumes on KNL at cambridge. Curious.
CG iterations 23ms each, then 3200 ms pauses. Mean bandwidth reports
as 200MB/s. Comms dominant in the report. However, the time behaviour suggests it
is *bursty*.... Could be swap to disk?
2017-08-23 15:07:18 +01:00
Christopher Kelly
061e48fd73
Replaced slow unpack-repack in G-parity BC twist with intrinsics version
2017-08-22 18:12:12 -04:00
Christopher Kelly
ab50145001
Implemented first, unoptimized version of hand-unrolled G-parity kernels
...
Improved Test_gparity
2017-08-22 17:12:25 -04:00
paboyle
a446d95c33
Trying to pass TeamCity and Travis
2017-08-20 01:10:50 +01:00
paboyle
be66e7dd95
Merge branch 'develop' into feature/multi-communicator
2017-08-19 23:12:38 +01:00
Peter Boyle
7d88198387
Merge branch 'develop' into feature/multi-communicator
2017-08-19 13:03:35 -04:00
Guido Cossu
4fe182e5a7
Added high level HMC support for overriding default SIMD lane decomposition
2017-08-06 10:46:19 +01:00
Peter Boyle
14d53e1c9e
Threaded MPI calls patches
2017-07-29 13:08:10 -04:00
Guido Cossu
c0485d799d
Explicit parameter declaration in the WilsonGauge test
2017-07-26 16:26:04 +01:00
Guido Cossu
7abc5613bd
Added smearing to the topological charge observable
2017-07-26 16:21:17 +01:00
Christopher Kelly
0f214ad427
Moved FourierAcceleratedGaugeFixer into Grid::QCD namespace and removed 'using namespace' directives from header
2017-07-21 11:13:51 -04:00
Guido Cossu
8859a151cc
Small corrections to the NEON port
2017-06-29 11:30:29 +01:00
07de925127
minor scalar action fixes
2017-06-28 12:45:44 +01:00
7d7220cbd7
scalar: lambda/4! convention
2017-06-27 14:38:45 +01:00
paboyle
54e94360ad
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
2017-06-24 23:10:24 +01:00
0af740dc15
minor scalar HMC code improvement
2017-06-24 23:04:05 +01:00
d2e8372df3
SU(N) algebra fix (was not working)
2017-06-24 23:03:39 +01:00
b22eab8c8b
Merge commit 'a7d56523abee6c9030fdd9303c79954897b1086f' into feature/hadrons
2017-06-21 18:32:48 +01:00
paboyle
0486ff8e79
Improved the lancos
2017-06-20 18:46:01 +01:00
1e8a2e1621
various compatibility fixes after merge
2017-06-20 17:24:55 +01:00
7587df831a
Merge branch 'develop' into feature/hadrons
...
# Conflicts:
# lib/qcd/action/scalar/ScalarImpl.h
2017-06-20 15:50:39 +01:00
Azusa Yamaguchi
0a8faac271
Fix make tests compile
2017-06-19 22:54:18 +01:00
Azusa Yamaguchi
3fa5e3109f
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-06-19 14:01:44 +01:00
paboyle
c85024683e
Merge branch 'feature/parallelio' into develop
2017-06-19 01:39:48 +01:00
paboyle
1300b0b04b
Update to enable multiple records per file more consistent with SciDAC.
...
open, close, write records...
2017-06-19 01:01:48 +01:00
paboyle
ae39ec85a3
ComplexField defined
2017-06-18 00:12:48 +01:00
paboyle
46879e1658
Complex defined in Impl even for gauge.
2017-06-18 00:11:45 +01:00
81b18f843a
Merge branch 'feature/scalar_adjointFT' into feature/hadrons
...
# Conflicts:
# lib/qcd/action/scalar/ScalarImpl.h
2017-06-16 17:59:55 +01:00
paboyle
3bfd1f13e6
I/O improvements
2017-06-11 23:14:10 +01:00
Azusa Yamaguchi
70ab598c96
Move gfix into utils
2017-06-08 22:22:23 +01:00
f6aa82b7f2
Merge branch 'develop' into feature/hadrons
2017-06-06 11:46:33 -05:00
0503c028be
Merge branch 'feature/qed-fvol' into feature/hadrons (non-trivial conflicts on scalar Impl)
...
# Conflicts:
# configure.ac
# lib/qcd/action/scalar/Scalar.h
2017-06-05 16:37:47 -05:00
Guido Cossu
7da4856e8e
Wilson flow with adaptive steps
2017-06-02 16:55:53 +01:00
Guido Cossu
aaf1e33a77
Adding adaptive integration in the WilsonFlow
2017-06-02 16:32:35 +01:00
paboyle
094c3d091a
Improved and RNG's now survive checkpoint
2017-06-02 00:38:58 +01:00
Guido Cossu
7c6cc85df6
Updating WilsonFlow test
2017-05-27 18:03:49 +01:00