bf58557fb1
Block compressed Lanczos
2017-10-10 14:15:11 +01:00
d54807b8c0
MPIT works with split grid now
2017-10-02 23:14:56 +01:00
946a8671b9
Merge pull request #129 from djm2131/feature/eofa
...
Add support for DWF with the exact one flavor algorithm
2017-09-21 10:15:21 +01:00
bfb68e6f02
Merge pull request #130 from giltirn/gparity-handunroll
...
Gparity handunroll
2017-09-21 10:11:00 +01:00
59bd1fe21b
Fix for 'perm' and 'local' not being set for hand-unrolled external-site Dslash, which caused incorrect behavior of G-parity kernel
2017-08-29 13:07:37 -07:00
74af885d4e
Removed some no-longer-needed code associated with the G-parity hand-unrolled kernel
2017-08-29 09:50:37 -04:00
80c5bce5bb
Merge branch 'develop' into feature/multi-communicator
2017-08-25 20:21:26 +01:00
f68b5de9c8
No compile fix on Clang
2017-08-25 19:35:21 +01:00
f365a83fae
In G-parity unrolled kernel, replaced calls to permute and exchange that take a run-time-evaluated permute type with explicit calls to the appropriate underlying functions
2017-08-25 14:24:11 -04:00
c289699d9a
Updated from Cambridge MPI3 shakeout
2017-08-25 11:41:01 +01:00
c3b1263e75
Benchmark prep
2017-08-25 09:25:54 +01:00
34a9aeb331
Reduced number of if-statement evaluations in G-parity unrolled kernel
2017-08-24 13:53:50 -07:00
ce5df177ee
Removed superfluous implementation of G-parity twist for hand-unrolled kernel from GparityWilsonImpl
2017-08-23 15:05:22 -04:00
a0bb8e5b46
Added hand-unrolled kernel implementations of all the other dslash precision / comms precision combinations with G-parity
2017-08-23 14:44:40 -04:00
46f88e6d72
G-parity hand-unrolled intrinsics twist now uses one fewer permute and one fewer temporary
2017-08-23 13:21:10 -04:00
dd8f1ea189
Vectorized Mobius EOFA Dperp + shift operation
2017-08-23 13:17:26 -04:00
b61835c1a5
Added inplace version of intrinsic G-parity twist to hand-unrolled kernel
2017-08-23 12:33:48 -04:00
d9cd4f0273
Staggered multinode block CG debugged; a global sum was missing.
...
Code stalls and resumes on KNL at Cambridge. Curious.
CG iterations take 23 ms each, then there are 3200 ms pauses. Mean bandwidth is reported
as 200 MB/s. Comms are dominant in the report. However, the time behaviour suggests it
is *bursty*. Could it be swap to disk?
2017-08-23 15:07:18 +01:00
459f70e8d4
Check-in of working Mobius EOFA class and tests
2017-08-22 22:38:30 -04:00
061e48fd73
Replaced slow unpack-repack in G-parity BC twist with intrinsics version
2017-08-22 18:12:12 -04:00
ab50145001
Implemented first, unoptimized version of hand-unrolled G-parity kernels
...
Improved Test_gparity
2017-08-22 17:12:25 -04:00
a446d95c33
Trying to pass TeamCity and Travis
2017-08-20 01:10:50 +01:00
9d45fca8bc
Implement MobiusEOFAFermioncache.cc
2017-08-17 23:45:36 -04:00
ac9e6b63c0
More re-import of Mobius EOFA
2017-08-17 19:28:53 -04:00
e140b3f802
Beginning to re-import Mobius EOFA
2017-08-16 23:36:23 -04:00
d9d3d30cc7
Minor clean-up
2017-08-16 20:57:51 -04:00
6d0786ff9d
Typo fixes and check-in of G-parity action test for DWF
2017-08-15 22:47:00 -04:00
b7f93aeb4d
Change CayleyFermion5D::SetCoefficientsInternal to virtual to allow overriding in derived EOFA classes
2017-08-15 14:18:51 -04:00
202a7fe900
Re-import DWF and abstract base EOFA fermion classes and tests
2017-08-15 13:36:08 -04:00
14d53e1c9e
Threaded MPI calls patches
2017-07-29 13:08:10 -04:00
54e94360ad
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
2017-06-24 23:10:24 +01:00
7587df831a
Merge branch 'develop' into feature/hadrons
...
# Conflicts:
# lib/qcd/action/scalar/ScalarImpl.h
2017-06-20 15:50:39 +01:00
46879e1658
Complex defined in Impl even for gauge.
2017-06-18 00:11:45 +01:00
0503c028be
Merge branch 'feature/qed-fvol' into feature/hadrons (non-trivial conflicts on scalar Impl)
...
# Conflicts:
# configure.ac
# lib/qcd/action/scalar/Scalar.h
2017-06-05 16:37:47 -05:00
9c12c37aaf
Confirming the fix on the complex boundary conditions
2017-05-09 08:41:29 +01:00
529e78d43f
Restart the v0.7.0 release
2017-05-08 18:20:04 +01:00
2439999ec8
Warning elimination; drop to -O2 on G++ bad versions
2017-05-06 14:44:49 +01:00
1d96f662e3
Fixed 4d fermion gparity force. Put strong tests on make check force tests
2017-05-06 00:46:31 +01:00
20999c1370
Merge branch 'develop' into feature/hmc_generalise
2017-05-05 12:47:17 +01:00
78ef10e60f
Mobius force improvement
2017-05-04 19:53:21 +01:00
90f6bc16bb
No compile clang fix
2017-05-04 12:15:06 +01:00
422cdf4979
Some checks
2017-05-03 18:37:38 -04:00
38db174f3b
Print statement
2017-05-03 18:25:26 -04:00
4063238943
Adding HMC test file example for Mobius + smearing
2017-05-01 13:44:00 +01:00
3344788fa1
Merge branch 'develop' into feature/hmc_generalise
2017-05-01 12:13:56 +01:00
99220f6531
Fixes and better timing
2017-04-26 17:24:11 -04:00
f8797e1e3e
Bug fix. Works now, with great face performance
2017-04-26 03:14:02 -04:00
fd1eb7de13
Clean implementation of the exterior faces, listing only those points on the boundary
2017-04-26 02:34:52 -04:00
2ce898efa3
Pretty code
2017-04-26 02:34:25 -04:00
ab66bac4e6
Think I'm getting on top of the reduced cost exterior precomputed list of links
2017-04-25 08:50:26 +01:00