Christopher Kelly
74af885d4e
Removed some no-longer-needed associated with G-parity hand unrolled kernel
2017-08-29 09:50:37 -04:00
paboyle
4b4c2a715b
fcntl.h needed
2017-08-26 11:38:04 +01:00
paboyle
54a5e6c1d0
Check if we get huge pages on linux. Larry Meadows piece of magic.
2017-08-25 22:36:08 +01:00
paboyle
80c5bce5bb
Merge branch 'develop' into feature/multi-communicator
2017-08-25 20:21:26 +01:00
paboyle
f68b5de9c8
No compile fix on Clang
2017-08-25 19:35:21 +01:00
Christopher Kelly
f365a83fae
In G-parity unrolled kernel, replaced calls to permute and exchange with run-time-evaluated permute type with explicit calls to appropriate underlying functions
2017-08-25 14:24:11 -04:00
Peter Boyle
c289699d9a
updated from cambridge mpi3 shakeout
2017-08-25 11:41:01 +01:00
Peter Boyle
c3b1263e75
Benchmark prep
2017-08-25 09:25:54 +01:00
Christopher Kelly
34a9aeb331
Reduced number of if-statement evaluations in G-parity unrolled kernel
2017-08-24 13:53:50 -07:00
paboyle
5fa386ddc9
FFT test compile fixed
2017-08-24 10:17:52 +01:00
Christopher Kelly
ce5df177ee
Removed superfluous implementation of G-parity twist for hand-unrolled kernel from GparityWilsonImpl
2017-08-23 15:05:22 -04:00
Christopher Kelly
a0bb8e5b46
Added hand-unrolled kernel implementations of all the other dslash precision / comms precision combinations with G-parity
2017-08-23 14:44:40 -04:00
Christopher Kelly
46f88e6d72
G-parity hand-unrolled intrinsics twist now uses one less permute and one less temporary
2017-08-23 13:21:10 -04:00
David Murphy
dd8f1ea189
Vectorized Mobius EOFA Dperp + shift operation
2017-08-23 13:17:26 -04:00
Christopher Kelly
b61835c1a5
Added inplace version of intrinsic G-parity twist to hand-unrolled kernel
2017-08-23 12:33:48 -04:00
Azusa Yamaguchi
d9cd4f0273
Staggered multinode block cg debugged. Missing global sum.
...
Code stalls and resumes on KNL at cambridge. Curious.
CG iterations 23ms each, then 3200 ms pauses. Mean bandwidth reports
as 200MB/s. Comms dominant in the report. However, the time behaviour suggests it
is *bursty*.... Could be swap to disk?
2017-08-23 15:07:18 +01:00
David Murphy
459f70e8d4
Check-in of working Mobius EOFA class and tests
2017-08-22 22:38:30 -04:00
Christopher Kelly
061e48fd73
Replaced slow unpack-repack in G-parity BC twist with intrinsics version
2017-08-22 18:12:12 -04:00
Christopher Kelly
ab50145001
Implemented first, unoptimized version of hand-unrolled G-parity kernels
...
Improved Test_gparity
2017-08-22 17:12:25 -04:00
paboyle
b49bec0cec
MAP_HUGETLB portability fix
2017-08-20 03:08:54 +01:00
paboyle
1cdf999668
Moving multicommunicator into mpi3 also for threading
2017-08-20 02:39:10 +01:00
paboyle
11062fb686
Comms none fail fix
2017-08-20 01:37:07 +01:00
paboyle
a446d95c33
Trying to pass TeamCity and Travis
2017-08-20 01:10:50 +01:00
paboyle
be66e7dd95
Merge branch 'develop' into feature/multi-communicator
2017-08-19 23:12:38 +01:00
Peter Boyle
0b0cf62193
Fix mpi 3 interface change
2017-08-19 13:18:50 -04:00
Peter Boyle
7d88198387
Merge branch 'develop' into feature/multi-communicator
2017-08-19 13:03:35 -04:00
Peter Boyle
2f619482b8
Enable blocking stencil send
2017-08-19 12:53:59 -04:00
Peter Boyle
d6472eda8d
Use mmap
2017-08-19 12:53:18 -04:00
Peter Boyle
bcefdd7c4e
Align both allocator calls to 2MB
2017-08-19 12:49:02 -04:00
Chulwoo Jung
0145685f96
Added Staggered Type Preconditioned operator
2017-08-18 01:44:31 -04:00
David Murphy
9d45fca8bc
Implement MobiusEOFAFermioncache.cc
2017-08-17 23:45:36 -04:00
David Murphy
ac9e6b63c0
More re-import of Mobius EOFA
2017-08-17 19:28:53 -04:00
David Murphy
e140b3f802
Beginning to re-import Mobius EOFA
2017-08-16 23:36:23 -04:00
David Murphy
d9d3d30cc7
Minor clean-up
2017-08-16 20:57:51 -04:00
David Murphy
47a12ec7b5
Implement EOFA pseudofermion force and Shamir tests for G-parity and non G-parity cases
2017-08-16 19:50:08 -04:00
David Murphy
ec1e2f7a40
Add (mostly implemented) ExactOneFlavourRatio pseudofermion class and tests of Shamir heatbath and action
2017-08-16 12:38:59 -04:00
David Murphy
41f73ec083
Add ChronoForecast class for forecasting solutions across poles in the EOFA heatbath
2017-08-16 12:37:38 -04:00
Guido Cossu
fd367d8bfd
Debugging the PointerCache
2017-08-16 09:42:57 +01:00
David Murphy
6d0786ff9d
Typo fixes and check-in of G-parity action test for DWF
2017-08-15 22:47:00 -04:00
David Murphy
b7f93aeb4d
Change CayleyFermion5D::SetCoefficientsInternal to virtual to allow overriding in derived EOFA classes
2017-08-15 14:18:51 -04:00
David Murphy
202a7fe900
Re-import DWF and abstract base EOFA fermion classes and tests
2017-08-15 13:36:08 -04:00
Chulwoo Jung
e73e4b4002
Minor changes fixes
2017-08-11 01:35:25 -04:00
Guido Cossu
8a3fe60a27
Added more asserts at grid creation time
2017-08-08 11:36:20 +01:00
Guido Cossu
44051aecd1
Checking for integer divisions in cartesian full
2017-08-08 10:31:12 +01:00
Guido Cossu
06e6f8de00
Check that the reduced dim is an integer
2017-08-08 10:22:12 +01:00
Chulwoo Jung
caa6605b43
Still tweaking memory saving routines in Lanczos
2017-08-07 00:01:04 -04:00
Chulwoo Jung
522c9248ae
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/Lanczos
2017-08-06 23:58:21 -04:00
Guido Cossu
4fe182e5a7
Added high level HMC support for overriding default SIMD lane decomposition
2017-08-06 10:46:19 +01:00
Guido Cossu
175f393f9d
Binary IO error checking
2017-08-04 12:14:10 +01:00
Christopher Kelly
7d867a8134
Merge branch 'develop' into feature/CG-reliable-update
2017-08-02 09:48:04 -04:00
Christopher Kelly
9939b267d2
Added switching to fallback linear operator in reliable update CG, and added recalculation of b parameter on update.
2017-07-31 13:39:44 -04:00
Peter Boyle
14d53e1c9e
Threaded MPI calls patches
2017-07-29 13:08:10 -04:00
Chulwoo Jung
191fbf85fc
Added ImplicitlyRestartedLanczosCJ to Algorithms.h
2017-07-28 15:33:59 -04:00
Guido Cossu
8bd869da37
Correcting a bug in the IO routines
2017-07-27 15:12:50 +01:00
Guido Cossu
c0485d799d
Explicit parameter declaration in the WilsonGauge test
2017-07-26 16:26:04 +01:00
Guido Cossu
7abc5613bd
Added smearing to the topological charge observable
2017-07-26 16:21:17 +01:00
Guido Cossu
a4b7dddb67
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-07-26 12:07:38 +01:00
Guido Cossu
5696781862
Debug error in Tensor mult
2017-07-26 12:07:34 +01:00
Christopher Kelly
9b6cde173f
Merge branch 'feature/CG-reliable-update' into ckelly_develop
2017-07-25 11:51:08 -04:00
Christopher Kelly
9f280b82c4
Added mixed-precision CG with reliable updates
2017-07-25 11:30:41 -04:00
Chulwoo Jung
93650f3a61
Adding back (temporarily) dense matrix routines until Lanczos is fininalized
2017-07-24 21:49:25 -04:00
Chulwoo Jung
cab4b4d063
Deleting old include file references
2017-07-24 20:51:31 -04:00
Chulwoo Jung
cf4b30b2dd
re-adding ImplcitlyRestartedLanczos
2017-07-24 20:40:25 -04:00
Chulwoo Jung
c51d0b4078
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/Lanczos
2017-07-24 20:35:29 -04:00
Nils Meyer
7a53dc3715
Added integer reduce functionality
2017-07-24 11:12:59 +02:00
Christopher Kelly
0f214ad427
Moved FourierAcceleratedGaugeFixer into Grid::QCD namespace and removed 'using namespace' directives from header
2017-07-21 11:13:51 -04:00
Guido Cossu
9fa07eecde
Merge branch 'develop' into feature/json-fix
2017-07-12 15:47:22 +01:00
azusayamaguchi
659d7d1a40
For test/solver
...
Fixed
2017-07-12 15:01:48 +01:00
Guido Cossu
f64fb7bd77
Fix gcc error on JSON compilation
2017-07-12 14:55:42 +01:00
Guido Cossu
2a35449b91
Merge branch 'develop' into feature/json-fix
2017-07-12 14:47:00 +01:00
Guido Cossu
184af5bd05
Added support for std::pair in the JSON serialiser
2017-07-12 14:44:53 +01:00
Guido Cossu
097c9637ee
Fixed the JSON parsing error
2017-07-11 14:31:57 +01:00
azusayamaguchi
dc6f078246
fixed the header file for mpi3
2017-07-11 14:15:08 +01:00
Peter Boyle
40e119c61c
NUMA improvements worth preserving from AMD EPYC tests
2017-07-08 22:27:11 -04:00
Guido Cossu
d9593c4b81
Merge branch 'develop' into feature/json-fix
2017-07-07 14:17:50 +01:00
paboyle
75dc7794b9
Working on Cori
2017-07-02 16:47:42 -07:00
paboyle
dee68fc728
IO working multiple nodes again. Strategy of all nodes writing metadata is unsafe.
...
Only one rank should do this. must identify this rank. Means pass communicator to the
Objects.
2017-07-02 23:33:48 +01:00
paboyle
57002924bc
NERSC shakeout of this
2017-07-02 14:58:30 -07:00
Peter Boyle
a0be3f7330
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-06-30 10:53:50 +01:00
Peter Boyle
b5a6e4f1fd
Best option for Xeon cache blocking set
2017-06-30 10:53:22 +01:00
Peter Boyle
7a788db3dc
Guard first touch
2017-06-30 10:49:08 +01:00
Peter Boyle
f20eceb6cd
First touch once per page in a threaded loop
2017-06-30 10:48:27 +01:00
Peter Boyle
38325ebbc6
Interleave code path; not enabled
2017-06-30 10:23:51 +01:00
Peter Boyle
ac1f1838bc
KNL only
2017-06-30 10:15:32 +01:00
Guido Cossu
8859a151cc
Small corrections to the NEON port
2017-06-29 11:30:29 +01:00
Guido Cossu
688a39cfd9
Merge pull request #114 from nmeyer-ur/feature/arm-neon
...
ARM neon intrinsics support
Guido: checked and approved
2017-06-29 09:57:17 +01:00
Nils Meyer
0933aeefd4
corrected Grid_neon.h
2017-06-28 20:22:22 +02:00
07de925127
minor scalar action fixes
2017-06-28 12:45:44 +01:00
Nils Meyer
a9c816a268
moved file to correct folder
2017-06-27 21:39:15 +02:00
Nils Meyer
bf729766dd
removed collision with QPX implementation
2017-06-27 20:32:24 +02:00
0b707b861c
Merge branch 'develop' into feature/scalar-hmc-update
2017-06-27 14:40:05 +01:00
15e87a4607
HDF5 IO fix
2017-06-27 14:39:27 +01:00
7d7220cbd7
scalar: lambda/4! convention
2017-06-27 14:38:45 +01:00
paboyle
54e94360ad
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
2017-06-24 23:10:24 +01:00
0af740dc15
minor scalar HMC code improvement
2017-06-24 23:04:05 +01:00
d2e8372df3
SU(N) algebra fix (was not working)
2017-06-24 23:03:39 +01:00
paboyle
869b99ec1e
Threaded calls to multiple communicators
2017-06-24 10:55:54 +01:00
paboyle
349d75e483
Precision fix
2017-06-23 02:57:59 -07:00
Lanny91
56abbdf4c2
AVX512 integer reduce fix (for non-intel compiler)
2017-06-23 11:09:14 +02:00
Lanny91
af71c63f4c
AVX2 fix
2017-06-23 11:03:12 +02:00