mirror of https://github.com/paboyle/Grid.git synced 2024-11-15 10:15:36 +00:00
Commit Graph

2157 Commits

Author SHA1 Message Date
Nils Meyer
4e907fef2c Merge remote-tracking branch 'grid/develop' into feature/arm-neon 2017-08-29 17:47:36 +02:00
Christopher Kelly
74af885d4e Removed some no-longer-needed associated with G-parity hand unrolled kernel 2017-08-29 09:50:37 -04:00
paboyle
4b4c2a715b fcntl.h needed 2017-08-26 11:38:04 +01:00
paboyle
54a5e6c1d0 Check if we get huge pages on linux. Larry Meadows piece of magic. 2017-08-25 22:36:08 +01:00
paboyle
80c5bce5bb Merge branch 'develop' into feature/multi-communicator 2017-08-25 20:21:26 +01:00
paboyle
f68b5de9c8 No compile fix on Clang 2017-08-25 19:35:21 +01:00
Christopher Kelly
f365a83fae In G-parity unrolled kernel, replaced calls to permute and exchange with run-time-evaluated permute type with explicit calls to appropriate underlying functions 2017-08-25 14:24:11 -04:00
Peter Boyle
c289699d9a updated from cambridge mpi3 shakeout 2017-08-25 11:41:01 +01:00
Peter Boyle
c3b1263e75 Benchmark prep 2017-08-25 09:25:54 +01:00
Christopher Kelly
34a9aeb331 Reduced number of if-statement evaluations in G-parity unrolled kernel 2017-08-24 13:53:50 -07:00
paboyle
5fa386ddc9 FFT test compile fixed 2017-08-24 10:17:52 +01:00
Christopher Kelly
ce5df177ee Removed superfluous implementation of G-parity twist for hand-unrolled kernel from GparityWilsonImpl 2017-08-23 15:05:22 -04:00
Christopher Kelly
a0bb8e5b46 Added hand-unrolled kernel implementations of all the other dslash precision / comms precision combinations with G-parity 2017-08-23 14:44:40 -04:00
Christopher Kelly
46f88e6d72 G-parity hand-unrolled intrinsics twist now uses one less permute and one less temporary 2017-08-23 13:21:10 -04:00
David Murphy
dd8f1ea189 Vectorized Mobius EOFA Dperp + shift operation 2017-08-23 13:17:26 -04:00
Christopher Kelly
b61835c1a5 Added inplace version of intrinsic G-parity twist to hand-unrolled kernel 2017-08-23 12:33:48 -04:00
Azusa Yamaguchi
d9cd4f0273 Staggered multinode block cg debugged. Missing global sum.
Code stalls and resumes on KNL at cambridge. Curious.

CG iterations 23ms each, then 3200 ms pauses. Mean bandwidth reports
as 200MB/s. Comms dominant in the report. However, the time behaviour suggests it
is *bursty*.... Could be swap to disk?
2017-08-23 15:07:18 +01:00
David Murphy
459f70e8d4 Check-in of working Mobius EOFA class and tests 2017-08-22 22:38:30 -04:00
Christopher Kelly
061e48fd73 Replaced slow unpack-repack in G-parity BC twist with intrinsics version 2017-08-22 18:12:12 -04:00
Christopher Kelly
ab50145001 Implemented first, unoptimized version of hand-unrolled G-parity kernels
Improved Test_gparity
2017-08-22 17:12:25 -04:00
paboyle
b49bec0cec MAP_HUGETLB portability fix 2017-08-20 03:08:54 +01:00
paboyle
1cdf999668 Moving multicommunicator into mpi3 also for threading 2017-08-20 02:39:10 +01:00
paboyle
11062fb686 Comms none fail fix 2017-08-20 01:37:07 +01:00
paboyle
a446d95c33 Trying to pass TeamCity and Travis 2017-08-20 01:10:50 +01:00
paboyle
be66e7dd95 Merge branch 'develop' into feature/multi-communicator 2017-08-19 23:12:38 +01:00
Peter Boyle
0b0cf62193 Fix mpi 3 interface change 2017-08-19 13:18:50 -04:00
Peter Boyle
7d88198387 Merge branch 'develop' into feature/multi-communicator 2017-08-19 13:03:35 -04:00
Peter Boyle
2f619482b8 Enable blocking stencil send 2017-08-19 12:53:59 -04:00
Peter Boyle
d6472eda8d Use mmap 2017-08-19 12:53:18 -04:00
Peter Boyle
bcefdd7c4e Align both allocator calls to 2MB 2017-08-19 12:49:02 -04:00
Chulwoo Jung
0145685f96 Added Staggered Type Preconditioned operator 2017-08-18 01:44:31 -04:00
David Murphy
9d45fca8bc Implement MobiusEOFAFermioncache.cc 2017-08-17 23:45:36 -04:00
David Murphy
ac9e6b63c0 More re-import of Mobius EOFA 2017-08-17 19:28:53 -04:00
David Murphy
e140b3f802 Beginning to re-import Mobius EOFA 2017-08-16 23:36:23 -04:00
David Murphy
d9d3d30cc7 Minor clean-up 2017-08-16 20:57:51 -04:00
David Murphy
47a12ec7b5 Implement EOFA pseudofermion force and Shamir tests for G-parity and non G-parity cases 2017-08-16 19:50:08 -04:00
David Murphy
ec1e2f7a40 Add (mostly implemented) ExactOneFlavourRatio pseudofermion class and tests of Shamir heatbath and action 2017-08-16 12:38:59 -04:00
David Murphy
41f73ec083 Add ChronoForecast class for forecasting solutions across poles in the EOFA heatbath 2017-08-16 12:37:38 -04:00
Guido Cossu
fd367d8bfd Debugging the PointerCache 2017-08-16 09:42:57 +01:00
David Murphy
6d0786ff9d Typo fixes and check-in of G-parity action test for DWF 2017-08-15 22:47:00 -04:00
David Murphy
b7f93aeb4d Change CayleyFermion5D::SetCoefficientsInternal to virtual to allow overriding in derived EOFA classes 2017-08-15 14:18:51 -04:00
David Murphy
202a7fe900 Re-import DWF and abstract base EOFA fermion classes and tests 2017-08-15 13:36:08 -04:00
Chulwoo Jung
e73e4b4002 Minor changes and fixes 2017-08-11 01:35:25 -04:00
Guido Cossu
8a3fe60a27 Added more asserts at grid creation time 2017-08-08 11:36:20 +01:00
Guido Cossu
44051aecd1 Checking for integer divisions in cartesian full 2017-08-08 10:31:12 +01:00
Guido Cossu
06e6f8de00 Check that the reduced dim is an integer 2017-08-08 10:22:12 +01:00
Chulwoo Jung
caa6605b43 Still tweaking memory saving routines in Lanczos 2017-08-07 00:01:04 -04:00
Chulwoo Jung
522c9248ae Merge branch 'develop' of https://github.com/paboyle/Grid into feature/Lanczos 2017-08-06 23:58:21 -04:00
Guido Cossu
4fe182e5a7 Added high level HMC support for overriding default SIMD lane decomposition 2017-08-06 10:46:19 +01:00
Guido Cossu
175f393f9d Binary IO error checking 2017-08-04 12:14:10 +01:00
Christopher Kelly
7d867a8134 Merge branch 'develop' into feature/CG-reliable-update 2017-08-02 09:48:04 -04:00
Christopher Kelly
9939b267d2 Added switching to fallback linear operator in reliable update CG, and added recalculation of b parameter on update. 2017-07-31 13:39:44 -04:00
Peter Boyle
14d53e1c9e Threaded MPI calls patches 2017-07-29 13:08:10 -04:00
Chulwoo Jung
191fbf85fc Added ImplicitlyRestartedLanczosCJ to Algorithms.h 2017-07-28 15:33:59 -04:00
Guido Cossu
8bd869da37 Correcting a bug in the IO routines 2017-07-27 15:12:50 +01:00
Guido Cossu
c0485d799d Explicit parameter declaration in the WilsonGauge test 2017-07-26 16:26:04 +01:00
Guido Cossu
7abc5613bd Added smearing to the topological charge observable 2017-07-26 16:21:17 +01:00
Guido Cossu
a4b7dddb67 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-07-26 12:07:38 +01:00
Guido Cossu
5696781862 Debug error in Tensor mult 2017-07-26 12:07:34 +01:00
Christopher Kelly
9b6cde173f Merge branch 'feature/CG-reliable-update' into ckelly_develop 2017-07-25 11:51:08 -04:00
Christopher Kelly
9f280b82c4 Added mixed-precision CG with reliable updates 2017-07-25 11:30:41 -04:00
Chulwoo Jung
93650f3a61 Adding back (temporarily) dense matrix routines until Lanczos is finalized 2017-07-24 21:49:25 -04:00
Chulwoo Jung
cab4b4d063 Deleting old include file references 2017-07-24 20:51:31 -04:00
Chulwoo Jung
cf4b30b2dd re-adding ImplicitlyRestartedLanczos 2017-07-24 20:40:25 -04:00
Chulwoo Jung
c51d0b4078 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/Lanczos 2017-07-24 20:35:29 -04:00
Nils Meyer
7a53dc3715 Added integer reduce functionality 2017-07-24 11:12:59 +02:00
Christopher Kelly
0f214ad427 Moved FourierAcceleratedGaugeFixer into Grid::QCD namespace and removed 'using namespace' directives from header 2017-07-21 11:13:51 -04:00
Guido Cossu
9fa07eecde Merge branch 'develop' into feature/json-fix 2017-07-12 15:47:22 +01:00
azusayamaguchi
659d7d1a40 For test/solver
Fixed
2017-07-12 15:01:48 +01:00
Guido Cossu
f64fb7bd77 Fix gcc error on JSON compilation 2017-07-12 14:55:42 +01:00
Guido Cossu
2a35449b91 Merge branch 'develop' into feature/json-fix 2017-07-12 14:47:00 +01:00
Guido Cossu
184af5bd05 Added support for std::pair in the JSON serialiser 2017-07-12 14:44:53 +01:00
Guido Cossu
097c9637ee Fixed the JSON parsing error 2017-07-11 14:31:57 +01:00
azusayamaguchi
dc6f078246 fixed the header file for mpi3 2017-07-11 14:15:08 +01:00
Peter Boyle
40e119c61c NUMA improvements worth preserving from AMD EPYC tests 2017-07-08 22:27:11 -04:00
Guido Cossu
d9593c4b81 Merge branch 'develop' into feature/json-fix 2017-07-07 14:17:50 +01:00
paboyle
75dc7794b9 Working on Cori 2017-07-02 16:47:42 -07:00
paboyle
dee68fc728 IO working multiple nodes again. Strategy of all nodes writing metadata is unsafe.
Only one rank should do this. must identify this rank. Means pass communicator to the
Objects.
2017-07-02 23:33:48 +01:00
paboyle
57002924bc NERSC shakeout of this 2017-07-02 14:58:30 -07:00
Peter Boyle
a0be3f7330 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-06-30 10:53:50 +01:00
Peter Boyle
b5a6e4f1fd Best option for Xeon cache blocking set 2017-06-30 10:53:22 +01:00
Peter Boyle
7a788db3dc Guard first touch 2017-06-30 10:49:08 +01:00
Peter Boyle
f20eceb6cd First touch once per page in a threaded loop 2017-06-30 10:48:27 +01:00
Peter Boyle
38325ebbc6 Interleave code path; not enabled 2017-06-30 10:23:51 +01:00
Peter Boyle
ac1f1838bc KNL only 2017-06-30 10:15:32 +01:00
Guido Cossu
8859a151cc Small corrections to the NEON port 2017-06-29 11:30:29 +01:00
Guido Cossu
688a39cfd9 Merge pull request #114 from nmeyer-ur/feature/arm-neon
ARM neon intrinsics support
Guido: checked and approved
2017-06-29 09:57:17 +01:00
Nils Meyer
0933aeefd4 corrected Grid_neon.h 2017-06-28 20:22:22 +02:00
07de925127 minor scalar action fixes 2017-06-28 12:45:44 +01:00
Nils Meyer
a9c816a268 moved file to correct folder 2017-06-27 21:39:15 +02:00
Nils Meyer
bf729766dd removed collision with QPX implementation 2017-06-27 20:32:24 +02:00
0b707b861c Merge branch 'develop' into feature/scalar-hmc-update 2017-06-27 14:40:05 +01:00
15e87a4607 HDF5 IO fix 2017-06-27 14:39:27 +01:00
7d7220cbd7 scalar: lambda/4! convention 2017-06-27 14:38:45 +01:00
paboyle
54e94360ad Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit 2017-06-24 23:10:24 +01:00
0af740dc15 minor scalar HMC code improvement 2017-06-24 23:04:05 +01:00
d2e8372df3 SU(N) algebra fix (was not working) 2017-06-24 23:03:39 +01:00
paboyle
869b99ec1e Threaded calls to multiple communicators 2017-06-24 10:55:54 +01:00
paboyle
349d75e483 Precision fix 2017-06-23 02:57:59 -07:00
Lanny91
56abbdf4c2 AVX512 integer reduce fix (for non-intel compiler) 2017-06-23 11:09:14 +02:00
Lanny91
af71c63f4c AVX2 fix 2017-06-23 11:03:12 +02:00
paboyle
1feddf4ba6 const fixes 2017-06-22 19:32:41 +01:00
paboyle
e504260f3d Able to run a test job splitting into multiple MPI subdomains. 2017-06-22 18:53:11 +01:00
Lanny91
0440d4ce66 Merge branch 'develop' of https://github.com/paboyle/Grid into hotfix/bgq 2017-06-22 17:09:42 +02:00
paboyle
5e4bea8f20 Benchmark DWF works 2017-06-22 08:38:54 +01:00
paboyle
6ebf9f15b7 Splitting communicators first cut 2017-06-22 08:14:34 +01:00
paboyle
b9104f3072 Block CG 2017-06-21 21:08:03 +01:00
b22eab8c8b Merge commit 'a7d56523abee6c9030fdd9303c79954897b1086f' into feature/hadrons 2017-06-21 18:32:48 +01:00
paboyle
e8b95bd35b Clean up finished. Could shrink Lanczos to around 400 lines at a push 2017-06-21 02:50:09 +01:00
paboyle
7e35286860 Simplified lanczos, added Eigen diagonalisation.
Curious if we can deprecate dependency on BLAS.
Will see when we get 48^3 running on our BG/Q port
2017-06-21 02:26:03 +01:00
paboyle
0486ff8e79 Improved the Lanczos 2017-06-20 18:46:01 +01:00
1e8a2e1621 various compatibility fixes after merge 2017-06-20 17:24:55 +01:00
7587df831a Merge branch 'develop' into feature/hadrons
# Conflicts:
#	lib/qcd/action/scalar/ScalarImpl.h
2017-06-20 15:50:39 +01:00
Azusa Yamaguchi
e9cc21900f Block solver complete for staggered. Now stable on mass 0.003 and
gives 8x (!) speed up on Haswell laptop vs. standard CG for 8 RHS solves.

166 iterations vs. 537 iterations so algorithmic gain + 2x in flop rate gain.

Better than a slap in the face with a wet kipper.
2017-06-20 12:37:41 +01:00
Azusa Yamaguchi
0a8faac271 Fix make tests compile 2017-06-19 22:54:18 +01:00
Azusa Yamaguchi
abc4de0fd2 No compile make tests fix 2017-06-19 22:03:03 +01:00
284ee194b1 JSON update 2017-06-19 14:38:15 +01:00
Azusa Yamaguchi
cfe3cd76d1 Block solver improvements 2017-06-19 14:04:21 +01:00
Azusa Yamaguchi
3fa5e3109f Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-06-19 14:01:44 +01:00
paboyle
8b7049f737 Improved detection of usqcdInfo for plaq/linktr 2017-06-19 08:46:07 +01:00
paboyle
c85024683e Merge branch 'feature/parallelio' into develop 2017-06-19 01:39:48 +01:00
paboyle
1300b0b04b Update to enable multiple records per file more consistent with SciDAC.
open, close, write records...
2017-06-19 01:01:48 +01:00
paboyle
1d18d95d4f Class name return 2017-06-18 00:13:03 +01:00
paboyle
ae39ec85a3 ComplexField defined 2017-06-18 00:12:48 +01:00
paboyle
b96daf53a0 Query tensor structures 2017-06-18 00:12:15 +01:00
paboyle
46879e1658 Complex defined in Impl even for gauge. 2017-06-18 00:11:45 +01:00
paboyle
ae4de94798 SciDAC I/O support 2017-06-18 00:11:23 +01:00
paboyle
0ab555b4f5 SciDAC I/O and ILDG improvements 2017-06-18 00:11:02 +01:00
paboyle
8e9be9f84f Updates for SciDAC IO 2017-06-18 00:10:42 +01:00
paboyle
d572170170 Update for SciDAC 2017-06-18 00:10:20 +01:00
81b18f843a Merge branch 'feature/scalar_adjointFT' into feature/hadrons
# Conflicts:
#	lib/qcd/action/scalar/ScalarImpl.h
2017-06-16 17:59:55 +01:00
Lanny91
a833f88c32 Added missing SIMD integer reduction implementation for AVX, AVX-512, SSE4, IMCI 2017-06-16 15:58:47 +01:00
Lanny91
07b2c1b253 Placeholder precision change functions to allow Grid to compile with QPX (warning: no actual functionality) 2017-06-16 15:04:26 +01:00
Lanny91
735cbdb983 QPX Integer reduction (+ integer reduction test) 2017-06-14 10:55:10 +01:00
Lanny91
2ad54c5a02 QPX exchange support 2017-06-14 10:53:39 +01:00
Nils Meyer
3d04dc33c6 ARM neon intrinsics support 2017-06-13 13:26:59 +02:00
paboyle
91199a8ea0 openmpi is not const safe 2017-06-13 12:21:29 +01:00
paboyle
0494feec98 Libz dependency 2017-06-13 12:00:23 +01:00
paboyle
a16b1e134e gcc 4.9 fix 2017-06-13 10:48:43 +01:00
Chulwoo Jung
2f4cbeb4d5 Minor changes 2017-06-12 18:25:18 -04:00
paboyle
769ad578f5 Odd new error on G++ 49 on travis 2017-06-12 00:41:21 +01:00
paboyle
eaac0044b5 Compile fixes 2017-06-12 00:20:49 +01:00
paboyle
56042f002c New files 2017-06-11 23:19:20 +01:00
paboyle
3bfd1f13e6 I/O improvements 2017-06-11 23:14:10 +01:00
Azusa Yamaguchi
70ab598c96 Move gfix into utils 2017-06-08 22:22:23 +01:00
Azusa Yamaguchi
1d0ca65e28 Move Gfix into utils 2017-06-08 22:21:50 +01:00
Chulwoo Jung
fb7c4fb815 Recovering lapack interface without array allocation 2017-06-07 00:00:59 -04:00
Chulwoo Jung
00bb71e5af Checking in before reworking lapack interface 2017-06-06 16:26:41 -04:00
f6aa82b7f2 Merge branch 'develop' into feature/hadrons 2017-06-06 11:46:33 -05:00
Chulwoo Jung
cfed2c1ea0 Broken Lanczos. Going back to an older version temporarily. 2017-06-06 12:14:45 -04:00