paboyle
4b4c2a715b
fcntl.h needed
2017-08-26 11:38:04 +01:00
paboyle
54a5e6c1d0
Check if we get huge pages on Linux. Larry Meadows' piece of magic.
2017-08-25 22:36:08 +01:00
paboyle
80c5bce5bb
Merge branch 'develop' into feature/multi-communicator
2017-08-25 20:21:26 +01:00
paboyle
f68b5de9c8
No compile fix on Clang
2017-08-25 19:35:21 +01:00
Christopher Kelly
f365a83fae
In G-parity unrolled kernel, replaced permute and exchange calls that used a run-time-evaluated permute type with explicit calls to the appropriate underlying functions
2017-08-25 14:24:11 -04:00
Peter Boyle
c289699d9a
updated from cambridge mpi3 shakeout
2017-08-25 11:41:01 +01:00
Peter Boyle
c3b1263e75
Benchmark prep
2017-08-25 09:25:54 +01:00
Christopher Kelly
34a9aeb331
Reduced number of if-statement evaluations in G-parity unrolled kernel
2017-08-24 13:53:50 -07:00
21b02760c3
Merge branch 'develop' into feature/hadrons
2017-08-24 17:05:45 +01:00
paboyle
5fa386ddc9
FFT test compile fixed
2017-08-24 10:17:52 +01:00
Christopher Kelly
ce5df177ee
Removed superfluous implementation of G-parity twist for hand-unrolled kernel from GparityWilsonImpl
2017-08-23 15:05:22 -04:00
Christopher Kelly
a0bb8e5b46
Added hand-unrolled kernel implementations of all the other dslash precision / comms precision combinations with G-parity
2017-08-23 14:44:40 -04:00
Christopher Kelly
46f88e6d72
G-parity hand-unrolled intrinsics twist now uses one less permute and one less temporary
2017-08-23 13:21:10 -04:00
David Murphy
dd8f1ea189
Vectorized Mobius EOFA Dperp + shift operation
2017-08-23 13:17:26 -04:00
Christopher Kelly
b61835c1a5
Added inplace version of intrinsic G-parity twist to hand-unrolled kernel
2017-08-23 12:33:48 -04:00
Azusa Yamaguchi
d9cd4f0273
Staggered multinode block CG debugged. Missing global sum.
Code stalls and resumes on KNL at Cambridge. Curious.
CG iterations take 23 ms each, then pause for 3200 ms. Mean bandwidth is reported
as 200 MB/s. Comms are dominant in the report. However, the time behaviour suggests it
is *bursty*. Could be swap to disk?
2017-08-23 15:07:18 +01:00
David Murphy
459f70e8d4
Check-in of working Mobius EOFA class and tests
2017-08-22 22:38:30 -04:00
Christopher Kelly
061e48fd73
Replaced slow unpack-repack in G-parity BC twist with intrinsics version
2017-08-22 18:12:12 -04:00
Christopher Kelly
ab50145001
Implemented first, unoptimized version of hand-unrolled G-parity kernels
Improved Test_gparity
2017-08-22 17:12:25 -04:00
paboyle
b49bec0cec
MAP_HUGETLB portability fix
2017-08-20 03:08:54 +01:00
paboyle
1cdf999668
Moving multicommunicator into mpi3 also for threading
2017-08-20 02:39:10 +01:00
paboyle
11062fb686
Comms none fail fix
2017-08-20 01:37:07 +01:00
paboyle
a446d95c33
Trying to pass TeamCity and Travis
2017-08-20 01:10:50 +01:00
paboyle
be66e7dd95
Merge branch 'develop' into feature/multi-communicator
2017-08-19 23:12:38 +01:00
Peter Boyle
0b0cf62193
Fix mpi 3 interface change
2017-08-19 13:18:50 -04:00
Peter Boyle
7d88198387
Merge branch 'develop' into feature/multi-communicator
2017-08-19 13:03:35 -04:00
Peter Boyle
2f619482b8
Enable blocking stencil send
2017-08-19 12:53:59 -04:00
Peter Boyle
d6472eda8d
Use mmap
2017-08-19 12:53:18 -04:00
Peter Boyle
bcefdd7c4e
Align both allocator calls to 2MB
2017-08-19 12:49:02 -04:00
David Murphy
9d45fca8bc
Implement MobiusEOFAFermioncache.cc
2017-08-17 23:45:36 -04:00
David Murphy
ac9e6b63c0
More re-import of Mobius EOFA
2017-08-17 19:28:53 -04:00
David Murphy
e140b3f802
Beginning to re-import Mobius EOFA
2017-08-16 23:36:23 -04:00
David Murphy
d9d3d30cc7
Minor clean-up
2017-08-16 20:57:51 -04:00
David Murphy
47a12ec7b5
Implement EOFA pseudofermion force and Shamir tests for G-parity and non G-parity cases
2017-08-16 19:50:08 -04:00
David Murphy
ec1e2f7a40
Add (mostly implemented) ExactOneFlavourRatio pseudofermion class and tests of Shamir heatbath and action
2017-08-16 12:38:59 -04:00
David Murphy
41f73ec083
Add ChronoForecast class for forecasting solutions across poles in the EOFA heatbath
2017-08-16 12:37:38 -04:00
Guido Cossu
fd367d8bfd
Debugging the PointerCache
2017-08-16 09:42:57 +01:00
David Murphy
6d0786ff9d
Typo fixes and check-in of G-parity action test for DWF
2017-08-15 22:47:00 -04:00
David Murphy
b7f93aeb4d
Change CayleyFermion5D::SetCoefficientsInternal to virtual to allow overriding in derived EOFA classes
2017-08-15 14:18:51 -04:00
David Murphy
202a7fe900
Re-import DWF and abstract base EOFA fermion classes and tests
2017-08-15 13:36:08 -04:00
Guido Cossu
8d168ded4a
Correction of the dagger version of the Clover
2017-08-15 10:50:44 +01:00
Guido Cossu
8a3fe60a27
Added more asserts at grid creation time
2017-08-08 11:36:20 +01:00
Guido Cossu
44051aecd1
Checking for integer divisions in cartesian full
2017-08-08 10:31:12 +01:00
Guido Cossu
06e6f8de00
Check that the reduced dim is an integer
2017-08-08 10:22:12 +01:00
Guido Cossu
4fe182e5a7
Added high level HMC support for overriding default SIMD lane decomposition
2017-08-06 10:46:19 +01:00
Guido Cossu
75ee6cfc86
Debugging the Clover term
2017-08-04 16:08:07 +01:00
Guido Cossu
fde71c3c52
Merge branch 'develop' into feature/clover
2017-08-04 12:19:57 +01:00
Guido Cossu
175f393f9d
Binary IO error checking
2017-08-04 12:14:10 +01:00
Christopher Kelly
7d867a8134
Merge branch 'develop' into feature/CG-reliable-update
2017-08-02 09:48:04 -04:00
Christopher Kelly
9939b267d2
Added switching to fallback linear operator in reliable update CG, and added recalculation of b parameter on update.
2017-07-31 13:39:44 -04:00
Lanny91
67b34e5789
Modified conserved current 5th dimension loop for compatibility with 5D vectorisation.
2017-07-31 11:35:01 +01:00
Peter Boyle
14d53e1c9e
Threaded MPI calls patches
2017-07-29 13:08:10 -04:00
Guido Cossu
8bd869da37
Correcting a bug in the IO routines
2017-07-27 15:12:50 +01:00
Guido Cossu
c0485d799d
Explicit parameter declaration in the WilsonGauge test
2017-07-26 16:26:04 +01:00
Guido Cossu
7abc5613bd
Added smearing to the topological charge observable
2017-07-26 16:21:17 +01:00
Guido Cossu
a4b7dddb67
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-07-26 12:07:38 +01:00
Guido Cossu
5696781862
Debug error in Tensor mult
2017-07-26 12:07:34 +01:00
Christopher Kelly
9b6cde173f
Merge branch 'feature/CG-reliable-update' into ckelly_develop
2017-07-25 11:51:08 -04:00
Christopher Kelly
9f280b82c4
Added mixed-precision CG with reliable updates
2017-07-25 11:30:41 -04:00
Nils Meyer
7a53dc3715
Added integer reduce functionality
2017-07-24 11:12:59 +02:00
Christopher Kelly
0f214ad427
Moved FourierAcceleratedGaugeFixer into Grid::QCD namespace and removed 'using namespace' directives from header
2017-07-21 11:13:51 -04:00
Guido Cossu
9fa07eecde
Merge branch 'develop' into feature/json-fix
2017-07-12 15:47:22 +01:00
azusayamaguchi
659d7d1a40
For test/solver
Fixed
2017-07-12 15:01:48 +01:00
Guido Cossu
f64fb7bd77
Fix gcc error on JSON compilation
2017-07-12 14:55:42 +01:00
Guido Cossu
2a35449b91
Merge branch 'develop' into feature/json-fix
2017-07-12 14:47:00 +01:00
Guido Cossu
184af5bd05
Added support for std::pair in the JSON serialiser
2017-07-12 14:44:53 +01:00
Guido Cossu
097c9637ee
Fixed the JSON parsing error
2017-07-11 14:31:57 +01:00
azusayamaguchi
dc6f078246
fixed the header file for mpi3
2017-07-11 14:15:08 +01:00
Peter Boyle
40e119c61c
NUMA improvements worth preserving from AMD EPYC tests
2017-07-08 22:27:11 -04:00
Guido Cossu
d9593c4b81
Merge branch 'develop' into feature/json-fix
2017-07-07 14:17:50 +01:00
paboyle
75dc7794b9
Working on Cori
2017-07-02 16:47:42 -07:00
paboyle
dee68fc728
IO working on multiple nodes again. Strategy of all nodes writing metadata is unsafe.
Only one rank should do this; we must identify this rank. This means passing the communicator to the
objects.
2017-07-02 23:33:48 +01:00
paboyle
57002924bc
NERSC shakeout of this
2017-07-02 14:58:30 -07:00
Peter Boyle
a0be3f7330
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-06-30 10:53:50 +01:00
Peter Boyle
b5a6e4f1fd
Best option for Xeon cache blocking set
2017-06-30 10:53:22 +01:00
Peter Boyle
7a788db3dc
Guard first touch
2017-06-30 10:49:08 +01:00
Peter Boyle
f20eceb6cd
First touch once per page in a threaded loop
2017-06-30 10:48:27 +01:00
Peter Boyle
38325ebbc6
Interleave code path; not enabled
2017-06-30 10:23:51 +01:00
Peter Boyle
ac1f1838bc
KNL only
2017-06-30 10:15:32 +01:00
Guido Cossu
8859a151cc
Small corrections to the NEON port
2017-06-29 11:30:29 +01:00
Guido Cossu
688a39cfd9
Merge pull request #114 from nmeyer-ur/feature/arm-neon
ARM neon intrinsics support
Guido: checked and approved
2017-06-29 09:57:17 +01:00
Nils Meyer
0933aeefd4
corrected Grid_neon.h
2017-06-28 20:22:22 +02:00
07de925127
minor scalar action fixes
2017-06-28 12:45:44 +01:00
Nils Meyer
a9c816a268
moved file to correct folder
2017-06-27 21:39:15 +02:00
Nils Meyer
bf729766dd
removed collision with QPX implementation
2017-06-27 20:32:24 +02:00
0b707b861c
Merge branch 'develop' into feature/scalar-hmc-update
2017-06-27 14:40:05 +01:00
15e87a4607
HDF5 IO fix
2017-06-27 14:39:27 +01:00
7d7220cbd7
scalar: lambda/4! convention
2017-06-27 14:38:45 +01:00
Lanny91
7d2d5e8d3d
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/hadrons
2017-06-26 15:19:46 +01:00
paboyle
54e94360ad
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
2017-06-24 23:10:24 +01:00
0af740dc15
minor scalar HMC code improvement
2017-06-24 23:04:05 +01:00
d2e8372df3
SU(N) algebra fix (was not working)
2017-06-24 23:03:39 +01:00
paboyle
869b99ec1e
Threaded calls to multiple communicators
2017-06-24 10:55:54 +01:00
paboyle
349d75e483
Precision fix
2017-06-23 02:57:59 -07:00
Lanny91
56abbdf4c2
AVX512 integer reduce fix (for non-intel compiler)
2017-06-23 11:09:14 +02:00
Lanny91
af71c63f4c
AVX2 fix
2017-06-23 11:03:12 +02:00
paboyle
1feddf4ba6
const fixes
2017-06-22 19:32:41 +01:00
paboyle
e504260f3d
Able to run a test job splitting into multiple MPI subdomains.
2017-06-22 18:53:11 +01:00
Lanny91
0440d4ce66
Merge branch 'develop' of https://github.com/paboyle/Grid into hotfix/bgq
2017-06-22 17:09:42 +02:00
Lanny91
c11d69787e
Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/rare_kaon
# Conflicts:
# extras/Hadrons/Modules.hpp
# extras/Hadrons/Modules/MFermion/GaugeProp.hpp
# extras/Hadrons/modules.inc
# tests/hadrons/Test_hadrons.hpp
# tests/hadrons/Test_hadrons_meson_3pt.cc
2017-06-22 16:26:31 +02:00
paboyle
5e4bea8f20
Benchmark DWF works
2017-06-22 08:38:54 +01:00
paboyle
6ebf9f15b7
Splitting communicators first cut
2017-06-22 08:14:34 +01:00
paboyle
b9104f3072
Block CG
2017-06-21 21:08:03 +01:00
b22eab8c8b
Merge commit 'a7d56523abee6c9030fdd9303c79954897b1086f' into feature/hadrons
2017-06-21 18:32:48 +01:00
paboyle
e8b95bd35b
Clean up finished. Could shrink Lanczos to around 400 lines at a push
2017-06-21 02:50:09 +01:00
paboyle
7e35286860
Simplified Lanczos, added Eigen diagonalisation.
Curious if we can deprecate dependency on BLAS.
Will see when we get 48^3 running on our BG/Q port
2017-06-21 02:26:03 +01:00
paboyle
0486ff8e79
Improved the Lanczos
2017-06-20 18:46:01 +01:00
1e8a2e1621
various compatibility fixes after merge
2017-06-20 17:24:55 +01:00
7587df831a
Merge branch 'develop' into feature/hadrons
# Conflicts:
# lib/qcd/action/scalar/ScalarImpl.h
2017-06-20 15:50:39 +01:00
Azusa Yamaguchi
e9cc21900f
Block solver complete for staggered. Now stable on mass 0.003 and
gives 8x (!) speed up on Haswell laptop vs. standard CG for 8 RHS solves.
166 iterations vs. 537 iterations so algorithmic gain + 2x in flop rate gain.
Better than a slap in the face with a wet kipper.
2017-06-20 12:37:41 +01:00
Azusa Yamaguchi
0a8faac271
Fix make tests compile
2017-06-19 22:54:18 +01:00
Azusa Yamaguchi
abc4de0fd2
No compile make tests fix
2017-06-19 22:03:03 +01:00
284ee194b1
JSON update
2017-06-19 14:38:15 +01:00
Azusa Yamaguchi
cfe3cd76d1
Block solver improvements
2017-06-19 14:04:21 +01:00
Azusa Yamaguchi
3fa5e3109f
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-06-19 14:01:44 +01:00
paboyle
8b7049f737
Improved detection of usqcdInfo for plaq/linktr
2017-06-19 08:46:07 +01:00
paboyle
c85024683e
Merge branch 'feature/parallelio' into develop
2017-06-19 01:39:48 +01:00
paboyle
1300b0b04b
Update to enable multiple records per file more consistent with SciDAC.
open, close, write records...
2017-06-19 01:01:48 +01:00
paboyle
1d18d95d4f
Class name return
2017-06-18 00:13:03 +01:00
paboyle
ae39ec85a3
ComplexField defined
2017-06-18 00:12:48 +01:00
paboyle
b96daf53a0
Query tensor structures
2017-06-18 00:12:15 +01:00
paboyle
46879e1658
Complex defined in Impl even for gauge.
2017-06-18 00:11:45 +01:00
paboyle
ae4de94798
SciDAC I/O support
2017-06-18 00:11:23 +01:00
paboyle
0ab555b4f5
SciDAC I/O and ILDG improvements
2017-06-18 00:11:02 +01:00
paboyle
8e9be9f84f
Updates for SciDAC IO
2017-06-18 00:10:42 +01:00
paboyle
d572170170
Update for SciDAC
2017-06-18 00:10:20 +01:00
81b18f843a
Merge branch 'feature/scalar_adjointFT' into feature/hadrons
# Conflicts:
# lib/qcd/action/scalar/ScalarImpl.h
2017-06-16 17:59:55 +01:00
Lanny91
1bd311ba9c
Faster sequential conserved current implementation, now compatible with 5D vectorisation & G-parity.
2017-06-16 16:43:15 +01:00
Lanny91
41af8c12d7
Code cleaning for conserved current contractions. Will now be easier to implement mobius conserved current.
2017-06-16 16:38:59 +01:00
Lanny91
a833f88c32
Added missing SIMD integer reduction implementation for AVX, AVX-512, SSE4, IMCI
2017-06-16 15:58:47 +01:00
Lanny91
07b2c1b253
Placeholder precision change functions to allow Grid to compile with QPX (warning: no actual functionality)
2017-06-16 15:04:26 +01:00
Lanny91
735cbdb983
QPX Integer reduction (+ integer reduction test)
2017-06-14 10:55:10 +01:00
Lanny91
2ad54c5a02
QPX exchange support
2017-06-14 10:53:39 +01:00
Nils Meyer
3d04dc33c6
ARM neon intrinsics support
2017-06-13 13:26:59 +02:00
paboyle
91199a8ea0
openmpi is not const safe
2017-06-13 12:21:29 +01:00
paboyle
0494feec98
Libz dependency
2017-06-13 12:00:23 +01:00
paboyle
a16b1e134e
gcc 4.9 fix
2017-06-13 10:48:43 +01:00
Lanny91
5633a2db20
Faster implementation of conserved current site contraction. Added 5D vectorised support, but not G-parity.
2017-06-12 10:41:02 +01:00
paboyle
769ad578f5
Odd new error on G++ 49 on travis
2017-06-12 00:41:21 +01:00
paboyle
eaac0044b5
Compile fixes
2017-06-12 00:20:49 +01:00
paboyle
56042f002c
New files
2017-06-11 23:19:20 +01:00
paboyle
3bfd1f13e6
I/O improvements
2017-06-11 23:14:10 +01:00
Azusa Yamaguchi
70ab598c96
Move gfix into utils
2017-06-08 22:22:23 +01:00
Azusa Yamaguchi
1d0ca65e28
Move Gfix into utils
2017-06-08 22:21:50 +01:00
Lanny91
b35fc4e7f9
Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/rare_kaon
# Conflicts:
# extras/Hadrons/Global.hpp
# tests/hadrons/Test_hadrons_rarekaon.cc
2017-06-07 14:38:51 +01:00
f6aa82b7f2
Merge branch 'develop' into feature/hadrons
2017-06-06 11:46:33 -05:00
Lanny91
8d442b502d
Sequential current fix for spatial indices.
2017-06-06 17:06:40 +01:00
0503c028be
Merge branch 'feature/qed-fvol' into feature/hadrons (non-trivial conflicts on scalar Impl)
# Conflicts:
# configure.ac
# lib/qcd/action/scalar/Scalar.h
2017-06-05 16:37:47 -05:00
Lanny91
622a21bec6
Improvements to sequential conserved current test and small bugfix.
2017-06-05 15:55:32 +01:00
Lanny91
eec79e0a1e
Ward Identity test improvements and conserved current bug fixes
2017-06-05 11:55:41 +01:00
paboyle
092dcd4e04
MPI I/O only if MPI compiled
2017-06-02 22:50:25 +01:00
Guido Cossu
7da4856e8e
Wilson flow with adaptive steps
2017-06-02 16:55:53 +01:00
Guido Cossu
aaf1e33a77
Adding adaptive integration in the WilsonFlow
2017-06-02 16:32:35 +01:00
paboyle
094c3d091a
Improved, and RNGs now survive checkpoint
2017-06-02 00:38:58 +01:00
Peter Boyle
1a1f6d55f9
Roll over to MPI IO for parallel IO
2017-06-01 17:37:26 -04:00
Peter Boyle
21421656ab
Big changes improving the code to use MPI IO
2017-06-01 17:36:53 -04:00
Peter Boyle
6f687a67cd
As local vols increase, use 64 bits for safety
2017-06-01 17:36:18 -04:00
paboyle
1e429a0d57
Added MPI version
2017-05-30 23:41:07 +01:00
paboyle
d38a4de36c
Beginning move to MPI IO
2017-05-30 23:40:39 +01:00
paboyle
53a9aeb965
Cosmetic only
2017-05-30 23:39:53 +01:00
paboyle
e30fa9f4b8
RankCount; need to clean up ambiguous ProcessCount
2017-05-30 23:39:16 +01:00
paboyle
58e8d0a10d
reverse direction lexico mapping
2017-05-30 23:38:30 +01:00
paboyle
62cf9cf638
Cleaner code
2017-05-30 23:38:02 +01:00
Guido Cossu
7c6cc85df6
Updating WilsonFlow test
2017-05-27 18:03:49 +01:00
Lanny91
23135aa58a
Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/rare_kaon
2017-05-26 16:00:50 +01:00
Guido Cossu
0de314870d
Faster derivative for WilsonGauge
2017-05-26 14:31:49 +01:00
Guido Cossu
f4e8bf2858
Fixing the topological charge. Wilson Flow tested, ok
2017-05-26 12:45:59 +01:00
paboyle
b8b5934193
Attempts to speed up the parallel IO
2017-05-25 13:32:24 +01:00
Guido Cossu
75856f2945
Compilation fix in the Tensor_exp
2017-05-25 12:44:56 +01:00
Guido Cossu
3c112a7a25
Small correction to the general exp definition
2017-05-25 12:09:00 +01:00
Guido Cossu
ab3596d4d3
Using Cayley-Hamilton form for the exponential of SU(3) matrices
2017-05-25 12:07:47 +01:00
paboyle
a8c10b1933
Use a global-X x Local-Y chunksize for parallel binary I/O.
Gives O(32 x 8 x 18*8*8) chunk size on configuration I/O.
At 150KB should be getting close to packet sizes and 4MB filesystem
block sizes that are reasonably (!?) performant. We shall see once I move
this off my laptop and over to BNL and time it.
2017-05-25 11:43:33 +01:00
Guido Cossu
15e801af3f
Fixing a compilation error for generic SIMD
2017-05-19 16:39:36 +01:00
Guido Cossu
a8fb2835ca
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-05-18 14:45:00 +01:00
22f4feee7b
Merge branch 'develop' into feature/scalar_adjointFT
2017-05-17 13:27:13 +02:00
paboyle
3267683e22
Union workaround for g++
2017-05-17 11:26:18 +01:00
Azusa Yamaguchi
f46a67ffb3
No compile issue on clang on mac fixed.
Compiler version was clang++-3.9 under mpicxx
2017-05-17 10:51:01 +01:00
Guido Cossu
10f2872aae
Faster exponentiation for lattice fields
2017-05-15 15:51:16 +01:00
35fa3d1dfd
Merge branch 'master' into feature/scalar_adjointFT
2017-05-12 10:41:39 +01:00
paboyle
49a5d9bac7
Clang major, minor trailing underscore
2017-05-11 12:25:02 +01:00
paboyle
8a43e88b4f
Compiler check early in build
2017-05-11 11:43:06 +01:00
paboyle
238df20370
Still working on the compiler compat checks
2017-05-11 11:30:14 +01:00
paboyle
655492a443
Compiler detection
2017-05-11 11:21:11 +01:00
paboyle
1cab06f6bd
Compat checks for compilers
2017-05-11 10:20:24 +01:00
43c817cc67
Scalar action: const fix
2017-05-11 00:07:17 +01:00
Guido Cossu
9c12c37aaf
Confirming the fix on the complex boundary conditions
2017-05-09 08:41:29 +01:00
Guido Cossu
01d0e54594
Merge branch 'release/v0.7.0' into develop
2017-05-08 22:02:51 +01:00
Guido Cossu
5aafa335fe
Fixing JSON error for complex numbers
2017-05-08 21:56:44 +01:00
Guido Cossu
8ba0494485
Fixing JSON for complex numbers
2017-05-08 21:41:39 +01:00
paboyle
529e78d43f
Restart the v0.7.0 release
2017-05-08 18:20:04 +01:00
paboyle
93f6c15772
Warning squash
2017-05-06 16:38:58 +01:00
paboyle
c7cc7e6101
Fix
2017-05-06 16:10:09 +01:00
paboyle
3bae0a2d5c
Drop a gcc warning
2017-05-06 15:51:42 +01:00
paboyle
c1c7566089
GCC bug work around in 5.0 through 6.2 inclusive.
2017-05-06 15:20:25 +01:00
paboyle
2439999ec8
Warning elimination; drop to -O2 on G++ bad versions
2017-05-06 14:44:49 +01:00
paboyle
1d96f662e3
Fixed 4d fermion gparity force. Put strong tests on make check force tests
2017-05-06 00:46:31 +01:00
Guido Cossu
741bc836f6
Exposing support for Ncolours and Ndimensions and JSON input file for the ScalarAction
2017-05-05 17:36:43 +01:00
paboyle
697c0603ce
SITMO I/O for NERSC working now, bit reproducible
2017-05-05 16:54:44 +01:00
paboyle
14bedebb11
Source pointed to
2017-05-05 16:17:27 +01:00
Guido Cossu
8546d01a4c
Merge branch 'develop' into feature/scalar_adjointFT
2017-05-05 15:47:33 +01:00
Guido Cossu
20999c1370
Merge branch 'develop' into feature/hmc_generalise
2017-05-05 12:47:17 +01:00
Lanny91
77e0af9c2e
Compilation fix after merge - conserved current code not yet operational for vectorised 5D or Gparity Impl.
2017-05-05 12:27:50 +01:00
paboyle
43924007db
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-05-04 19:53:41 +01:00
paboyle
78ef10e60f
Mobius force improvement
2017-05-04 19:53:21 +01:00
Lanny91
ca1077c560
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/rare_kaon
# Conflicts:
# lib/qcd/action/fermion/WilsonFermion5D.cc
# tests/hadrons/Test_hadrons_rarekaon.cc
2017-05-04 16:22:33 +01:00
679ae98b14
Merge branch 'feature/better-external-library' into develop
2017-05-04 15:42:12 +01:00
paboyle
90f6bc16bb
No compile clang fix
2017-05-04 12:15:06 +01:00
Peter Boyle
9b5b639546
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-05-03 20:51:40 -04:00
Peter Boyle
422cdf4979
Some checks
2017-05-03 18:37:38 -04:00
Peter Boyle
38db174f3b
Print statement
2017-05-03 18:25:26 -04:00
ea9aef7baa
New header for standard headers (was an issue with Remez.h and external compilation)
2017-05-02 18:26:11 +01:00
c9e9e8061d
Merge branch 'feature/hadrons' into develop
2017-05-02 18:23:47 +01:00
Guido Cossu
453cf2a1c6
Moving the topological charge outside the HMC related routines
2017-05-02 14:40:12 +01:00
Guido Cossu
de7bbfa5f9
Adding ParameterFile option for the HMC
2017-05-02 12:16:16 +01:00
Guido Cossu
74f451715f
Fix for Mac compilation on the size_t uint64_t types
2017-05-01 15:12:07 +01:00
Guido Cossu
4063238943
Adding HMC test file example for Mobius + smearing
2017-05-01 13:44:00 +01:00
Guido Cossu
3344788fa1
Merge branch 'develop' into feature/hmc_generalise
2017-05-01 12:13:56 +01:00
Guido Cossu
62a64d9108
EO support, wip
2017-05-01 11:06:21 +01:00
Lanny91
51d84ec057
Bugfixes in Wilson 5D sequential conserved current insertion
2017-04-28 16:49:14 +01:00
Guido Cossu
99a73f4287
Correcting the M and Mdag in the clover term
2017-04-28 15:51:05 +01:00
Guido Cossu
5553b8d2b8
Clover term compiles, not tested
2017-04-28 15:23:34 +01:00
Peter Boyle
99220f6531
Fixes and better timing
2017-04-26 17:24:11 -04:00
Lanny91
d2003f24f4
Corrected incorrect usage of ExtractSlice for conserved current code.
2017-04-26 17:25:28 +01:00
Peter Boyle
f8797e1e3e
Bug fix. Works now, with great face performance
2017-04-26 03:14:02 -04:00
Peter Boyle
fd1eb7de13
Clean implementation of the exterior faces listing only those points on the boundary
2017-04-26 02:34:52 -04:00
Peter Boyle
2ce898efa3
Pretty code
2017-04-26 02:34:25 -04:00
Lanny91
44260643f6
First conserved current implementation for Wilson fermions only. Not implemented for Gparity or 5D-vectorised Wilson fermions.
2017-04-25 18:00:24 +01:00
paboyle
ab66bac4e6
Think I'm getting on top of the reduced cost exterior precomputed list of links
2017-04-25 08:50:26 +01:00
paboyle
56277a11c8
Build a list of what's on the surface
2017-04-24 17:06:15 +01:00
Guido Cossu
752048f410
Merge branch 'develop' into feature/clover
2017-04-24 14:41:20 +01:00
Peter Boyle
5b55867a7a
Slightly cheaper Ext assembly
2017-04-24 05:36:11 -04:00
Peter Boyle
3accb1ef89
Debugged assembly split phase with interior suppression
2017-04-23 19:30:19 -04:00
Peter Boyle
e3d0e31525
Debugged assembly split phase with interior suppression
2017-04-23 19:29:27 -04:00
Peter Boyle
5812eb8a8c
Partially fixed. But the comms-overlap does not work yet.
2017-04-22 18:50:25 -04:00
paboyle
ac58565d0a
Dangerous rewrite of the assembly. If I make a mistake the debug will be painful.
2017-04-22 19:31:04 +01:00
paboyle
3703b718aa
Mark up a table if a given site only receives from itself; including MPI3 splitting info.
2017-04-22 19:28:37 +01:00
paboyle
b722889234
Try a better load balancing loop
2017-04-22 19:27:41 +01:00
paboyle
abba44a837
Hand unrolled for overlapped comms
2017-04-22 17:45:17 +01:00
paboyle
f301be94ce
Fixed
2017-04-22 17:42:31 +01:00
Peter Boyle
1d1b225497
Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node).
2017-04-22 09:05:28 -04:00
Peter Boyle
53a785a3dd
Fixing the KNL compile
2017-04-22 08:11:51 -04:00
paboyle
736bf3c866
Major rework of stencil. Half precision and MPI3 now working.
2017-04-22 11:33:50 +01:00
paboyle
b9bbe5d188
L1p config bg/q
2017-04-22 11:33:09 +01:00
paboyle
3844bcf800
If no f16c instructions supported must use software half precision conversion.
This will also become useful on BG/Q, so will move out from SSE4 into a general area.
Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed
against the intrinsics implementation yet.
2017-04-20 15:30:52 +01:00
paboyle
e1a2319d01
Simple compressor moved out of cshift into stencil
2017-04-20 13:18:15 +01:00
paboyle
180c732b4c
Move compressors out of Cshift.
Slice iterators would help
2017-04-20 13:17:55 +01:00
paboyle
d2312e9874
Drop compressor entirely from Cshift to only Stencil.
2017-04-20 13:16:55 +01:00
paboyle
fc4ab9ccd5
Working half precision comms
2017-04-20 11:20:26 +01:00
paboyle
4a340aa5ca
Massive compressor rework to support reduced precision comms
2017-04-20 09:28:27 +01:00
paboyle
3b7de792d5
Type comparison in the traits work
2017-04-18 13:28:04 +01:00
paboyle
557c3fa109
Pretty change
2017-04-18 13:27:38 +01:00
paboyle
8e161152e4
MultiRHS solver improvements with slice operations moved into lattice and sped up.
Block solver requires a lot of performance work.
2017-04-18 10:51:55 +01:00
paboyle
3141ebac10
MultiRHS working, starting to optimise. Block doesn't work, and I thought it already did; puzzled.
2017-04-17 10:50:19 +01:00
paboyle
7ede696126
Non compile of tests fixed
2017-04-16 23:40:00 +01:00
paboyle
bf516c3b81
higher precision reduction variables in norm and inner product
2017-04-15 12:27:28 +01:00
paboyle
441a52ee5d
First cut at higher precision reduction
2017-04-15 10:57:21 +01:00
paboyle
a8db024c92
Cleaning up the dense matrix and lanczos sector
2017-04-15 08:54:11 +01:00
paboyle
3ca41458a3
Fix to no USE_FP16 case
2017-04-14 14:20:54 +01:00
Guido Cossu
b694996302
adding comments
2017-04-14 13:30:14 +01:00
Peter Boyle
951be75292
Half precision conversion working on AVX512 now too
2017-04-13 17:35:11 +01:00
Peter Boyle
b9113ed310
Patches for knl
2017-04-13 12:02:12 -04:00
a6a0da873f
Merge branch 'feature/hadrons' into feature/qed-fvol
2017-04-13 15:31:06 +01:00
paboyle
42fb49d3fd
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-04-13 14:12:47 +01:00
paboyle
db5ea001a3
Update to use Xcode 8.3 since -mfp16 causes SIGILL
2017-04-13 12:22:40 +01:00
paboyle
1d502e4ed6
FP16 optional compile time
2017-04-13 11:55:24 +01:00
paboyle
73cdf0fffe
Drop f16c from SSE because of a macos compile error on travis
2017-04-13 11:23:41 +01:00
paboyle
1c25773319
Trap illegal instructions
2017-04-13 10:51:40 +01:00
paboyle
94eb829d08
Align cast fixed for __m128i; gcc complained
2017-04-13 08:40:44 +01:00
paboyle
68392ddb5b
Exchange in generic
Precision change in AVX, SSE, AVX512, Generic. QPX still to do.
2017-04-13 08:38:12 +01:00
paboyle
cb6b81ae82
Half precision conversion
2017-04-12 19:32:37 +01:00
53e76b41d2
Merge branch 'develop' into feature/hadrons
2017-04-10 17:00:53 +01:00
8ef4300412
spurious .dirstamp files removed
2017-04-10 17:00:22 +01:00
98a24ebf31
The macro “magic” is very intensive for the preprocessor in the measurement code, which has numerous serialisable classes. Reducing the number of serialisable fields to 64 (instead of 1024) helps a lot; this is enough for now and can be extended trivially if needed in the future.
2017-04-10 16:58:54 +01:00
paboyle
b12dc89d26
Commenting and clean up
2017-04-10 20:38:20 +09:00
paboyle
d80d802f9d
MultiRHS solver test
2017-04-10 00:12:12 +09:00
paboyle
3d99b09dba
Start of blockCG
2017-04-09 23:42:10 +09:00
paboyle
db5f6d3ae3
Verbose fix
2017-04-09 23:41:30 +09:00
paboyle
683550f116
Const args improvement
2017-04-09 23:41:04 +09:00
paboyle
86aaa35294
Christoph needs SchurDiagTwoKappa which is mobius specific.
2017-04-07 11:07:40 +09:00
Guido Cossu
3b8a791e28
Merge branch 'develop' into feature/clover
2017-04-05 16:20:28 +01:00
Guido Cossu
7b03d8d087
Fixing the remaining merge conflicts
2017-04-05 16:17:46 +01:00
Guido Cossu
4b759b8f2a
Merge branch 'feature/hmc_generalise' into feature/scalar_adjointFT
2017-04-05 14:50:28 +01:00
Guido Cossu
8c540333d5
Merge branch 'develop' into feature/hmc_generalise
2017-04-05 14:41:04 +01:00
Guido Cossu
6fd82228bf
Working on the derivative
2017-04-05 10:51:44 +01:00
paboyle
5592f7b8c1
Creation mode better implementation
2017-04-05 02:35:34 +09:00
paboyle
35da4ece0b
UID fix
2017-04-05 02:18:15 +09:00
Guido Cossu
ca6efc685e
Merge branch 'develop' into feature/clover
2017-04-04 10:19:02 +01:00
ff4e54ef80
Merge branch 'develop' into feature/hadrons
2017-04-03 18:56:21 +01:00
paboyle
83f6fab8fa
Big/Small crush test, and fast SITMO rng init, faster but not ideal
MT and Ranlux init.
2017-04-02 12:10:51 +09:00
paboyle
9dc7ca4c3b
Sitmo fast init
2017-04-02 00:28:22 +09:00
paboyle
935d82f5b1
sanity checks
2017-04-02 00:27:28 +09:00
paboyle
9cbcdd65d7
No random device seed
2017-04-02 00:26:57 +09:00
paboyle
7e5faa0f34
Multiple RNGs
2017-04-02 00:25:44 +09:00
paboyle
1c4bc7ed38
Debugged staggered conventions
2017-03-31 14:41:48 +09:00
Guido Cossu
b8ae787b5e
Correcting a simple typo
2017-03-30 11:33:15 +01:00
Guido Cossu
fbe2c3b5f9
Merge branch 'develop' into feature/clover
2017-03-30 11:18:31 +01:00
Guido Cossu
1ed69816b9
First steps for the force term
2017-03-30 11:14:27 +01:00
paboyle
93ea5d9468
Pretty code
2017-03-30 15:00:03 +09:00
paboyle
9fd23faadf
Pretty layout
2017-03-30 13:44:45 +09:00
paboyle
10e4fa0dc8
Template instantiation improvements
2017-03-30 13:44:25 +09:00