paboyle
a8c10b1933
Use a global-X x Local-Y chunksize for parallel binary I/O.
...
Gives O(32 x 8 x 18*8*8) chunk size on configuration I/O.
At 150KB should be getting close to packet sizes and 4MB filesystem
block sizes that are reasonably (!?) performant. We shall see once I move
this off my laptop and over to BNL and time it.
2017-05-25 11:43:33 +01:00
Guido Cossu
15e801af3f
Fixing a compilation error for generic SIMD
2017-05-19 16:39:36 +01:00
Guido Cossu
a8fb2835ca
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-05-18 14:45:00 +01:00
22f4feee7b
Merge branch 'develop' into feature/scalar_adjointFT
2017-05-17 13:27:13 +02:00
paboyle
3267683e22
Union workaround for g++
2017-05-17 11:26:18 +01:00
Azusa Yamaguchi
f46a67ffb3
Fixed the no-compile issue with clang on Mac.
...
Compiler version was clang++-3.9 under mpicxx
2017-05-17 10:51:01 +01:00
Guido Cossu
10f2872aae
Faster exponentiation for lattice fields
2017-05-15 15:51:16 +01:00
35fa3d1dfd
Merge branch 'master' into feature/scalar_adjointFT
2017-05-12 10:41:39 +01:00
paboyle
49a5d9bac7
Clang major, minor trailing underscore
2017-05-11 12:25:02 +01:00
paboyle
8a43e88b4f
Compiler check early in build
2017-05-11 11:43:06 +01:00
paboyle
238df20370
Still working on the compiler compat checks
2017-05-11 11:30:14 +01:00
paboyle
655492a443
Compiler detection
2017-05-11 11:21:11 +01:00
paboyle
1cab06f6bd
Compat checks for compilers
2017-05-11 10:20:24 +01:00
43c817cc67
Scalar action: const fix
2017-05-11 00:07:17 +01:00
Guido Cossu
9c12c37aaf
Confirming the fix on the complex boundary conditions
2017-05-09 08:41:29 +01:00
Guido Cossu
01d0e54594
Merge branch 'release/v0.7.0' into develop
2017-05-08 22:02:51 +01:00
Guido Cossu
5aafa335fe
Fixing JSON error for complex numbers
2017-05-08 21:56:44 +01:00
Guido Cossu
8ba0494485
Fixing JSON for complex numbers
2017-05-08 21:41:39 +01:00
paboyle
529e78d43f
Restart the v0.7.0 release
2017-05-08 18:20:04 +01:00
paboyle
93f6c15772
Warning squash
2017-05-06 16:38:58 +01:00
paboyle
c7cc7e6101
Fix
2017-05-06 16:10:09 +01:00
paboyle
3bae0a2d5c
Drop a gcc warning
2017-05-06 15:51:42 +01:00
paboyle
c1c7566089
GCC bug work around in 5.0 through 6.2 inclusive.
2017-05-06 15:20:25 +01:00
paboyle
2439999ec8
Warning elimination; drop to -O2 on G++ bad versions
2017-05-06 14:44:49 +01:00
paboyle
1d96f662e3
Fixed 4d fermion gparity force. Put strong tests on make check force tests
2017-05-06 00:46:31 +01:00
Guido Cossu
741bc836f6
Exposing support for Ncolours and Ndimensions and JSON input file for the ScalarAction
2017-05-05 17:36:43 +01:00
paboyle
697c0603ce
SITMO I/O for NERSC working now bit repro
2017-05-05 16:54:44 +01:00
paboyle
14bedebb11
Source pointed to
2017-05-05 16:17:27 +01:00
Guido Cossu
8546d01a4c
Merge branch 'develop' into feature/scalar_adjointFT
2017-05-05 15:47:33 +01:00
Guido Cossu
20999c1370
Merge branch 'develop' into feature/hmc_generalise
2017-05-05 12:47:17 +01:00
Lanny91
77e0af9c2e
Compilation fix after merge - conserved current code not yet operational for vectorised 5D or Gparity Impl.
2017-05-05 12:27:50 +01:00
paboyle
43924007db
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-05-04 19:53:41 +01:00
paboyle
78ef10e60f
Mobius force improvement
2017-05-04 19:53:21 +01:00
Lanny91
ca1077c560
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/rare_kaon
...
# Conflicts:
# lib/qcd/action/fermion/WilsonFermion5D.cc
# tests/hadrons/Test_hadrons_rarekaon.cc
2017-05-04 16:22:33 +01:00
679ae98b14
Merge branch 'feature/better-external-library' into develop
2017-05-04 15:42:12 +01:00
paboyle
90f6bc16bb
No compile clang fix
2017-05-04 12:15:06 +01:00
Peter Boyle
9b5b639546
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-05-03 20:51:40 -04:00
Peter Boyle
422cdf4979
Some checks
2017-05-03 18:37:38 -04:00
Peter Boyle
38db174f3b
Print statement
2017-05-03 18:25:26 -04:00
ea9aef7baa
New header for standard headers (was an issue with Remez.h and external compilation)
2017-05-02 18:26:11 +01:00
c9e9e8061d
Merge branch 'feature/hadrons' into develop
2017-05-02 18:23:47 +01:00
Guido Cossu
453cf2a1c6
Moving the topological charge outside the HMC related routines
2017-05-02 14:40:12 +01:00
Guido Cossu
de7bbfa5f9
Adding ParameterFile option for the HMC
2017-05-02 12:16:16 +01:00
Guido Cossu
74f451715f
Fix for Mac compilation on the size_t uint64_t types
2017-05-01 15:12:07 +01:00
Guido Cossu
4063238943
Adding HMC test file example for Mobius + smearing
2017-05-01 13:44:00 +01:00
Guido Cossu
3344788fa1
Merge branch 'develop' into feature/hmc_generalise
2017-05-01 12:13:56 +01:00
Guido Cossu
62a64d9108
EO support, wip
2017-05-01 11:06:21 +01:00
Lanny91
51d84ec057
Bugfixes in Wilson 5D sequential conserved current insertion
2017-04-28 16:49:14 +01:00
Guido Cossu
99a73f4287
Correcting the M and Mdag in the clover term
2017-04-28 15:51:05 +01:00
Guido Cossu
5553b8d2b8
Clover term compiles, not tested
2017-04-28 15:23:34 +01:00
Peter Boyle
99220f6531
Fixes and better timing
2017-04-26 17:24:11 -04:00
Lanny91
d2003f24f4
Corrected incorrect usage of ExtractSlice for conserved current code.
2017-04-26 17:25:28 +01:00
Peter Boyle
f8797e1e3e
bug fix. works now and great face performance
2017-04-26 03:14:02 -04:00
Peter Boyle
fd1eb7de13
Clean implementation of the exterior faces listing only those points on the boundary
2017-04-26 02:34:52 -04:00
Peter Boyle
2ce898efa3
Pretty code
2017-04-26 02:34:25 -04:00
Lanny91
44260643f6
First conserved current implementation for Wilson fermions only. Not implemented for Gparity or 5D-vectorised Wilson fermions.
2017-04-25 18:00:24 +01:00
paboyle
ab66bac4e6
Think I'm getting on top of the reduced cost exterior precomputed list of links
2017-04-25 08:50:26 +01:00
paboyle
56277a11c8
Build a list of what's on the surface
2017-04-24 17:06:15 +01:00
Guido Cossu
752048f410
Merge branch 'develop' into feature/clover
2017-04-24 14:41:20 +01:00
Peter Boyle
5b55867a7a
Slightly cheaper Ext assembly
2017-04-24 05:36:11 -04:00
Peter Boyle
3accb1ef89
Debugged assembly split phase with interior suppression
2017-04-23 19:30:19 -04:00
Peter Boyle
e3d0e31525
Debugged assembly split phase with interior suppression
2017-04-23 19:29:27 -04:00
Peter Boyle
5812eb8a8c
Partially fixed. But the comms-overlap does not work yet.
2017-04-22 18:50:25 -04:00
paboyle
ac58565d0a
Dangerous rewrite of the assembly. If I make a mistake the debug will be painful.
2017-04-22 19:31:04 +01:00
paboyle
3703b718aa
Mark up a table if a given site only receives from itself; including MPI3 splitting info.
2017-04-22 19:28:37 +01:00
paboyle
b722889234
Try a better load balancing loop
2017-04-22 19:27:41 +01:00
paboyle
abba44a837
Hand unrolled for overlapped comms
2017-04-22 17:45:17 +01:00
paboyle
f301be94ce
Fixed
2017-04-22 17:42:31 +01:00
Peter Boyle
1d1b225497
Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node).
2017-04-22 09:05:28 -04:00
Peter Boyle
53a785a3dd
Fixing the KNL compile
2017-04-22 08:11:51 -04:00
paboyle
736bf3c866
Major rework of stencil. Half precision and MPI3 now working.
2017-04-22 11:33:50 +01:00
paboyle
b9bbe5d188
L1p config bg/q
2017-04-22 11:33:09 +01:00
paboyle
3844bcf800
If no f16c instructions supported must use software half precision conversion.
...
This will also become useful on BG/Q, so will move out from SSE4 into a general area.
Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed
against the intrinsics implementation yet.
2017-04-20 15:30:52 +01:00
paboyle
e1a2319d01
Simple compressor moved out of cshift into stencil
2017-04-20 13:18:15 +01:00
paboyle
180c732b4c
Move compressors out of Cshift.
...
Slice iterators would help
2017-04-20 13:17:55 +01:00
paboyle
d2312e9874
Drop compressor entirely from Cshift to only Stencil.
2017-04-20 13:16:55 +01:00
paboyle
fc4ab9ccd5
Working half precision comms
2017-04-20 11:20:26 +01:00
paboyle
4a340aa5ca
Massive compressor rework to support reduced precision comms
2017-04-20 09:28:27 +01:00
paboyle
3b7de792d5
Type comparison in the traits work
2017-04-18 13:28:04 +01:00
paboyle
557c3fa109
Pretty change
2017-04-18 13:27:38 +01:00
paboyle
8e161152e4
MultiRHS solver improvements with slice operations moved into lattice and sped up.
...
Block solver requires a lot of performance work.
2017-04-18 10:51:55 +01:00
paboyle
3141ebac10
MultiRHS working, starting to optimise. Block doesn't and I thought it already was; puzzled.
2017-04-17 10:50:19 +01:00
paboyle
7ede696126
Non compile of tests fixed
2017-04-16 23:40:00 +01:00
paboyle
bf516c3b81
higher precision reduction variables in norm and inner product
2017-04-15 12:27:28 +01:00
paboyle
441a52ee5d
First cut at higher precision reduction
2017-04-15 10:57:21 +01:00
paboyle
a8db024c92
Cleaning up the dense matrix and lanczos sector
2017-04-15 08:54:11 +01:00
paboyle
3ca41458a3
Fix to no USE_FP16 case
2017-04-14 14:20:54 +01:00
Guido Cossu
b694996302
adding comments
2017-04-14 13:30:14 +01:00
Peter Boyle
951be75292
Half precision conversion working on AVX512 now too
2017-04-13 17:35:11 +01:00
Peter Boyle
b9113ed310
Patches for knl
2017-04-13 12:02:12 -04:00
a6a0da873f
Merge branch 'feature/hadrons' into feature/qed-fvol
2017-04-13 15:31:06 +01:00
paboyle
42fb49d3fd
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-04-13 14:12:47 +01:00
paboyle
db5ea001a3
Update to use Xcode 8.3 since -mfp16 causes SIGILL
2017-04-13 12:22:40 +01:00
paboyle
1d502e4ed6
FP16 optional compile time
2017-04-13 11:55:24 +01:00
paboyle
73cdf0fffe
Drop f16c from SSE because of a macos compile error on travis
2017-04-13 11:23:41 +01:00
paboyle
1c25773319
Trap illegal instructions
2017-04-13 10:51:40 +01:00
paboyle
94eb829d08
Align cast fixed for __m128i; gcc complained
2017-04-13 08:40:44 +01:00
paboyle
68392ddb5b
Exchange in generic
...
Precision change in AVX, SSE, AVX512, Generic. QPX still to do.
2017-04-13 08:38:12 +01:00
paboyle
cb6b81ae82
Half precision conversion
2017-04-12 19:32:37 +01:00
53e76b41d2
Merge branch 'develop' into feature/hadrons
2017-04-10 17:00:53 +01:00
8ef4300412
spurious .dirstamp files removed
2017-04-10 17:00:22 +01:00
98a24ebf31
The macro “magic” is very intensive for the preprocessor in the measurement code, which has numerous serialisable classes. Reducing the number of serialisable fields to 64 (instead of 1024) helps a lot; this is enough for now and can be extended trivially if needed in the future.
2017-04-10 16:58:54 +01:00
paboyle
b12dc89d26
Commenting and clean up
2017-04-10 20:38:20 +09:00
paboyle
d80d802f9d
MultiRHS solver test
2017-04-10 00:12:12 +09:00
paboyle
3d99b09dba
Start of blockCG
2017-04-09 23:42:10 +09:00
paboyle
db5f6d3ae3
Verbose fix
2017-04-09 23:41:30 +09:00
paboyle
683550f116
Const args improvement
2017-04-09 23:41:04 +09:00
paboyle
86aaa35294
Christoph needs SchurDiagTwoKappa which is mobius specific.
2017-04-07 11:07:40 +09:00
Guido Cossu
3b8a791e28
Merge branch 'develop' into feature/clover
2017-04-05 16:20:28 +01:00
Guido Cossu
7b03d8d087
Fixing the remaining merge conflicts
2017-04-05 16:17:46 +01:00
Guido Cossu
4b759b8f2a
Merge branch 'feature/hmc_generalise' into feature/scalar_adjointFT
2017-04-05 14:50:28 +01:00
Guido Cossu
8c540333d5
Merge branch 'develop' into feature/hmc_generalise
2017-04-05 14:41:04 +01:00
Guido Cossu
6fd82228bf
Working on the derivative
2017-04-05 10:51:44 +01:00
paboyle
5592f7b8c1
Creation mode better implementation
2017-04-05 02:35:34 +09:00
paboyle
35da4ece0b
UID fix
2017-04-05 02:18:15 +09:00
Guido Cossu
ca6efc685e
Merge branch 'develop' into feature/clover
2017-04-04 10:19:02 +01:00
ff4e54ef80
Merge branch 'develop' into feature/hadrons
2017-04-03 18:56:21 +01:00
paboyle
83f6fab8fa
Big/Small crush test, and fast SITMO rng init, faster but not ideal
...
MT and Ranlux init.
2017-04-02 12:10:51 +09:00
paboyle
9dc7ca4c3b
Sitmo fast init
2017-04-02 00:28:22 +09:00
paboyle
935d82f5b1
sanity checks
2017-04-02 00:27:28 +09:00
paboyle
9cbcdd65d7
No random device seed
2017-04-02 00:26:57 +09:00
paboyle
7e5faa0f34
Multiple RNGs
2017-04-02 00:25:44 +09:00
paboyle
1c4bc7ed38
Debugged staggered conventions
2017-03-31 14:41:48 +09:00
Guido Cossu
b8ae787b5e
Correcting a simple typo
2017-03-30 11:33:15 +01:00
Guido Cossu
fbe2c3b5f9
Merge branch 'develop' into feature/clover
2017-03-30 11:18:31 +01:00
Guido Cossu
1ed69816b9
First steps for the force term
2017-03-30 11:14:27 +01:00
paboyle
93ea5d9468
Pretty code
2017-03-30 15:00:03 +09:00
paboyle
9fd23faadf
Pretty layout
2017-03-30 13:44:45 +09:00
paboyle
10e4fa0dc8
Template instantiation improvements
2017-03-30 13:44:25 +09:00
paboyle
c4aca1dde4
Conjugate coefficients on adjoint
2017-03-30 13:44:05 +09:00
paboyle
b9e8ea3aaa
conjugate coefficient on the dagger
2017-03-30 13:43:13 +09:00
paboyle
077aa728b9
Fix the ZMobius (I think)
2017-03-30 13:42:09 +09:00
paboyle
a8d83d886e
Macro controls
2017-03-30 13:31:34 +09:00
paboyle
7fd46eeec4
Trailing whitespace removal
2017-03-30 13:31:10 +09:00
paboyle
2b115929dc
Small AVX512 asm ifdef patch
2017-03-29 18:51:23 +09:00
paboyle
417ec56cca
Release candidate
2017-03-29 05:45:33 -04:00
paboyle
756bc25008
Verbose header print by default
2017-03-29 04:44:17 -04:00
paboyle
35695ba57a
Bug fix in MPI3
2017-03-29 04:43:55 -04:00
paboyle
d805867e02
Better init
2017-03-28 13:25:05 -04:00
paboyle
98f9318279
Build on AVX2 and MPI passing with clang++
2017-03-28 23:16:04 +09:00
paboyle
4b17e8eba8
Merge branch 'develop' into feature/bgq-asm
...
Conflicts:
lib/qcd/action/fermion/Fermion.h
lib/qcd/action/fermion/WilsonFermion.cc
lib/util/Init.cc
tests/Test_cayley_even_odd_vec.cc
2017-03-28 04:49:30 -04:00
paboyle
75112a632a
IO improvements to fail on IO error
2017-03-28 02:28:04 -04:00
paboyle
18bde08d1b
Merge branch 'feature/staggering' into develop
2017-03-28 15:25:55 +09:00
Guido Cossu
5e549ebd8b
Adding force terms
2017-03-27 16:43:15 +09:00
Guido Cossu
fff484eca5
Populating Clover fermions methods
2017-03-27 15:12:57 +09:00
Guido Cossu
5fdc05782b
More in the clover fermion class
2017-03-27 10:54:16 +09:00
Guido Cossu
a04eb7df5d
Starting Clover term
2017-03-24 12:43:28 +09:00
Guido Cossu
4c1ea8677e
Small cosmetic changes and vscode gitignore
2017-03-23 14:09:35 +09:00
paboyle
fc93f0b2ec
Save some code for static huge tlb's. It is ifdef'ed out but an interesting root only experiment.
...
No gain from it.
2017-03-21 22:30:29 -04:00
paboyle
8c8473998d
Average over whole cluster the comm time.
2017-03-21 22:29:51 -04:00
Guido Cossu
120fb59978
Adding tests for WilsonFlow classes
2017-03-21 16:11:35 +09:00
Guido Cossu
fd56b3ff38
Merge branch 'develop' into feature/hmc_generalise
2017-03-21 13:33:41 +09:00
Guido Cossu
0ec6829edc
Fixing compilation errors for the WilsonFlow
2017-03-21 13:06:32 +09:00
Guido Cossu
18b7845b7b
Adding WilsonFlow smearing
2017-03-21 11:52:05 +09:00
Guido Cossu
3d0fe15374
Added topological charge measurement
2017-03-17 16:14:57 +09:00
Guido Cossu
91886068fe
Fixed seg fault for observable modules
2017-03-17 13:59:31 +09:00
Guido Cossu
6d1e9e5f92
Small cleanup of the observables
2017-03-17 11:42:55 +09:00
Guido Cossu
b640230b1e
Moving hmc observables in a different directory
2017-03-17 11:40:17 +09:00
paboyle
e7c36771ed
ZMobius prep for asm
2017-03-15 14:23:33 -04:00
Guido Cossu
038b6ee9cd
Fixing JSON compilation error
2017-03-16 01:09:24 +09:00
Guido Cossu
38806343a8
Improving efficiency of the force term
2017-03-15 15:16:16 +09:00
Guido Cossu
831ca4e3bf
Added Scalar action for fields in the adjoint representation
2017-03-14 14:55:18 +09:00
paboyle
8dc57a1e25
Layout change
2017-03-13 11:11:46 +00:00
paboyle
f57bd770b0
Merge branch 'bugfix/dminus' into feature/bgq-asm
2017-03-13 11:11:03 +00:00
paboyle
4ed10a3d06
Merge branch 'develop' into feature/bgq-asm
2017-03-13 11:10:10 +00:00
Chulwoo Jung
33edde245d
Changing Dminus(Dag) to use full vectors to work correctly
2017-03-12 23:02:42 -04:00
paboyle
447c5e6cd7
Z mobius hermiticity correction
2017-03-13 01:30:43 +00:00
paboyle
8b99d80d8c
Merge branch 'bgq-asm-shmemfixes' into feature/bgq-asm
2017-03-12 23:30:09 +00:00
Guido Cossu
b3dede4dd3
Merge branch 'develop' into feature/hmc_generalise
2017-03-10 23:57:37 +09:00
Guido Cossu
4e34132f4d
Correcting modules use in test files
2017-03-10 23:54:53 +09:00
Guido Cossu
c07cb10247
Merge branch 'feature/hmc_generalise' of https://github.com/paboyle/Grid into feature/hmc_generalise
2017-03-10 22:37:25 +09:00
Guido Cossu
d7767a2a62
Few more tests
2017-03-10 22:33:48 +09:00
Guido Cossu
ec035983fd
Fixing the implicit integration
2017-03-01 11:56:35 +00:00
paboyle
af230a1fb8
Average the time across the whole machine for outliers
2017-02-28 17:05:22 -05:00
Christopher Kelly
06a132e3f9
Fixes to SHMEM comms
2017-02-28 13:31:54 -08:00
Guido Cossu
596dcd85b2
Auxiliary fields
2017-02-27 13:16:38 +00:00
paboyle
96d44d5c55
Header fix
2017-02-24 19:12:11 -05:00
Guido Cossu
7270c6a150
Integrator works now
2017-02-24 17:03:42 +00:00
Lanny91
7fe797daf8
SIMD vector length sanity checks
2017-02-23 16:49:44 +00:00
Lanny91
486a01294a
Corrected QPX SIMD width
2017-02-23 16:47:56 +00:00
paboyle
586a7c90b7
Merge branch 'develop' into feature/bgq-asm
2017-02-23 00:26:59 +00:00
paboyle
e099dcdae7
Merge branch 'develop' into feature/bgq-asm
2017-02-23 00:25:29 +00:00
paboyle
4e7ab3166f
Refactoring header layout
2017-02-22 18:09:33 +00:00
paboyle
aac80cbb44
Bug fix from Chris K
2017-02-22 12:19:09 -05:00
Lanny91
c80948411b
Added tRotate function and MaddRealPart struct for generic SIMD, bugfix in MultRealPart and minor cosmetic changes.
2017-02-22 14:57:10 +00:00
Lanny91
95625a7bd1
Use Grid Integer type
2017-02-22 13:09:32 +00:00
Lanny91
0796696733
Emulated integer vector type for QPX and generic SIMD instruction sets.
2017-02-22 12:01:36 +00:00
azusayamaguchi
1c30e9a961
Verified
2017-02-21 23:01:25 +00:00
Francesco Sanfilippo
93cc270016
making public some serializable parameters in HMC Module
...
RNGModuleParameters
GridModuleParameters
2017-02-21 23:11:56 +01:00
Francesco Sanfilippo
15e668eef1
now it is possible to pass {coords list} to a peek or poke
2017-02-21 22:48:38 +01:00
azusayamaguchi
bf7e3f20d4
Staggered fermion optimised version
2017-02-21 14:35:42 +00:00
Guido Cossu
902afcfbaf
Adding metric and the implicit steps
2017-02-21 11:30:57 +00:00
paboyle
3ae92fa2e6
Global changes to parallel_for structure.
...
Move the comms flags to more sensible names
2017-02-21 05:24:27 -05:00
paboyle
3906cd2149
Stencil fix on BNL KNL system
2017-02-20 17:51:31 -05:00
paboyle
661fc4d3d1
Debug AVX512 exchange code paths
2017-02-20 17:48:36 -05:00
paboyle
41009cc142
Move exchange into the stencil only; keep Cshift fully general
2017-02-20 17:48:04 -05:00
paboyle
37720c4db7
Count bytes off node only
2017-02-20 17:47:40 -05:00
Guido Cossu
97a6b61551
Covariant laplacian and implicit integration
2017-02-20 11:17:27 +00:00
paboyle
cd0da81196
Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm
2017-02-16 18:52:30 -05:00
paboyle
f246fe3304
Improvements to avx for invertible to avoid latent bug
2017-02-16 23:52:44 +00:00
paboyle
8a29c16bde
Faster gather exchange
2017-02-16 23:52:22 +00:00
paboyle
d68907fc3e
Debug temp
2017-02-16 18:51:35 -05:00
paboyle
5c0adf7bf2
Make clang happy with parenthesis
2017-02-16 23:51:33 +00:00
paboyle
be3a8249c6
Faster gather
2017-02-16 23:51:15 +00:00
paboyle
bd600702cf
Vectorise the XYZT face gathering better.
...
Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent
with efficiency.
2017-02-15 11:11:04 +00:00
Guido Cossu
bafb101e4f
Testing different versions of the Laplacian
2017-02-13 15:38:11 +00:00
Guido Cossu
08fdf05528
Added and tested the covariant laplacian + CG solver
2017-02-13 15:05:01 +00:00
paboyle
aca7a3ef0a
Optimisation control improvements
2017-02-10 18:22:31 -05:00
Guido Cossu
c3d7ec65fa
All tests compile.
2017-02-10 10:27:51 +00:00
Guido Cossu
e0571c872b
Merge branch 'develop' into feature/hmc_generalise
2017-02-09 16:12:00 +00:00
Guido Cossu
84687ccf1f
Handling an Intel compiler warning for Json class
2017-02-09 15:33:33 +00:00
Guido Cossu
3274561cf8
Cleanup
2017-02-09 15:18:38 +00:00
paboyle
2c246551d0
Overlap comms and compute options in wilson kernels
2017-02-07 01:37:10 -05:00
paboyle
71ac2e7940
Faster RNG init
2017-02-07 01:33:23 -05:00
paboyle
a48ee6f0f2
Don't use MPI3_leader any more. No real gain and complex
2017-02-07 01:31:24 -05:00
paboyle
73547cca66
MPI3 working, I think
2017-02-07 01:30:02 -05:00
paboyle
123c673db7
Policy to control async or sync SendRecv
2017-02-07 01:24:54 -05:00
paboyle
61f82216e2
Communicator Policy, NodeCount distinct from Rank count
2017-02-07 01:22:53 -05:00
paboyle
8e7ca92278
Debugged cshift case
2017-02-07 01:21:32 -05:00
paboyle
485ad6fde0
Stencil working in SHM MPI3
2017-02-07 01:20:39 -05:00
paboyle
6ea2184e18
OMP define change
2017-02-07 01:17:16 -05:00
paboyle
fdc170b8a3
Parallel fors in lattice transfer
2017-02-07 01:16:39 -05:00
paboyle
85c7bc4321
Bug fixes for cases that physics code couldn't hit but latent
...
and discovered on KNL (long vector, y SIMD dir) and checker dir set to y.
Remove the assertions on these code paths now they are tested.
2017-02-07 01:01:15 -05:00
paboyle
0883d6a7ce
Overlap comms compute support; make reg naming consistent with bgq asm
2017-02-07 00:59:32 -05:00
paboyle
b5e9c900a4
Better printing and signal handling options
2017-02-07 00:57:55 -05:00
paboyle
4bbdfb434c
Overlap comms compute modifications
2017-02-07 00:57:01 -05:00
Lanny91
b7cd1a19e3
Utilities for reading and writing "pair" objects.
2017-02-06 14:08:59 +00:00
Christopher Kelly
c94133af49
Added iteration reporting to CG and mixed CG
...
Added ability to manually change the initial CG inner tolerance in mixed CG
Added .hpp files to filelist script
2017-02-02 17:04:42 -05:00
eedcaf6470
Merge branch 'feature/hadrons' into feature/qed-fvol
2017-02-01 15:53:10 -08:00
e7d8030a64
operator>> for serialisable enums
2017-02-01 15:51:08 -08:00
d775fbb2f9
Gammas: code cleaning and gamma_L implementation & test
2017-02-01 15:45:05 -08:00
863855f46f
header fix
2017-02-01 11:59:44 -08:00
419af7610d
New gamma matrices tidying: generated code is confined to Gamma.* for readability
2017-02-01 11:23:12 -08:00
1140573027
Gamma adj fix: now in Grid namespace to avoid collisions
2017-01-30 10:53:04 -08:00
a0cfbb6e88
Merge branch 'feature/gammas' into feature/hadrons
...
# Conflicts:
# .gitignore
# lib/qcd/spin/Dirac.cc
# scripts/filelist
2017-01-30 09:10:49 -08:00
515a26b3c6
gammas: copyright update
2017-01-30 09:07:09 -08:00
Guido Cossu
16be6d378c
Now action factory supports different Fields (templated)
2017-01-30 14:22:41 +00:00
Guido Cossu
f05d0565aa
Adding ScalarField theory
2017-01-30 10:59:28 +00:00
28d99b5297
Merge branch 'develop' into feature/qed-fvol
2017-01-27 16:59:53 -08:00
Guido Cossu
899e685627
Merge branch 'feature/sitmo_rng' into develop
2017-01-27 14:15:56 +00:00
Guido Cossu
6929a84c70
Reformatting files
2017-01-27 11:54:44 +00:00
Guido Cossu
5c779a789b
Moving registrations in an independent file
2017-01-27 11:23:51 +00:00
fad743fbb1
Build system sanity check: corrected several headers not in the <Grid/*> format
2017-01-26 17:00:41 -08:00
Guido Cossu
e863a948e3
Cleaning up files and directories
2017-01-26 15:24:49 +00:00
Guido Cossu
7996f06335
Commented out registrations.
...
Move to an independent file that is linked only for the factory managed HMC
2017-01-25 18:27:45 +00:00
Guido Cossu
ef8d3831eb
Temporary patch for the threading error in InsertSlice and ExtractSlice
...
Find source and fix the error
2017-01-25 18:12:04 +00:00
Guido Cossu
70ed9fc40c
Updating the engine to the last version
2017-01-25 18:10:41 +00:00
Guido Cossu
7b40a3e3e5
Reorganizing files
2017-01-25 18:09:46 +00:00
Guido Cossu
677757cfeb
Added and tested SITMO PRNG
2017-01-25 12:47:22 +00:00
Guido Cossu
f7fbbaaca3
Compiles after merging
2017-01-25 12:11:58 +00:00
Guido Cossu
17629b8d9e
Merge branch 'develop' into feature/hmc_generalise
2017-01-25 11:33:53 +00:00
Guido Cossu
0baa20d292
Again fixing compilation on Travis, no LIME lib present
2017-01-25 11:18:44 +00:00
Guido Cossu
4571c918a4
Fixing compilation error when compiling without LIME
2017-01-25 11:14:43 +00:00
Guido Cossu
5251ea4d30
Adding more fermion action modules, generalised DWF
2017-01-25 11:10:44 +00:00
05cb6d318a
gammas: adjoint implemented as a symbolic operation
2017-01-24 18:07:43 -08:00
0432e30256
Gamma right multiply code fix (now passes consistency check)
2017-01-24 17:36:23 -08:00
f7db342f49
Serialisable enums can be converted to int
2017-01-24 17:33:26 -08:00
Guido Cossu
7f456b4173
👷 Added all pseudofermion actions to the serialiser
2017-01-24 13:57:32 +00:00
a37e71f362
New automatic implementation of gamma matrices, Meson and SeqGamma are broken
2017-01-23 19:13:43 -08:00
Guido Cossu
244f8fb6dc
Added JSON parser (without NextElement)
2017-01-23 14:57:38 +00:00
37988221a8
Merge branch 'feature/serialisation-hdf5' into feature/qed-fvol
2017-01-20 14:04:20 -08:00
4c75095c61
HDF5: header fix
2017-01-20 12:14:01 -08:00
afa095d33d
HDF5: better complex number support
2017-01-20 12:10:41 -08:00
6b5259cc10
HDF5 detects if a name is a dataset or not without using exception catching
2017-01-20 11:03:19 -08:00
Guido Cossu
27dfe816fa
Added TwoFlavorsEO
...
Had to remove a conformability check in the Derivative of SchurDiff,
see the comments in the file
2017-01-20 16:59:31 +00:00
Guido Cossu
f96fac0aee
All functionalities ready.
...
Todo: add all the fermion action modules
2017-01-20 12:56:20 +00:00
7423a352c5
HDF5: typos
2017-01-19 18:33:04 -08:00
81e66d6631
HDF5: revert back to native types
2017-01-19 18:24:53 -08:00
ade1058e5f
Hdf5Type does not need to be a pointer anymore
2017-01-19 18:23:55 -08:00
6eea9e4da7
HDF5 types static initialisation is mysteriously buggy on BG/Q, changing strategy
2017-01-19 18:02:53 -08:00
2c673666da
Standardisation of HDF5 types
2017-01-19 17:19:12 -08:00
7a327a3f28
Merge branch 'develop' into feature/qed-fvol
2017-01-19 14:22:36 -08:00
Guido Cossu
851f2ad8ef
Adding fermions actions support in the factories
2017-01-19 10:00:02 +00:00
5405526424
Code typo
2017-01-18 22:42:19 -08:00
654e0b0fd0
Serialisable objects are now comparable with ==
2017-01-18 17:40:32 -08:00
4be08ebccc
debug code cleaning
2017-01-18 17:39:59 -08:00
f599cb5b17
HDF5 serial IO implemented and tested
2017-01-18 16:50:21 -08:00
Guido Cossu
23e0561dd6
Added all required functionalities, time for cleaning
...
All actions to be added
2017-01-18 16:31:51 +00:00
5803933aea
First implementation of HDF5 serial IO writer, reader is still empty
2017-01-17 16:21:18 -08:00
Guido Cossu
924130833e
Moved more parameters to serialization
2017-01-17 13:22:18 +00:00
Guido Cossu
0157274762
HMC factories
2017-01-17 10:46:49 +00:00
Guido Cossu
87e8aad5a0
Added support for input file HMC modules (actions still missing)
2017-01-16 16:07:12 +00:00
Guido Cossu
c6f59c2933
Adding factories
2017-01-16 10:18:09 +00:00
91a3534054
Lattice slice utilities now thread safe
2017-01-16 06:32:25 +00:00
Guido Cossu
0dfda4bb90
Working on the RNGModule
2017-01-09 11:06:18 +00:00
Guido Cossu
1189ebc8b5
Cleaning up the checkpointers interface
2017-01-05 15:52:52 +00:00
82b3f54697
scalar free propagator fix
2017-01-05 14:58:07 +00:00
Guido Cossu
1bb8578173
Added module for checkpointers
2017-01-05 13:09:32 +00:00
Peter Boyle
c3b6d573b9
Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm
2016-12-30 22:42:17 +00:00
afbf7d4c37
QED Gimpl moved in Photon.h
2016-12-29 22:43:38 +01:00
8c3cc32364
Scalar action
2016-12-29 22:42:58 +01:00
Peter Boyle
1e179c903d
Worried about integer; suspect where statements are broken
2016-12-27 17:46:38 +00:00
Peter Boyle
669cfca9b7
No inline
2016-12-27 17:45:40 +00:00
Peter Boyle
ff2f559a57
Remove inline on gather optimised path
2016-12-27 17:45:19 +00:00
Peter Boyle
03c81bd902
Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm
2016-12-27 11:25:35 +00:00
Peter Boyle
a869addef1
Stats switch off
2016-12-27 11:25:22 +00:00
Peter Boyle
1caa3fbc2d
LOCK UNLOCK only
2016-12-27 11:24:45 +00:00
Peter Boyle
3d21297bbb
Call the fast path compressor for wilson kernels to avoid if else on projector
2016-12-27 11:23:13 +00:00
Peter Boyle
25efefc5b4
Back to original thread policy post test
2016-12-23 09:49:04 +00:00
Peter Boyle
eabf316ed9
BGQ performance ASM
2016-12-22 21:56:08 +00:00
Peter Boyle
04ae7929a3
BGQ or KNL assembler now
2016-12-22 17:53:22 +00:00
Peter Boyle
caba0d42a5
L1p controls
2016-12-22 17:52:55 +00:00
Peter Boyle
9ae81c06d2
L1p controls for BG/Q
2016-12-22 17:52:21 +00:00
Peter Boyle
7dc36628a1
QPX finishing
2016-12-22 17:50:48 +00:00
Peter Boyle
b8cdb3e90a
Debug hack; raises from 62GF/s to 72 GF/s per node on BG/Q
2016-12-22 17:50:14 +00:00
Peter Boyle
5241245534
Default to static scheduling
2016-12-22 17:49:21 +00:00
Dr Peter Boyle
960316e207
type conversion in printf
2016-12-22 17:27:01 +00:00
Guido Cossu
5214846341
Adding a resource manager
2016-12-22 12:41:56 +00:00
17b3a10d46
stochastic QED: function to cache 1/sqrt(khat^2)
2016-12-22 00:29:19 +01:00
Guido Cossu
ce1a115e0b
Removing redundant arguments for integrator functions, step 1
2016-12-20 17:51:30 +00:00
9ac3ac41df
serialisable Photon parameters
2016-12-20 12:41:01 +01:00
6f1ea96293
Merge branch 'develop' into feature/qed-fvol
2016-12-20 12:33:02 +01:00
f8d11ff673
better serialisable enums (can be encapsulated into classes)
2016-12-20 12:31:49 +01:00
paboyle
3f2d53a994
BGQ assembler beginning
2016-12-20 10:21:26 +00:00
paboyle
a59f5374d7
Evade warning
2016-12-18 02:23:55 +00:00
paboyle
4b220972ac
Warning fix
2016-12-18 02:14:17 +00:00
paboyle
629f43e36c
Return statement needed
2016-12-18 02:09:37 +00:00
paboyle
a3172b3455
Precision error
2016-12-18 02:07:45 +00:00
paboyle
3e6945cd65
Fixing AVX Z-mobius
2016-12-18 02:05:11 +00:00
paboyle
87be03006a
AVX 512 code broke other compiles; fixing
2016-12-18 01:45:09 +00:00
paboyle
f17436fec2
Bad commit fixed
2016-12-18 01:27:34 +00:00
Peter Boyle
4d8b01b7ed
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-12-18 00:56:57 +00:00
Peter Boyle
fa6acccf55
Zmobius asm
2016-12-18 00:56:19 +00:00
azusayamaguchi
df9108154d
Debugged 2 versions of assembler; ls vectorised, xyzt vectorised
2016-12-17 23:47:51 +00:00
azusayamaguchi
b3e7f600da
Partial implementation of 4d vectorisation assembler
2016-12-16 23:50:30 +00:00
azusayamaguchi
d4071daf2a
Template specialise
2016-12-16 22:28:29 +00:00
azusayamaguchi
a2a6329094
AVX512 only for ASM compilation
2016-12-16 22:03:29 +00:00
azusayamaguchi
eabc577940
Assembler possibly working
2016-12-16 16:55:36 +00:00
2e3c5890b6
qed-fvol: build fix
2016-12-15 20:06:46 +00:00
bc6678732f
Merge branch 'feature/hadrons' into feature/qed-fvol
...
# Conflicts:
# Makefile.am
# configure.ac
# lib/qcd/action/gauge/Photon.h
2016-12-15 19:53:00 +00:00
91e98b1dd5
Merge branch 'feature/hadrons' into develop
2016-12-15 18:15:56 +00:00
b791c274b0
Revert "AVX: uninitialised variable fix"
...
This reverts commit c22c3db9ad.
2016-12-15 18:15:35 +00:00
Guido Cossu
0bd296dda4
Adding check of the Dag part in the benchmark
2016-12-14 03:15:09 +00:00
c22c3db9ad
AVX: uninitialised variable fix
2016-12-13 19:05:58 +00:00
Guido Cossu
2fb92dbc6e
Cleaning up previous debug lines
2016-12-13 07:53:43 +00:00
Guido Cossu
5c74b6028b
Commit for debugging, lots of IO
2016-12-13 06:35:30 +00:00
Guido Cossu
ef72f322d2
consistency of tests
2016-12-13 02:24:20 +00:00
Azusa Yamaguchi
426197e446
Nc=3
2016-12-12 09:10:54 +00:00
Azusa Yamaguchi
99e2c1e666
Kernels options
2016-12-12 09:08:53 +00:00
Azusa Yamaguchi
1440565a10
Decrease verbosity
2016-12-12 09:08:04 +00:00
Azusa Yamaguchi
e9f0c0ea39
Staggered kernels options
2016-12-12 09:07:38 +00:00
Peter Boyle
fe187e9ed3
Compiles and passes under ZMobius with assembler
2016-12-10 00:47:48 +00:00
Peter Boyle
0091b50f49
Zmobius working -- not asm yet
2016-12-09 22:51:32 +00:00
Peter Boyle
fb8d4b2357
Lots of debug on performance Mobius
2016-12-08 17:28:28 +00:00
Peter Boyle
83fa038bdf
Streaming stores
2016-12-08 16:58:42 +00:00
Peter Boyle
7a61feb6d3
Allocator added with caching for Linux VM subsystem optimisation
2016-12-08 16:58:01 +00:00
Peter Boyle
69ae817d1c
Updates for supporting Mobius better
2016-12-08 16:43:28 +00:00
Guido Cossu
2bd4233919
Completed testing of the HMC for Ls vectorised version (on AVX2)
2016-12-07 04:56:37 +00:00
Guido Cossu
143c70e29f
Debugged the threaded version. Cleaning up
2016-12-07 04:40:25 +00:00
51322da6f8
Hadrons: genetic scheduler improvement
2016-12-07 09:00:45 +09:00
c56707e003
useless debug message removed
2016-12-07 08:59:20 +09:00
Guido Cossu
b812d5e39c
Added single threaded version of the derivative for the Ls vectorised DWF
2016-12-06 16:31:13 +00:00
Guido Cossu
01480da0a8
Merge branch 'develop' into feature/hmc_generalise
2016-12-05 05:10:27 +00:00
Peter Boyle
e27c6b217c
Updating
2016-12-01 12:42:53 +00:00
9ad3d3453e
Hadrons is now a library, the previous XML driven program is now a test
2016-12-01 21:36:29 +09:00
paboyle
6adf35da54
Faster Mobius
2016-12-01 11:39:04 +00:00
paboyle
bd0430b34f
Serialisation in malloc fixed
2016-11-29 22:27:55 +00:00
Azusa Yamaguchi
c097fd041a
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering
2016-11-29 13:44:17 +00:00
Azusa Yamaguchi
77fb25fb29
Push 5d tests
2016-11-29 13:43:56 +00:00
Azusa Yamaguchi
389e0a77bd
Staggered Fermion 5D
2016-11-29 13:13:56 +00:00
paboyle
4704f2d009
Actions updated
2016-11-29 00:14:36 +00:00
Guido Cossu
ae9688e343
Reporting also the total mflops
2016-11-28 11:37:02 +00:00
43928846f2
first steps to make Hadrons a library
2016-11-28 16:02:15 +09:00
fabcd4179d
Hadrons: propagator type coming from the fermion implementation
2016-11-28 14:02:10 +09:00
a8843c9af6
Code cleaning, the fermion implementation can be specified using the macro FIMPL
2016-11-27 16:47:22 +09:00
7a1a7a685e
Merge branch 'feature/fft-opt' into feature/hadrons
2016-11-27 15:32:03 +09:00
Lanny91
b18950f776
Added simd real divide test with QPX divide fixes
2016-11-25 13:21:33 +00:00
Lanny91
0acbf77bc6
Add QPX Div structure
2016-11-24 13:24:12 +00:00
5833f247fa
more FFT optimisations
2016-11-24 09:09:48 +09:00
Azusa Yamaguchi
95f43d27ae
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering
2016-11-22 13:49:22 +00:00
Azusa Yamaguchi
668ca57702
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering
2016-11-22 13:49:11 +00:00
a2cffb0304
AVXFMA target fixed
2016-11-21 17:47:18 +01:00
97cddda49e
Merge branch 'feature/gen-simd' into feature/doxygen
...
# Conflicts:
# Makefile.am
# configure.ac
2016-11-19 13:11:13 +01:00
b873504b90
fully generic SIMD
2016-11-19 01:32:39 +01:00
Guido Cossu
62749d05a6
Naming the scalar action
2016-11-17 12:26:20 +00:00
Guido Cossu
3834feb4b7
Adding action names
2016-11-16 16:46:49 +00:00
James Harrison
6b8ee7bae0
Merge branch 'feature/feynman-rules' into feature/qed-fvol
2016-11-15 13:08:08 +00:00
James Harrison
739c2308b5
Set imaginary part of stochastic QED field to zero using real() instead of conjugate().
2016-11-15 13:07:52 +00:00
042ae5b87c
generic 256bits SIMD
2016-11-15 12:16:15 +00:00
James Harrison
d49e502f53
Merge branch 'feature/feynman-rules' into feature/qed-fvol
2016-11-14 18:00:33 +00:00
James Harrison
92ec3404f8
Set imaginary part of stochastic QED field to zero after FFT into position space
2016-11-14 17:59:02 +00:00
Guido Cossu
a783282b8b
Merge branch 'develop' into feature/hmc_generalise
2016-11-10 18:13:07 +00:00
paboyle
604f0ea2f6
Merge branch 'develop' into release/v0.6.0
2016-11-09 04:13:01 -08:00
paboyle
33dc1f51b5
Final sign off commits from Cori-1
2016-11-09 04:11:03 -08:00
James Harrison
c30d96ea50
QedFVol: x86intrin.h namespace fix
2016-11-09 11:06:20 +00:00
13a8997789
Merge branch 'release/v0.6.0' into feature/hadrons
...
# Conflicts:
# Makefile.am
2016-11-08 20:43:39 +00:00
9576f0903d
namespace fix
2016-11-08 19:07:47 +00:00
8a5e3a917c
Merge branch 'develop' into release/v0.6.0
...
# Conflicts:
# tests/core/Test_fft_gfix.cc
2016-11-08 16:53:42 +00:00
3d2a22a14d
include fix for MKL
2016-11-08 15:31:47 +00:00
azusayamaguchi
f85b35314d
Fix a routine for single node processor coor from rank
2016-11-08 11:49:13 +00:00
azusayamaguchi
0cff8754d1
Usecs
2016-11-08 11:35:41 +00:00
azusayamaguchi
692b44dac1
Merge branch 'develop' into release/v0.6.0
2016-11-04 22:48:11 +00:00
azusayamaguchi
96ba42a297
omm buf
2016-11-04 22:47:25 +00:00
azusayamaguchi
f7b60004f3
Merge branch 'develop' into release/v0.6.0
2016-11-04 16:08:07 +00:00
ad971ca07b
fftw3.h is now expected to be an external header
2016-11-04 13:12:35 +00:00
f2f16eb972
fftw3.h removed, please don't commit this file back
2016-11-04 13:11:05 +00:00
azusayamaguchi
b7d55f7dfb
Fix a typo in reorg of the --dslash-asm
2016-11-04 11:35:08 +00:00
azusayamaguchi
6e548a8ad5
Linux compile needed
2016-11-04 11:34:16 +00:00
Azusa Yamaguchi
ee686a7d85
Compiles now
2016-11-03 16:58:23 +00:00
Azusa Yamaguchi
1c5b7a6be5
Staggered phases first cut, c1, c2, u0
2016-11-03 16:26:56 +00:00