mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 09:45:36 +00:00
Commit Graph

1682 Commits

Author SHA1 Message Date
Guido Cossu
147e2025b9 Added unit tests on the representation transformations
Status: Passing all tests
2016-08-08 16:54:22 +01:00
b1cfb4d661 first try at a nicer Doxygen implementation 2016-08-05 15:29:18 +01:00
paboyle
32bc7a6ab8 MPI back out of change that hangs
AVX2 for clang, gcc needs the -mfma flag.
2016-08-05 10:36:00 +01:00
7ff7c7d90d Merge branch 'develop' into feature/hadrons 2016-08-04 16:22:10 +01:00
93d29bb699 build system improvements after discussion with Peter 2016-08-04 16:19:59 +01:00
2485ef9c9c Merge branch 'feature/new-build' into feature/hadrons
# Conflicts:
#	Makefile.am
#	scripts/copyright
2016-08-03 16:49:16 +01:00
9e5b934d21 improved LAPACK configuration 2016-08-02 17:26:54 +01:00
Guido Cossu
49b5c49851 Checked the hermiticity of the op in derivative, ok
Still CG fails to converge
2016-07-31 12:37:33 +01:00
e9f30cab2c first working version for the new build system 2016-07-30 17:53:18 +01:00
Guido Cossu
089f0ab582 Debugged HMC for Creutz relation 2016-07-28 16:44:41 +01:00
Guido Cossu
b93e18ed50 Modified the Dirac Kernel class to compile with different number of colours
Added the general push_back functionality to accommodate all defined representations

Compiles, not tested
2016-07-18 16:36:28 +01:00
Guido Cossu
9c77bb69a5 Added all elements for Hirep HMC
TODO: Test and debug
2016-07-18 12:05:23 +01:00
paboyle
f9e90eeb1f Sign error on the force for 4d fields fixed 2016-07-16 01:52:44 +01:00
paboyle
fad5c675eb sign error on the 4d gparity force 2016-07-16 01:51:56 +01:00
paboyle
4908b77d46 Fixed conflicts. PLEASE avoid making wholesale cosmetic only changes, this created
a HUGE amount of difficult to resolve and understand conflicts.

Wholesale formatting, reordering functions etc... in a central file like Tensor_class
or Grid_vector_types while others are also editing without making substantial functionality
changes creates pain.
2016-07-15 20:59:07 +01:00
paboyle
f4dd5062d7 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2016-07-15 19:26:06 +01:00
paboyle
980ff18956 Solving the instantiation no compile issue 2016-07-15 17:19:44 +01:00
Guido Cossu
7edf4c6c04 Added HMC utilities for the higher representations
TODO: Inherit types for the pseudofermions, Debugging, testing
2016-07-15 13:39:47 +01:00
paboyle
1a6c7204ac Disable instantiation; Use cache version instead 2016-07-15 00:34:39 +01:00
paboyle
49310fbab3 Done with red black change over 2016-07-15 00:08:43 +01:00
paboyle
5c0c8efb9e Updated file list 2016-07-15 00:02:11 +01:00
paboyle
dfd714e1ef Multiple implementations for the 5d hopping terms, depending on cache friendly
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
paboyle
79a8ca1a62 Rewrite for performance. Impl dependent instantiations give
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and apply to the above

-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
   and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
paboyle
fb45eb2eb2 5d ls vec rename of impl class 2016-07-14 23:57:26 +01:00
paboyle
a307274c96 Fermion impl rename for ls vectorised 5d approaches 2016-07-14 23:56:13 +01:00
paboyle
3f2c44a5fe Updating the class to 5d selection based on impl type 2016-07-14 23:55:26 +01:00
paboyle
48fb1cdc11 Update domain 5d vectorised impl type, move the type over to 4d redblack with
the dense OO inverse
2016-07-14 23:54:35 +01:00
paboyle
8a79e93cc2 Rename the 5d domain wall fermion vectorised Ls impl class 2016-07-14 23:53:00 +01:00
paboyle
dd62a61c5c Added broadcast and rotation of simd vectors 2016-07-14 23:49:00 +01:00
paboyle
8f47d0b5ab Rotation needed for hopping term in fifth dim with Ls vectorised fields 2016-07-14 23:45:36 +01:00
paboyle
42af132dab Fix for chris kellys request to peek poke on checkerboarded fields 2016-07-14 23:44:48 +01:00
paboyle
adbc7c1188 Adding files for multiple implementations (cache opt) and Ls vectorisation
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.

The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.

This should give similar throughput but high flops (non-compulsory flops)
but enable use of the KNL cache friendly kernels throughout the code.

Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
Guido Cossu
9dc345e8e8 Debugged smearing and adding HMC functions for hirep 2016-07-13 17:51:18 +01:00
Christopher Kelly
6f47fbb1e2 Disabled parallel for loops in ExtractSlice and InsertSlice due to race conditions. Likely will need to do so for localConvert too. 2016-07-13 10:49:18 -04:00
Guido Cossu
a9ae30f868 Added representations definitions for the HMC 2016-07-12 13:36:10 +01:00
Christopher Kelly
a3c0fb79b6 Fix to iVector and iMatrix pokeIndex and checkerboard local site indexing. 2016-07-11 17:15:22 -04:00
paboyle
62601bb649 Bug fix 2016-07-08 20:46:29 +01:00
paboyle
ef97e32152 Adding persistent communicators 2016-07-08 17:16:08 +01:00
Guido Cossu
daea5297ee Wrote the projector in the adjoint representation algebra 2016-07-08 16:14:16 +01:00
Guido Cossu
5028969d4b Added generators for the adjoint representation 2016-07-08 15:40:11 +01:00
paboyle
a0676beeb1 Open up dependency on Eigen and FFTW 2016-07-07 22:31:07 +01:00
Christopher Kelly
c5106d0c03 Bugfix 2016-07-07 16:06:30 -04:00
Guido Cossu
fbf96b1bbb Merge branch 'develop' into feature/hirep 2016-07-07 14:20:10 +01:00
Guido Cossu
3c49ddfaa4 Merge branch 'temporary-smearing' into develop 2016-07-07 14:04:59 +01:00
Guido Cossu
ffb8b3116c Tested smeared RHMC Wilson1p1, accepting 2016-07-07 11:49:36 +01:00
Christopher Kelly
dd8cfff111 Another fix for pedantic compilers 2016-07-06 18:22:15 -04:00
Christopher Kelly
184642adb0 Fix for pedantic compilers 2016-07-06 18:15:15 -04:00
Christopher Kelly
4774a3bcd2 Generalized HotConfiguration and functions it calls to accept gauge fields with precision other than the default. 2016-07-06 18:01:08 -04:00
Christopher Kelly
25fafa9a89 Comment 2016-07-06 16:19:41 -04:00
Christopher Kelly
85ed8175cb Implemented mixed precision CG. Fixed filelist to exclude lib/Old directory and include Config.h. 2016-07-06 15:57:04 -04:00
Christopher Kelly
df5c788ef2 Merge branch 'develop' into feature/multi_prec 2016-07-06 14:52:28 -04:00
Christopher Kelly
15f22425c8 Added option to prevent CG from exiting when it fails to converge 2016-07-06 14:50:01 -04:00
Guido Cossu
e87182cf98 Debugged the copy constructor of the Lattice class 2016-07-06 15:31:00 +01:00
Guido Cossu
e3d5319470 Debugged the real() and imag() functions and added tests to Test_Simd 2016-07-06 14:16:03 +01:00
Guido Cossu
ffedeb1c58 Minor modifications 2016-07-06 11:41:27 +01:00
Guido Cossu
3e3b367aa9 Small changes in the Log files 2016-07-05 15:05:28 +01:00
Guido Cossu
3e80947c2b Cleaned up HMC output. Tested smeared HMCs for single precision (OK) 2016-07-05 12:03:54 +01:00
Guido Cossu
fdfbf11c6d Merge branch 'develop' into temporary-smearing 2016-07-04 18:45:10 +01:00
Guido Cossu
9cb90f714e Merge remote-tracking branch 'origin/develop' into temporary-smearing 2016-07-04 17:28:40 +01:00
Guido Cossu
2daffdf95d Tested smeared WilsonRatio action, accepts 2016-07-04 16:17:28 +01:00
Guido Cossu
149f826601 Tested smearing for Nf2 WilsonFermionAction, non EO: accepts 2016-07-04 16:09:19 +01:00
Guido Cossu
cd8ee27080 Simple change in iGamma for smearing 2016-07-04 16:02:57 +01:00
Guido Cossu
0fa66e8f3c Debugged smearing for EOWilson, accepts 2016-07-04 15:35:37 +01:00
Guido Cossu
8dd099267d Corrected a bug in the Expression Templates (acos and asin were wrong) 2016-07-03 12:28:25 +01:00
Guido Cossu
1a6d65c6a4 Converted set_uw and set_fj to all complex functions 2016-07-03 10:27:43 +01:00
paboyle
fc4a043663 Colors and banner clean up 2016-07-02 16:15:38 +01:00
Guido Cossu
092fa0d8da Debugged set_fj,
to be fixed: BUG in imag()
2016-07-01 16:06:20 +01:00
e0b7004f96 Merge branch 'master' into feature/hadrons 2016-07-01 15:54:34 +01:00
paboyle
680645f849 Merge branch 'release/v0.5.0' 2016-06-30 15:15:03 -07:00
paboyle
712b9a3489 Asm only for avx512 2016-06-30 14:35:02 -07:00
paboyle
bdaa5b1767 Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 14:35:02 -07:00
paboyle
8fcefc021a Improved the prefetching when using cache blocking codes 2016-06-30 14:35:02 -07:00
paboyle
1445189361 Control the prefetch strategy 2016-06-30 14:35:02 -07:00
paboyle
05c884a62a Prefetch change 2016-06-30 14:35:01 -07:00
paboyle
a25bec87d9 Prefetch during save 2016-06-30 14:35:01 -07:00
paboyle
2d8bb4c594 Tweaks 2016-06-30 14:35:01 -07:00
paboyle
51cb2d4328 update file lists 2016-06-30 14:35:01 -07:00
paboyle
6d58cb2a68 Enable reordering of the loops in the assembler for cache friendly.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
Guido Cossu
565e9329ba Changed the colouring classes 2016-06-30 16:51:03 +01:00
Guido Cossu
5e02392f9c Fixed compilation error for benchmark_dwf
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
75fc295f6e Merge branch 'hadrons' into feature/hadrons 2016-06-14 17:51:15 +01:00
Richard Rollins
86187d7cca Removed write to stdout in constructor for MPI CartesianCommunicator 2016-06-14 15:34:20 +01:00
paboyle
87418e7df1 Slightly faster prefetching perf. 2016-06-13 02:32:52 -07:00
paboyle
55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
Azusa Yamaguchi
d9408893b3 Prefetching in the normal kernel implementation. 2016-06-08 05:43:48 -07:00
paboyle
8ac021de73 Added a test and fixed it for red black precon Ls innermost vectorised DWF 2016-06-07 13:16:56 -07:00
paboyle
e503ef5590 Cleaned up 2016-06-07 00:11:36 +01:00
paboyle
a7682b0060 Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS 2016-06-06 23:48:21 +01:00
paboyle
d4c9d71fc8 Merge branch 'master' of https://github.com/paboyle/Grid 2016-06-06 07:06:54 -07:00
paboyle
786ca52c43 Problems remain in the red black preconditioning of the Ls vectorisation 2016-06-06 07:05:51 -07:00
Peter Boyle
f78d89bcbe Update Lebesgue.cc
kill verbose
2016-06-03 13:33:42 +01:00
paboyle
53d06046b0 Compiling updates for KNL 2016-06-03 03:47:54 -07:00
paboyle
139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
1826ed06a3 Merge branch 'master' into hadrons 2016-05-27 16:50:31 +01:00
1c0e922585 Merge pull request #35 from aportelli/master
empty SIMD fix
2016-05-27 16:49:13 +01:00
9d5f693cbe empty SIMD fix 2016-05-24 10:56:27 +01:00
Peter Boyle
5c90c3b457 Merge pull request #34 from aportelli/master
Polymorphic lattices & various small updates
2016-05-24 10:50:04 +01:00
3ff96c502b Merge branch 'master' into hadrons 2016-05-12 19:24:18 +01:00
91e04056f9 fix of the empty SIMD 2016-05-12 19:24:10 +01:00
15a0908bfc Merge branch 'master' into hadrons 2016-05-12 18:35:46 +01:00
3789e3f31c additional fixes in slice functions 2016-05-12 18:35:38 +01:00
07f0b69784 Merge branch 'master' into hadrons 2016-05-12 13:02:18 +01:00
0c66719210 const fix in slice functions 2016-05-12 13:01:35 +01:00
362f255100 Hadrons: module parameters can now be accessed from outside 2016-05-12 11:59:28 +01:00
paboyle
3a5b5c8bec Save an old tar of tree 2016-05-12 03:20:17 -07:00
3d78ed03ef Merge branch 'master' into hadrons 2016-05-11 15:21:46 +01:00
4bc21ec7cb thread CL argument fix 2016-05-11 15:21:29 +01:00
e3083b6dfc Merge commit 'ab894186589224d570e0ecef8eea06443194a8ab' 2016-05-11 15:20:41 +01:00
paboyle
ab89418658 Precision change going in; useful for mixed precision algorithms for example. 2016-05-11 15:18:47 +01:00
paboyle
28cd99882c Subslicing 2016-05-11 15:06:54 +01:00
paboyle
aceaee774c ExtractSlice / InsertSlice for lower dimensional lattices where the lattice is not
distributed in the orthogonal direction.
Useful for fermion 4d/5d etc..
2016-05-11 14:12:02 +01:00
312637e5fb Merge branch 'master' into hadrons
# Conflicts:
#	lib/Log.h
2016-05-04 12:16:18 -07:00
101aa769eb LatticeBase contain the grid pointer and a virtual destructor to allow polymorphic lattice pointers 2016-05-04 12:15:31 -07:00
0bf99bfde5 log polish 2016-05-04 12:14:49 -07:00
64bf6fe54e macro to dump NERSC header to a stream 2016-05-04 12:14:38 -07:00
1161d566b9 minor code cleaning 2016-05-02 19:32:11 -07:00
d08d93c44c Merge branch 'master' into hadrons 2016-05-01 18:30:44 -07:00
c698b16d75 function to generate Chroma-style gamma matrix products 2016-05-01 18:30:35 -07:00
c4c89336fe SliceSum: shutting down warning about non-threaded code for now 2016-05-01 18:29:57 -07:00
fa59789580 ConjugateGradient: cleaner output 2016-05-01 18:29:20 -07:00
0ab10cdedb Merge branch 'master' into hadrons 2016-05-01 16:08:05 -07:00
92c2c7d3b5 SchurRedBlackDiagMooeeSolve: fix: guess was not initialised from input 2016-05-01 16:07:55 -07:00
e99ce0875f directly exit when using '--help' option 2016-05-01 16:05:16 -07:00
beb11fd4ef Merge branch 'master' into hadrons 2016-05-01 10:32:24 -07:00
paboyle
c23375cd65 Testing travis CI integration 2016-04-30 06:30:56 -07:00
paboyle
f7ca6ca889 Bernoulli reenabled -- using integral type for the discrete_distribution, but
then casts in the fill
2016-04-30 03:48:28 -07:00
paboyle
ec4a9b7f6c The Bernoulli gives a no compile due to a static assertion that the type be integral
in 4.7 random.h

Probably need to go through an Integer type, and then convert to real after the random draw
to make clean.
2016-04-30 03:42:24 -07:00
paboyle
5341977948 IMCI fixes. Thought I had committed these. The "real" disambiguation
between std::real and Grid::real shouldn't have been necessary and I don't
know why only the icpc v16.0 on babbage hits it.
May need a longer term rename of Grid::real or some careful EnableIf work.
2016-04-30 03:34:16 -07:00
dc5f32e5f0 Merge branch 'master' into hadrons 2016-04-30 00:18:31 -07:00
f6c53e5039 Merge commit '1e554350acae0e67fa7177ed0db9d4f684a54af2' 2016-04-30 00:17:52 -07:00
405b175665 Merge branch 'master' into hadrons 2016-04-30 00:16:06 -07:00
ba09cbae3e function to read std::vector from a string (blank separated values) 2016-04-30 00:15:44 -07:00
6aa000176f Fermion <-> Propagator functions 2016-04-30 00:14:33 -07:00
23b6172c31 Bernoulli RNG 2016-04-30 00:14:13 -07:00
3f128443ab OS X icpc fix 2016-04-30 00:13:33 -07:00
paboyle
1e554350ac The threaded comms didn't agree with GCC. Surprised, and looks like GCC bug. 2016-04-29 16:49:18 -07:00
paboyle
c79ea0dcef Fixing IMCI 2016-04-22 21:52:54 -07:00
paboyle
e3f141f82f Fixed SSE compile with typecasts 2016-04-22 10:30:30 -07:00
paboyle
a6dfa2386b GCC choked on intrinsics calls that ICPC did not 2016-04-22 06:33:41 -07:00
Peter Boyle
d9b5e66877 Update Make.inc 2016-04-20 18:25:48 +01:00
paboyle
8fd8bc25e9 simd 5th dim with rotation 2016-04-19 15:39:00 -07:00
paboyle
ba427abde9 simd 5d 2016-04-19 15:38:39 -07:00
paboyle
9b6ab6db16 simd in 5th dimension support 2016-04-19 15:38:01 -07:00
paboyle
806a83d38b simd in fifth dim support for dwf 2016-04-19 15:36:19 -07:00
paboyle
7223753355 Rotate in a direction > 2 for simd_layout 2016-04-19 15:35:15 -07:00
paboyle
b27bac4669 Updates for simd in one dir 2016-04-19 15:34:10 -07:00
paboyle
c8a93d6a93 Cartesian changes to allow all simd in one direction 2016-04-19 15:18:12 -07:00
paboyle
04072a5e1f Rotate is a temporary hack. Would like to merge ALL
permutes as rotates of length 2, and make any rotate active
over any subset of lane bits. This is hard, and requires general
permute; current intrinsics mean this is only really possible for specific
case by case encodings as presently performed. Intel could produce a general
permute.. would help. IBM did it in VMX.
2016-04-19 15:15:34 -07:00
paboyle
574ea4f843 const safety 2016-04-19 15:15:11 -07:00
paboyle
587f80cd93 Updated to compile and pass under intel SDE 2016-04-19 15:13:54 -07:00
paboyle
528eb773ad Merged.
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-19 22:24:34 +01:00
paboyle
e5657510b0 Rotate support for Ls simd-ized 2016-04-19 22:24:18 +01:00
paboyle
f473919526 Rotate support 2016-04-19 22:23:51 +01:00
e33b0f6ff7 cleaner output 2016-04-16 08:41:53 +01:00
9ee54e0db7 debug output removed 2016-04-16 08:41:28 +01:00
Christopher Kelly
ab56ccdd25 -Complete and working implementation of Grid_empty 2016-04-15 13:17:42 -04:00
neo
339be37dba Debugging smeared HMC 2016-04-13 17:00:14 +09:00
neo
a87b744621 HMC runs but does not accept with smearing on 2016-04-07 16:45:11 +09:00
Christopher Kelly
a646260e82 Merge remote-tracking branch 'origin/master' into ckelly-dec12-2015 2016-04-06 13:57:28 -04:00
Christopher Kelly
af9c8d1372 -Checkerboard fixes for Lanczos 2016-04-06 13:50:56 -04:00
paboyle
b1192a8908 Benchmark_zmm added 2016-04-06 03:00:07 -07:00
paboyle
e8dddb1596 Adding extra benchmark 2016-04-06 10:32:54 +01:00
97d0d56bcb Debugging Smearing routines (set_fj) 2016-04-06 17:58:43 +09:00
paboyle
c7ba47bdc7 Merge branch 'master' of https://github.com/paboyle/Grid 2016-04-06 02:56:28 +01:00
7c7ea35ffb Putting the Traceless Antihermitian part outside the deriv in pseudofermion actions 2016-04-05 16:28:09 +09:00
4b1cf580e0 Debugging the Smearing routines 2016-04-05 16:19:30 +09:00
paboyle
e67fc2be18 Adding a trial for openmp overhead minimisation 2016-03-31 16:00:37 +01:00
paboyle
f473ef7591 Fixing the compile 2016-03-31 07:47:42 -07:00
paboyle
8052556275 Cleaning up the single/double kernel implementation switch 2016-03-31 14:51:32 +01:00
paboyle
60d965f79e AVX512 improvements; sigfpe trapping too 2016-03-30 08:42:34 +01:00
paboyle
83b15bfcdd Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign 2016-03-30 08:39:39 +01:00
paboyle
1ecbf9794d Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-30 08:37:55 +01:00
paboyle
2ded354403 configure 2016-03-30 00:17:43 -07:00
paboyle
340428a1fe Eigen fixes and HDCR work 2016-03-30 00:16:02 -07:00
paboyle
c77b7ee897 AddSub based alternate SU3 routine 2016-03-28 17:55:22 -06:00
paboyle
b6c3bc574b Moving to a more coherent organisation of the inline assembly and arch dependencies. 2016-03-28 16:24:37 +01:00
paboyle
1e355a51e1 Interface change 2016-03-27 23:46:55 -07:00
paboyle
ad80f61fba AVX512 shaken out 2016-03-28 00:38:05 -06:00
paboyle
21abaf7e91 Gamma sign change 2016-03-28 00:35:45 -06:00
paboyle
165bffc2e7 Avx512 changes for assembler kernels 2016-03-26 22:25:45 -06:00
paboyle
644fd6d32e Build avx512 clean 2016-03-25 09:35:33 -07:00
azusa
f54e0ec9bd Try lanczos to set up hdcr subspace 2016-03-17 10:36:16 +00:00
paboyle
60d4564151 ICC no compile fix 2016-03-16 02:30:40 -07:00
paboyle
d4e57f4bc6 IO Bandwidth reporting 2016-03-16 02:30:16 -07:00
paboyle
3920b2c0ab HDCR updates 2016-03-16 02:29:58 -07:00
paboyle
2733c4b93c hdcr updates 2016-03-16 02:29:37 -07:00
paboyle
36a800f26c Microsecond granularity support 2016-03-16 02:28:51 -07:00
paboyle
b75da563d9 Resurrect timestamp. Should make optional 2016-03-16 02:28:17 -07:00
paboyle
f9faec38be Printing fix under comms none 2016-03-16 02:27:53 -07:00
paboyle
d6b64f47d9 Uint64 sum for IO rates 2016-03-16 02:27:22 -07:00
paboyle
a359f7a9f5 Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-11 16:07:07 -08:00
paboyle
b606deb3f0 Uint64 gsum 2016-03-11 16:06:54 -08:00
paboyle
090e7aa930 Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
paboyle
2dce9c3cff HDCR running on 16^3 with 2x-3x speed up. 2016-03-08 01:01:50 -08:00
paboyle
dc72293398 More timing info 2016-03-06 10:46:55 -08:00
paboyle
e55c35734b Fix a nocompile 2016-03-03 20:33:28 +00:00
paboyle
325e745daa Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-02 07:04:03 -08:00
paboyle
61413565d0 Back off the inlined spin proj as not working 2016-03-02 07:03:09 -08:00
paboyle
ff129d9ad9 Redundant operations removed 2016-03-02 07:02:37 -08:00
paboyle
03fcd3b33a Back out of the colour 2016-03-02 07:01:15 -08:00
paboyle
68b02da483 Backing off the colour 2016-03-02 07:00:43 -08:00
paboyle
e051119769 extern "C" should have been in the header file, but Cray is apparently not C++ friendly. 2016-03-02 07:00:00 -08:00
2d8bb356e3 Smearing routines compile (still untested) 2016-02-25 02:43:59 +09:00
a7251f28c7 Stout smearing compiles (untested) 2016-02-24 03:16:50 +09:00
1eb169ac0b compatibility fix 2016-02-23 16:36:50 +00:00
5674c3e241 cycle count fix for x86 2016-02-23 16:08:18 +00:00
Antonin Portelli
497e7e4c53 BG/Q compatibility fix 2016-02-23 15:57:38 +00:00
19526d09c2 Merge commit '6aeaf6f568a391e34b913f08be6a11beb28d8842' 2016-02-22 15:23:26 +00:00
Peter Boyle
6aeaf6f568 Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then
turned up problems on the BlueWaters Cray.

Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel,
and also to look at aggregating bigger writes for the parallel write.
Not sure what the home filesystem is.
2016-02-21 08:03:21 -06:00
Peter Boyle
40f2db9bc0 Disable metropolis step until 10 traj covered. Should move to exposing these
in XML input and start having "applications" directory.
2016-02-21 08:01:44 -06:00
Peter Boyle
2cfa20cc4e Improving the logging, got fed up with color so optionally disable.
Backtrace macro used everywhere
2016-02-21 07:58:53 -06:00
Peter Boyle
a5f683d124 Machine generated 2016-02-21 07:57:42 -06:00
Jung
9f0d9ade68 Added configure flag for LAPACK. Tested ImplicitlyRestartedLanczos::calc()
Checking in before cleaning up
2016-02-20 02:50:32 -05:00
neo
c1b1b89d17 More on smearing routines, writing APEsmear (dev) 2016-02-19 17:15:27 +09:00
neo
771235017d Adding smearing routines (development) 2016-02-19 15:30:41 +09:00
paboyle
3425751cb8 Missing return value 2016-02-19 01:06:03 +00:00
paboyle
db5e8050a8 Attempts at some optimisation 2016-02-18 22:33:58 +00:00
paboyle
a3fbabf404 Bug fix 2016-02-18 18:08:24 +00:00
Peter Boyle
22422a84d9 Small problem in compressor fix 2016-02-17 19:03:09 -06:00
Peter Boyle
c9fadf97a5 Simplify the compressor interface again. 2016-02-17 18:16:45 -06:00
Peter Boyle
c650bb3f3d Very small merge speed up. 2016-02-16 18:41:53 -06:00
Peter Boyle
81395e85d1 Regressing to not overlap comms and compute because bluewaters, edison, and cori are so rubbish at it. 2016-02-16 13:56:44 -06:00
Peter Boyle
340a29b735 More careful sequencing of comms 2016-02-15 16:04:59 -06:00
Peter Boyle
a0fc47c6f9 Cheaper implementation 2016-02-15 16:02:36 -06:00
Peter Boyle
42a9ac71d2 Bug fix, wait till complete. 2016-02-14 16:21:21 -06:00
Peter Boyle
41c2b09184 Shmem comms [NO MPI] target added. The dwf test runs and passes.
Not really shaken out to my satisfaction though as I want more tests done, so don't declare as working.
But committing my current work while I try a few experiments.
2016-02-14 14:24:38 -06:00
paboyle
294dbf1bf0 Compile on OpenMPI shmem 2016-02-11 23:45:51 +00:00
Peter Boyle
9548c8b91f Had to break this out for universal access through the code base. 2016-02-11 07:40:09 -06:00
Peter Boyle
7f927a541c Shmem related fixes for shmem compile 2016-02-11 07:37:39 -06:00
paboyle
e2f73e3ead Updates for shmem 2016-02-10 16:50:32 -08:00
neo
6371676a75 Correcting some compilation errors for clang-sse 2016-02-10 11:37:03 +09:00
Jung
bd84c23298 definitions reconciled. 2016-01-25 16:30:59 -05:00
Jung
7aa8d5e8af Failing to compile, comparing with master 2016-01-25 16:03:02 -05:00
Jung
6012b0ec23 Checking in changes before changing to chulwoo-dec12-2015 2016-01-25 09:40:58 -05:00
Jung
411ac49dd7 GparityWilsonTM typedef added. Not yet tested
Conflicts:
	configure
	lib/qcd/action/fermion/WilsonKernels.h
2016-01-25 01:36:28 -05:00
Jung
b8fb05a422 Additional routines for Lanczos (SYM2, Chebyshev).. 2016-01-25 01:26:25 -05:00
Jung
5c57d4f403 Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
Conflicts:
	lib/qcd/action/fermion/WilsonKernels.h
2016-01-11 11:36:45 -05:00
paboyle
fc6ad65751 Pushed the overlap comms tweaks 2016-01-11 06:34:22 -08:00
paboyle
dafc74020c Overlap comms compute improvements in hand op kernels, and better timing from Edison and Cori 2016-01-10 16:54:27 -08:00
paboyle
d19321dfde Overlap comms compute changes 2016-01-10 19:20:16 +00:00
Jung
5924e5a562 Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
Conflicts:
	configure
	lib/qcd/action/Actions.h
	lib/qcd/action/fermion/WilsonKernels.h
2016-01-06 03:44:57 -05:00
paboyle
c99d748da6 Timing reports in benchmarks now reflect the asynch comms thread statistics 2016-01-04 14:42:16 +00:00
paboyle
02452afd36 Optional overlap of comms with compute 2016-01-04 14:18:40 +00:00
paboyle
331768dcff Added overlap comms compute mode 2016-01-03 01:38:11 +00:00
paboyle
4aac345bea Updated logging to colour code according to message type 2016-01-02 17:21:14 +00:00
paboyle
15c0022042 GPLv2 clarified, and copyright message and banner in Init function.
Color is just showing off....
2016-01-02 15:22:30 +00:00
paboyle
aae8bf31a7 Global edit adding copyright and license info to every source file. 2016-01-02 14:51:32 +00:00
paboyle
1e68b1c1bd Create a benign default for gparity twists 2016-01-02 14:06:53 +00:00
paboyle
5a80930dd2 Charge conjugation boundary conditions for gauge fields implemented as a policy
class, changing the nature of covariant Cshifts used in
plaquettes, rectangles and staples.

As a result same code is used for the plaq and rect action independent of the BC type.

Should probably isolate the BC in a separate class that Gimpl takes as a template param.
Do the same with smearing policies.

This would then allow composition of BC with smearing etc....
2016-01-02 13:37:25 +00:00
paboyle
145a295231 Bug fix for stencil with large shifts (3+), would be important to naik term for example but did not
impact Wilson based nearest neighbour stencils.
2015-12-30 19:29:48 +00:00
paboyle
841a37f941 Fix to WilsonCompressor that fixes a bug in comms phase due to the sign change on gamma
matrix in hopping term.
Add logging of time spent in CG.
2015-12-29 23:49:41 +00:00
Azusa Yamaguchi
e6cad3821c Logging improvement 2015-12-29 19:51:18 +00:00
Azusa Yamaguchi
98de1cbb6a Optimised version of rectangle term staples.
~3.4x faster than the naive.
2015-12-29 19:22:59 +00:00
Azusa Yamaguchi
f7d61b8b81 Plaq plus rectangle and Iwasaki, Symanzik DBW2.
http://arxiv.org/pdf/hep-lat/0610075.pdf plaq and rect regress plausibly over 100 trajectories
and under HMC with average plaq and rectangle coming out ok.
2015-12-28 16:39:26 +00:00
Azusa Yamaguchi
78c4e862ef Plaq, Rectangle, Iwasaki, Symanzik and DBW2 working and HMC regresses to http://arxiv.org/pdf/hep-lat/0610075.pdf 2015-12-28 16:38:31 +00:00
1e0be161e5 MacroMagic: inline functions to avoid double symbol issues 2015-12-23 14:20:05 +00:00
paboyle
0afcf1cf13 Moved all the HMC tests over to using a single HmcRunner class that manages checkpoint strategies and such like 2015-12-22 11:19:25 +00:00
paboyle
08edbb5cbe HMC bit repro across checkpoints. Fixed parallel RNG issue with threading.
Conclusion: c++11 distributions not thread safe and must use distinct dist as well as distinct engine
per site. Makes sense when you think of box muller. Also added a reset of dist on fill to ensure
repro across checkpoints.
2015-12-22 08:54:40 +00:00
paboyle
0abfbcc8eb Naming of files improvement. 2015-12-21 15:37:26 +00:00
paboyle
1b94253ba4 Logging improvement 2015-12-21 15:36:28 +00:00
paboyle
36e6f9ac7b Bug fix. Guess not initialised in refresh step; didn't hit before due to luck in not having a vector
created with NAN data.
2015-12-21 15:34:35 +00:00
paboyle
2f41691c11 Bug fix. Guess was not zeroed prior to CG call. Was earlier accidentally benign just due to luck. 2015-12-21 15:33:36 +00:00
paboyle
09bfe52840 Remove extraneous variable 2015-12-21 15:30:28 +00:00
paboyle
8c9010d0f4 Isnan check on guess and convergence assert on result 2015-12-21 15:29:46 +00:00
paboyle
42c583265c Remove timestamp 2015-12-21 15:28:03 +00:00
paboyle
539d698492 Prototypes for CML routines 2015-12-21 15:26:42 +00:00
paboyle
31ca609d12 HMC checkpointing .
Need a general HMC framework to work in restart.
2015-12-20 02:29:51 +00:00
paboyle
5710966324 Options to use mersenne twister OR ranlux48 via --enable-rng flag at configure time.
Can save and restore RNG state via new (serial) I/O routines in a NERSC header style file.
Store a Parallel (one per site) and a single serial RNG file.
2015-12-19 18:32:25 +00:00
paboyle
e108e708a3 Wilson TM tests and compiles in 2015-12-17 23:06:33 +00:00
paboyle
6f0198d4d9 Merge branch 'master' of https://github.com/paboyle/Grid 2015-12-17 22:34:54 +00:00
paboyle
67ccb043f1 Added TM fermions for DSDR etc.. 2015-12-17 22:34:28 +00:00
Azusa Yamaguchi
24a5a81c53 SSE compile fix 2015-12-16 09:09:37 +00:00
Jung
eb1759d7ea Added Gparity instantiation to no HANDOPT case
deleted configure (as intended?)
2015-12-16 00:04:09 -05:00
paboyle
34a0fde2ad Fixes to fermion force terms after sign of gamma_mu (0...3) change.
Thought I had already committed these.

Believe I have got the Gparity fermion force working.

* tests/Test_gpdwf_force.cc     -- correctly predicts dS for two flavour pseudofermion
                                   based on a small dt update of U field.

* tests/Test_hmc_EODWFRatio_Gparity.cc -- ran 1 trajectory on 8^4 with dH=0.21.

Need to accumulate a full plaquette log to believe fully which will take some hours of run time.
2015-12-15 23:14:12 +00:00
Jung
bc34b7e808 Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
Conflicts:
	lib/qcd/action/fermion/WilsonKernels.h
	tests/Make.inc
2015-12-15 11:11:59 -05:00
Jung
284453c5e9 Added gparity mobius defs, added params to ScaledShamir
checking in before pulling master
2015-12-14 12:15:06 -05:00
paboyle
af855cc129 Updating to fix peek poke to checkerboarded arrays since Chulwoo needs this. 2015-12-12 07:11:46 +00:00
paboyle
47fe6b5a7c Merge branch 'master' of https://github.com/aportelli/Grid into aportelli-master 2015-12-10 23:14:52 +00:00
paboyle
b3ef09a54d Merge branch 'master' of https://github.com/paboyle/Grid 2015-12-10 23:05:38 +00:00
paboyle
3ce10aa975 Fix a regression failure on Mobius; chroma regression added 2015-12-10 22:55:00 +00:00
Azusa Yamaguchi
a32a59fc43 Merge branch 'master' of https://github.com/paboyle/Grid 2015-12-09 12:48:44 +00:00
200de272ed IO: serialisable enums 2015-12-08 13:54:00 +00:00
d68a72e28b IO: code cleaning and string binary IO fix 2015-12-08 13:53:33 +00:00
17f9268a55 XmlIO: minor code cleaning 2015-12-07 18:30:00 +00:00
78f0c2595d autotool file accidentally committed 2015-12-07 18:28:06 +00:00
Jung
f2b4edc090 Fixes for Gparity comparison with CPS (Instantiation, Gamma matrix convention) 2015-12-07 02:04:57 -05:00
Jung
fb81acca3c Merge branch 'master' of https://github.com/paboyle/Grid 2015-12-03 12:11:10 -05:00
paboyle
93356fd246 No compile fixes on gcc/Cray 2015-11-29 03:14:44 -08:00
paboyle
ca42fe6d32 Merge branch 'master' of github.com:paboyle/Grid
Merge done
Conflicts:
	lib/serialisation/XmlIO.h
	tests/Test_stencil.cc
2015-11-28 17:03:43 -08:00
paboyle
6b97b271ae Integer divide useful 2015-11-28 17:01:20 -08:00
paboyle
fa01ae5980 integer divide 2015-11-28 17:00:34 -08:00
paboyle
113131b01c This failed for some reason. Suspect Antonin has made more progress. 2015-11-28 16:59:59 -08:00
paboyle
b2c02a6106 Runs fastest on cori 2015-11-28 16:58:16 -08:00
paboyle
02d730513a Divide function 2015-11-28 16:54:43 -08:00
paboyle
d875c2bd39 More verbose useful 2015-11-28 16:54:19 -08:00
paboyle
cc32ba615a Verbose changes 2015-11-28 16:53:54 -08:00
paboyle
6684739452 Better to drop KMP_AFFINITY override 2015-11-28 16:52:44 -08:00
Peter Boyle
bc4b252883 Merge branch 'master' of https://github.com/paboyle/Grid 2015-11-29 00:33:01 +00:00
Peter Boyle
11cf0f08f3 This file is not yet debugged. 2015-11-29 00:32:45 +00:00
Peter Boyle
8a33846095 No compile fix 2015-11-29 00:29:58 +00:00
Peter Boyle
54f04ee5c9 Perf event interface was Linux specific; use ifdef to protect 2015-11-29 00:24:48 +00:00
Peter Boyle
825875fd48 compile fixes 2015-11-29 00:24:25 +00:00
Peter Boyle
f8290bfd58 Compile fixes 2015-11-29 00:24:04 +00:00
Azusa Yamaguchi
967be91692 update merge 2015-11-26 09:51:41 +00:00
06f8ecea04 Merge commit '899ca41cb8c8f47771bfd37cd895cbc2184e5560' 2015-11-16 18:16:25 +00:00
af19118113 new I/O interface 2015-11-16 18:14:37 +00:00
paboyle
e9ff25b06b Small threading change makes a difference on Cori. 2015-11-07 00:07:05 -08:00
paboyle
05a7029600 Stencil change 2015-11-07 00:06:31 -08:00
paboyle
b04b8914fd EXECINFO change 2015-11-07 00:05:57 -08:00
paboyle
899ca41cb8 Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
	lib/qcd/action/fermion/WilsonFermion5D.cc
2015-11-06 03:50:04 -08:00
paboyle
d29b4c1dee Assembler files 2015-11-06 03:48:48 -08:00
paboyle
a2ff068e29 Asm and threading for many core 2015-11-06 03:47:14 -08:00
paboyle
b362f8d27b Threading for many core 2015-11-06 03:46:41 -08:00
paboyle
64770d9052 Threading changes for many core and asm calls 2015-11-06 03:46:21 -08:00
paboyle
17af18dcab Changes for AVX512 assembler 2015-11-06 03:45:51 -08:00
Peter Boyle
28022755ae Stencil class name global change to StencilImpl typedef 2015-11-06 05:30:17 -06:00
Peter Boyle
955b482aaf Partial optimisation of the extraction/merger of simd vecs. 2015-11-06 05:26:20 -06:00
Peter Boyle
f9b2fce93b Changing whole stencil class to be template and not just single functions 2015-11-06 05:25:10 -06:00
Peter Boyle
473fa28a6c Partial optimisation; comms in x-dir for red black dslash will be slow as the checker skipping block strided
loops are non threadable. Will need to write a kernel for these instead and drive them with a lookup table
to make a loop sufficiently simple to thread.
2015-11-06 05:23:23 -06:00
Peter Boyle
5d854c869c Stencil interface changes 2015-11-06 05:22:33 -06:00
Peter Boyle
880ff88362 Comms optimisation 2015-11-06 05:22:18 -06:00
Azusa Yamaguchi
4690acc3c8 Don't know why peter committed these as they didn't compile 2015-11-06 10:31:48 +00:00
Azusa Yamaguchi
3281745fde Exec info and linux check to stop non-portable code breaking 2015-11-06 10:31:24 +00:00
paboyle
1159de165c Asm option for AVX512 2015-11-05 22:04:51 -08:00
paboyle
16c7993434 Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
	lib/simd/Grid_avx512.h
	lib/simd/Grid_imci.h
2015-11-04 03:32:10 -08:00
paboyle
6be9716e6f New file 2015-11-04 03:26:28 -08:00
paboyle
4a41c885ed Use Linux kernel interface to hardware performance counters. Dead useful. 2015-11-04 03:24:19 -08:00
paboyle
757b31ed42 Threading for KNC mods. 2015-11-04 03:22:14 -08:00
paboyle
ac7d1f26ad Either blocking or lebesgue curve 2015-11-04 03:19:16 -08:00
paboyle
1a8bf938b3 Use either sub-blocking or lebesgue 2015-11-04 03:18:51 -08:00
paboyle
63a2993827 Exec info and cache blocking 2015-11-04 03:16:56 -08:00
paboyle
4e65ad21ac Adding a routine for AVX512 / IMCI with explicit assembly implementations 2015-11-04 03:15:08 -08:00
Peter Boyle
dfc1de6f60 Merge branch 'master' of github.com:paboyle/Grid 2015-11-04 05:14:26 -06:00
Peter Boyle
3b7576ad53 Switch off for now 2015-11-04 05:13:29 -06:00
paboyle
9b5d31ffc1 mac , mult routines
2015-11-04 03:10:34 -08:00
paboyle
a38762159c Inline assembly hooks for AVX 512. Better way in some ways than BAGEL to generate assembly.
Updated Grid_avx512.h
2015-11-04 03:09:06 -08:00
Peter Boyle
ffc5dab17f AMD FMA4 support added for Interlagos/BlueWaters 2015-11-04 04:29:58 -06:00
Peter Boyle
96608c70d1 chrono causing some problems on Cray systems. Suspend use for now 2015-11-04 04:28:31 -06:00
Peter Boyle
d35d63b171 Algorithm in 2015-11-04 04:27:44 -06:00
Peter Boyle
24044dbc56 Debugged a problem with checkerboarded cshift in the checker dimension which arose
only when mpi spread out in the checker dimension. Added a test that trapped and helped debug this
2015-11-04 10:00:55 +00:00
Peter Boyle
abb23df83f formatting only 2015-11-04 10:00:27 +00:00
Peter Boyle
12c5ec813c Useful debug messages (commented out) are included for preservation in case I need to revisit this 2015-11-04 09:59:27 +00:00
Peter Boyle
1271508ca2 Bug fix for spread out in x (EO) direction.
This is really annoying -- it is very hard to thread the loops with the index
recursion on buffer offset in the red-black case. Must think of a good threading
solution here.
2015-11-04 09:57:57 +00:00
Peter Boyle
ec5af35166 EO bug fix when spread out in x-direction 2015-11-04 09:56:58 +00:00
Peter Boyle
0f59356e86 Problem in comms fixed 2015-11-02 00:00:15 +00:00
8709117aea Log: generalised Logger class to allow separate logs in Grid-based applications 2015-10-27 17:31:13 +00:00
e6b9aa9076 Config.h removed from repository 2015-10-27 10:47:07 +00:00
Peter Boyle
8889af45ca FMA4 added 2015-10-09 01:00:53 +02:00
Peter Boyle
83afb2e26a Poly support for lanczos 2015-10-09 00:43:21 +02:00
Peter Boyle
6d06bd9493 Minor change in commented out code 2015-10-09 00:42:21 +02:00
Peter Boyle
6ee23f409e Lanczos addition 2015-10-09 00:41:00 +02:00
Peter Boyle
2d95dac6b6 Lanczos untested/partially tested additions. In middle of shake out but at least compiles 2015-10-09 00:40:25 +02:00
Peter Boyle
814c79f38d SIMD improvements for mac and madd use in complex for avx, sse 2015-10-09 00:38:52 +02:00
paboyle
1878bf97d0 Babbage fix 2015-09-30 16:04:01 -07:00
paboyle
a660ce716b No compile babbage fix 2015-09-30 16:02:44 -07:00
paboyle
f4b6d1dfea NGO stores reenabled 2015-09-30 16:02:14 -07:00
paboyle
23813ac798 No compile on babbage fix 2015-09-30 16:01:28 -07:00
Peter Boyle
9f4f65cb46 Added a decoupled memory system benchmark to remove thread synch overhead 2015-09-26 18:23:57 -07:00
Peter Boyle
64d64d1ab6 Updating to modify non-inlining permute routines and hopefully get better reg use and
enhance performance.
2015-09-25 08:55:04 -07:00
Peter Boyle
5ef42add2d Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly
and drop swizzles in AVX512. Don't know why these compiled.
2015-09-23 05:23:45 -07:00
Peter Boyle
2f38ebc446 Reintroducing the hand unrolled loops 2015-09-08 17:45:30 +01:00
Peter Boyle
638d6675ee Tested rms dH is ~ dt^4 numerically, so believe the ForceGradient is correct now.
Paranoia makes me want to diddle with the FG step to ensure dt^2 reappears.
2015-08-31 16:33:20 +01:00
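The kind of scaling check described in this commit can be illustrated with a toy model. The sketch below is not Grid code and does not implement the ForceGradient integrator: it swaps in a plain leapfrog on a one-dimensional harmonic oscillator, for which the energy error at fixed trajectory length is expected to scale as dt^2 rather than dt^4. The function names and the choice of step sizes are assumptions made for illustration only.

```python
import math


def leapfrog_dH(dt, trajectory_length=1.0):
    """Energy error |H(end) - H(start)| for leapfrog on H = (p^2 + q^2) / 2."""
    n_steps = round(trajectory_length / dt)
    q, p = 1.0, 0.0
    h_start = 0.5 * (p * p + q * q)
    for _ in range(n_steps):
        p -= 0.5 * dt * q   # half kick, force = -q
        q += dt * p         # full drift
        p -= 0.5 * dt * q   # half kick
    return abs(0.5 * (p * p + q * q) - h_start)


def scaling_exponent(dt_coarse, dt_fine):
    """Estimate k in dH ~ dt^k from two step sizes (slope on a log-log plot)."""
    ratio = leapfrog_dH(dt_coarse) / leapfrog_dH(dt_fine)
    return math.log(ratio) / math.log(dt_coarse / dt_fine)


exponent = scaling_exponent(0.1, 0.05)
print(f"estimated exponent: {exponent:.2f}")
```

For a fourth-order scheme the same two-point estimate would be expected to come out near 4, which is the signature the commit message reports for the ForceGradient integrator.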
Peter Boyle
357c6ab46d Reunitarise. Complete the HMC and integrator changes. 2015-08-31 16:32:04 +01:00
Peter Boyle
755dca9533 Added ForceGradient integrator. dH dropped so seems to work. Will only
believe it is right once I have pulled a dt^4 error scaling plot out.
2015-08-31 06:23:02 +01:00
Peter Boyle
29fd004d54 Unified integrator and integrator algorithm into virtual class used as a policy for the
HMC.
2015-08-30 13:39:19 +01:00
Peter Boyle
aa52fdadcc Global edit on HMC sector -- making GaugeField a template parameter and
preparing to pass integrator, smearing, bc's as policy classes to hmc.

Propose to unify "integrator" and integrator algorithm in a base/derived
way to override step. Want to read through ForceGradient to ensure
that abstraction covers the force gradient case.
2015-08-30 12:18:34 +01:00
Peter Boyle
76d752585b Started a tidy up in the HMC sector. Now comfortable with the two level integrators;
took a little to figure out what Guido had done & why -- but there is a neat saving of force
evaluations across the nesting time boundary making use of linearity of the leapP in dt.

I cleaned up the printing, reduced the volume of code, in the process sharing printing
between all integrators. Placed an assert that the total integration time for all integrators
must match at end of trajectory.

Have now verified e-dH = 1 for nested integrators in Wilson/Wilson runs with both
Omelyan and with Leapfrog so substantial confidence gained.
2015-08-29 17:18:43 +01:00
Peter Boyle
dc814f30da Binary IO file for generic Grid array parallel I/O.
Number of IO MPI tasks can be varied by selecting which
dimensions use parallel IO and which dimensions use Serial send to boss
I/O.

Thus can neck down from, say 1024 nodes = 4x4x8x8 to {1,8,32,64,128,256,1024} nodes
doing the I/O.

Interpolates nicely between ALL nodes write their data, a single boss per time-plane
in processor space [old UKQCD fortran code did this], and a single node doing all I/O.

Not sure I have the transfer sizes big enough and am not overly convinced fstream
is guaranteed to not give buffer inconsistencies unless I set streambuf size to zero.

Practically it has worked on 8 tasks, 2x1x2x2 writing/cloning NERSC configurations
on my MacOS + OpenMPI and Clang environment.

It is VERY easy to switch to pwrite at a later date, and also easy to send x-strips around from
each node in order to gather bigger chunks at the syscall level.

That would push us up to the circa 8x 18*4*8 == 4KB size write chunk, and by taking, say, x/y non
parallel we get to 16MB contiguous chunks written in multi 4KB transactions
per IOnode in 64^3 lattices for configuration I/O.

I suspect this is fine for system performance.
2015-08-26 13:40:29 +01:00
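The arithmetic in this message can be sketched as follows. This is a hypothetical illustration, not the BinaryIO implementation: it assumes any subset of the four processor-grid dimensions may be chosen as parallel, so that the number of I/O tasks is the product of the extents of the chosen dimensions. The function name `io_task_counts` is invented here, and the chunk-size factors are copied verbatim from the message without further interpretation.

```python
from itertools import combinations
from math import prod


def io_task_counts(processor_grid):
    """Possible numbers of I/O tasks when any subset of dimensions is parallel."""
    counts = set()
    dims = range(len(processor_grid))
    for r in range(len(processor_grid) + 1):
        for parallel_dims in combinations(dims, r):
            # Dimensions not chosen are funnelled serially to a boss node.
            counts.add(prod(processor_grid[d] for d in parallel_dims))
    return sorted(counts)


counts = io_task_counts((4, 4, 8, 8))
print(counts)

# Write-chunk size quoted in the message as "circa 8x 18*4*8 == 4KB".
strip_bytes = 8 * 18 * 4 * 8
print(strip_bytes, "bytes")
```

Under this assumption the 4x4x8x8 grid admits every value quoted in the message, from a single I/O node up to all 1024 nodes writing their own data.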
Peter Boyle
612957f057 pull in original license. 2015-08-21 10:19:08 +01:00
Peter Boyle
cea8ac9a22 Credits to orig source where I found the macro tricks. 2015-08-21 10:14:53 +01:00
Peter Boyle
476da3ee62 Separated IO reader/writers into a proper abstract base,
derived relationship. Have Text/Binary/Xml versions of
Reader & Writer.

Any new Reader/Writer class inheriting the interface can give object serialisation
to any desired format now.

      new file:   lib/serialisation/BaseIO.h
      modified:   lib/serialisation/BinaryIO.h
      modified:   lib/serialisation/Serialisation.h
      modified:   lib/serialisation/TextIO.h
      modified:   lib/serialisation/XmlIO.h

The test uses the Xml, Binary and Text formats as well as cout << Object.
2015-08-21 10:06:33 +01:00
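The abstract base / derived relationship described in this commit can be sketched in a few lines. The sketch is written in Python rather than C++, and every class and method name in it (`Writer`, `TextWriter`, `XmlWriter`, `Params`, `serialise`) is hypothetical; it does not reproduce the interface in lib/serialisation/BaseIO.h. It only shows the idea that an object written against the base interface can be emitted in any format for which a derived writer exists.

```python
from abc import ABC, abstractmethod


class Writer(ABC):
    """Abstract interface: every output format implements write()."""

    @abstractmethod
    def write(self, name, value):
        ...


class TextWriter(Writer):
    def __init__(self):
        self.lines = []

    def write(self, name, value):
        self.lines.append(f"{name} = {value}")

    def result(self):
        return "\n".join(self.lines)


class XmlWriter(Writer):
    def __init__(self):
        self.parts = []

    def write(self, name, value):
        self.parts.append(f"<{name}>{value}</{name}>")

    def result(self):
        return "".join(self.parts)


class Params:
    """Example object that only talks to the abstract Writer interface."""

    def __init__(self, mass, steps):
        self.mass = mass
        self.steps = steps

    def serialise(self, writer):
        writer.write("mass", self.mass)
        writer.write("steps", self.steps)


for writer in (TextWriter(), XmlWriter()):
    Params(0.1, 20).serialise(writer)
    print(writer.result())
```

Adding a further format then means adding one more derived writer, with no change to the objects being serialised.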
Peter Boyle
35818fdf6c Text and Binary readers 2015-08-20 23:04:38 +01:00
Peter Boyle
77d299b414 Cosmetic 2015-08-20 16:30:52 +01:00
Peter Boyle
ab81a25073 XMLReader implementation and a virtual Reader/Writer template framework.
Test_serialisation has an example of *code* *free* object serialisation
to both ostream and to XML using macro magic.

Implementing TextReader/TextWriter, YAML, JSON etc.. should be trivial
and we can use configure time options to select the default "Reader" typedef.

Presently done with

"using XMLPolicy::Reader"

to pick up the default serialisation strategy.
2015-08-20 16:21:26 +01:00
Peter Boyle
fdfe194c41 Threading bug in RNG fill fixed. 2015-08-19 14:41:05 +01:00
Peter Boyle
4e085dd0ed Domain wall even-odd 2f HMC with wilson gauge and PV 2f ratio now running and giving small dH.
Azusa is working hard on the rectangle term and we'll hopefully start reproducing plaquettes
from RBC-UKQCD parameters soon !

My new laptop is pretty warm and is starting to groan ;)
2015-08-19 10:26:07 +01:00
Peter Boyle
e8d63c9178 Merge branch 'master' of https://github.com/paboyle/Grid 2015-08-19 05:49:00 +01:00
Peter Boyle
c54c086f17 Even odd preconditioned one flavour ratio
(no support for non-const EE schur block)
2015-08-19 05:46:58 +01:00
Peter Boyle
dd6bb73ee0 Added one flavour rational ratios (unprec) 2015-08-19 04:58:40 +01:00
Peter Boyle
fc160eeccc Added one flavour rational ratios (unprec) 2015-08-19 04:58:40 +01:00
Peter Boyle
48db72259e EvenOdd schur decomposed mpcdagmpc version of rhmc determinant.
dH is also small and plaquette looks right.
2015-08-18 18:37:39 +01:00
Peter Boyle
570150f1d3 EvenOdd schur decomposed mpcdagmpc version of rhmc determinant.
dH is also small and plaquette looks right.
2015-08-18 18:37:39 +01:00
Peter Boyle
5c364f8082 One flavour rational unprec added; untested but does compile.
Moving param structs into a single header for later connection to file I/O using
macromagic.h
2015-08-18 14:40:08 +01:00
Peter Boyle
a842a6c94d One flavour rational unprec added; untested but does compile.
Moving param structs into a single header for later connection to file I/O using
macromagic.h
2015-08-18 14:40:08 +01:00
Peter Boyle
bdcbfe9310 Even Odd two flavour ratio added and dH == small 2015-08-18 10:37:08 +01:00
Peter Boyle
9306921ded Even Odd two flavour ratio added and dH == small 2015-08-18 10:37:08 +01:00
Peter Boyle
76f3855629 Merge branch 'master' of https://github.com/paboyle/Grid 2015-08-18 09:23:58 +01:00
Peter Boyle
8621e2409f Merge branch 'master' of https://github.com/paboyle/Grid 2015-08-18 09:23:58 +01:00
Peter Boyle
6212807a77 Small dh obtained in two flavour ratio so looks ok. 2015-08-18 09:21:29 +01:00
Peter Boyle
7622f0c441 Small dh obtained in two flavour ratio so looks ok. 2015-08-18 09:21:29 +01:00
Peter Boyle
0bc38a69ce Adding PV pseudofermion in prep for DWF HMC.
Not compiled this yet, but cloned in from BFM.
2015-08-18 09:19:42 +01:00
Peter Boyle
25d0eae50c Adding PV pseudofermion in prep for DWF HMC.
Not compiled this yet, but cloned in from BFM.
2015-08-18 09:19:42 +01:00
Peter Boyle
24382d77bb Adding PV pseudofermion in prep for DWF HMC.
Not compiled this yet, but cloned in from BFM.
2015-08-17 23:14:48 +01:00
Peter Boyle
ef6a9e6b07 Adding PV pseudofermion in prep for DWF HMC.
Not compiled this yet, but cloned in from BFM.
2015-08-17 23:14:48 +01:00
Peter Boyle
353d66def1 Unused apparently 2015-08-16 01:41:05 +01:00
Peter Boyle
b8166af92b Unused apparently 2015-08-16 01:41:05 +01:00
Peter Boyle
afeabe0d23 Tidying 2015-08-16 00:14:10 +01:00
Peter Boyle
6180487517 Tidying 2015-08-16 00:14:10 +01:00
Peter Boyle
53da927c3c Merge branch 'master' of https://github.com/paboyle/Grid 2015-08-15 23:59:04 +01:00
Peter Boyle
f0e32f12cf Merge branch 'master' of https://github.com/paboyle/Grid 2015-08-15 23:59:04 +01:00
Peter Boyle
155c164b0c * Finished the template/policy style introduction of gparity, except the gparity force terms.
So valence sector looks ok.

FermionOperatorImpl.h provides the policy classes.

Expect HMC will introduce a smearing policy and a fermion representation change policy template
param. Will also probably need multi-precision work.

* HMC is running even-odd and non-checkerboarded (checked 4^4 wilson fermion/wilson gauge).

There appears to be a bug in the multi-level integrator -- <e-dH> passes with single level but
not with multi-level.

In any case there looks to be quite a bit to clean up.

This is the "const det" style implementation that is not appropriate yet for clover since
it assumes that Mee is independent of the gauge fields. Easily fixed in future.
2015-08-15 23:25:49 +01:00
Peter Boyle
55cfc89459 * Finished the template/policy style introduction of gparity, except the gparity force terms.
So valence sector looks ok.

FermionOperatorImpl.h provides the policy classes.

Expect HMC will introduce a smearing policy and a fermion representation change policy template
param. Will also probably need multi-precision work.

* HMC is running even-odd and non-checkerboarded (checked 4^4 wilson fermion/wilson gauge).

There appears to be a bug in the multi-level integrator -- <e-dH> passes with single level but
not with multi-level.

In any case there looks to be quite a bit to clean up.

This is the "const det" style implementation that is not appropriate yet for clover since
it assumes that Mee is independent of the gauge fields. Easily fixed in future.
2015-08-15 23:25:49 +01:00
Peter Boyle
f40475f382 Reorganising the Fermion interface 2015-08-14 14:16:45 +01:00
Peter Boyle
ba8c09a58e Reorganising the Fermion interface 2015-08-14 14:16:45 +01:00
Peter Boyle
cc63078de5 Gparity works now even if simd distributed in a Gparity twist direction.
Tested by doubling lattice in t-direction.
2015-08-14 12:57:42 +01:00
Peter Boyle
59d66eb17a Gparity works now even if simd distributed in a Gparity twist direction.
Tested by doubling lattice in t-direction.
2015-08-14 12:57:42 +01:00
Peter Boyle
4dc7c36aa8 Gparity works now even if simd distributed in a Gparity twist direction.
Tested by doubling lattice in t-direction.
2015-08-14 12:57:42 +01:00
Peter Boyle
e6bed000c3 Gparity valence test now working.
Interface in FermionOperator will change a lot in future
2015-08-14 00:01:04 +01:00
Peter Boyle
028e2061e0 Gparity valence test now working.
Interface in FermionOperator will change a lot in future
2015-08-14 00:01:04 +01:00
Peter Boyle
7d3512ab21 Gparity valence test now working.
Interface in FermionOperator will change a lot in future
2015-08-14 00:01:04 +01:00
Peter Boyle
fc9b36c769 Gamma5 mult direct 2015-08-13 10:51:29 +01:00
Peter Boyle
2c216a42f9 Gamma5 mult direct 2015-08-13 10:51:29 +01:00
Peter Boyle
45b01858a8 Gamma5 mult direct 2015-08-13 10:51:29 +01:00
Peter Boyle
c39078162e Gparity improvements 2015-08-13 10:51:01 +01:00
Peter Boyle
145b807ba2 Gparity improvements 2015-08-13 10:51:01 +01:00
Peter Boyle
1c2d148bfa Gparity improvements 2015-08-13 10:51:01 +01:00
Peter Boyle
7e9203d8e0 Some bug fixes for more complicated types introduced with gparity 2015-08-13 10:50:34 +01:00
Peter Boyle
8d4c43327b Some bug fixes for more complicated types introduced with gparity 2015-08-13 10:50:34 +01:00
Peter Boyle
546513861f Some bug fixes for more complicated types introduced with gparity 2015-08-13 10:50:34 +01:00
Peter Boyle
6ab73c5512 Gparity test added; partial implementation -- this is Chris K's doubled lattice only
and have to regress this with the 2 flavour implementation.
2015-08-12 09:49:33 +01:00
Peter Boyle
8a0be42080 Gparity test added; partial implementation -- this is Chris K's doubled lattice only
and have to regress this with the 2 flavour implementation.
2015-08-12 09:49:33 +01:00
Peter Boyle
9183380946 Gparity test added; partial implementation -- this is Chris K's doubled lattice only
and have to regress this with the 2 flavour implementation.
2015-08-12 09:49:33 +01:00
Peter Boyle
c8dca58e6d File list update. 2015-08-11 06:37:42 +01:00
Peter Boyle
ded3945467 File list update. 2015-08-11 06:37:42 +01:00
Peter Boyle
04e0e9f5a0 File list update. 2015-08-11 06:37:42 +01:00
Peter Boyle
826fbb18c4 Preconditioned conjugate residual 2015-08-11 06:24:53 +01:00
Peter Boyle
9cd7f9ecad Preconditioned conjugate residual 2015-08-11 06:24:53 +01:00
Peter Boyle
69ce87fbe4 Preconditioned conjugate residual 2015-08-11 06:24:53 +01:00
Peter Boyle
07d672baeb Header 2015-08-11 06:23:38 +01:00
Peter Boyle
26f5ee0621 Header 2015-08-11 06:23:38 +01:00
Peter Boyle
f165b1a120 Header 2015-08-11 06:23:38 +01:00
Peter Boyle
3903dfe6a5 Gparity modifications in the Gparity compressor variant. 2015-08-11 06:22:20 +01:00
Peter Boyle
881acaa065 Gparity modifications in the Gparity compressor variant. 2015-08-11 06:22:20 +01:00
Peter Boyle
0a9ebac514 Gparity modifications in the Gparity compressor variant. 2015-08-11 06:22:20 +01:00
Peter Boyle
1b3c93e22a Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BC's and such like
allowing for example G-parity which is important for K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle
aeb7442d8f Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BC's and such like
allowing for example G-parity which is important for K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle
84a66476ab Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BC's and such like
allowing for example G-parity which is important for K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle
2be8df93ad Adding components for even odd decomposed determinant in HMC.
dH not yet conserved, so something wrong in the eo force code still
2015-08-07 08:37:15 +01:00
Peter Boyle
ce34856e32 Adding components for even odd decomposed determinant in HMC.
dH not yet conserved, so something wrong in the eo force code still
2015-08-07 08:37:15 +01:00
Peter Boyle
a01aa156b9 Adding components for even odd decomposed determinant in HMC.
dH not yet conserved, so something wrong in the eo force code still
2015-08-07 08:37:15 +01:00
Peter Boyle
5e9bef8a1b Merge branch 'master' of https://github.com/paboyle/Grid
Conflicts:
	lib/Make.inc
	lib/qcd/hmc/HMC.h
	tests/Make.inc
	tests/Test_hmc_WilsonFermionGauge.cc
2015-08-01 22:24:54 +09:00
Peter Boyle
a1d1dc96d6 Merge branch 'master' of https://github.com/paboyle/Grid
Conflicts:
	lib/Make.inc
	lib/qcd/hmc/HMC.h
	tests/Make.inc
	tests/Test_hmc_WilsonFermionGauge.cc
2015-08-01 22:24:54 +09:00
Peter Boyle
35feb93f56 Merge branch 'master' of https://github.com/paboyle/Grid
Conflicts:
	lib/Make.inc
	lib/qcd/hmc/HMC.h
	tests/Make.inc
	tests/Test_hmc_WilsonFermionGauge.cc
2015-08-01 22:24:54 +09:00
Peter Boyle
848104b1a9 Changes making force term test for DWF pass. 2015-08-01 22:06:07 +09:00
Peter Boyle
2994274267 Changes making force term test for DWF pass. 2015-08-01 22:06:07 +09:00
Peter Boyle
2157a6919a Changes making force term test for DWF pass. 2015-08-01 22:06:07 +09:00
Peter Boyle
8627e237c8 Jackson smoothed chebyshev and (untested) completion of force terms
for Cayley, Partial and Cont fraction dwf and overlap.
have even odd and unprec forces.
2015-08-01 05:58:35 +09:00
Peter Boyle
1d0be956ae Jackson smoothed chebyshev and (untested) completion of force terms
for Cayley, Partial and Cont fraction dwf and overlap.
have even odd and unprec forces.
2015-08-01 05:58:35 +09:00
Peter Boyle
1d67d29183 Jackson smoothed chebyshev and (untested) completion of force terms
for Cayley, Partial and Cont fraction dwf and overlap.
have even odd and unprec forces.
2015-08-01 05:58:35 +09:00
neo
bcdc67b152 Small change in the HMC interface.
Example of multiple levels in the WilsonFermion hmc test.

Merge remote-tracking branch 'upstream/master'

Conflicts:
	lib/qcd/hmc/HMC.h
	lib/qcd/hmc/integrators/Integrator.h
	lib/qcd/hmc/integrators/Integrator_algorithm.h
	tests/Test_simd.cc
2015-07-30 17:16:57 +09:00