1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-10-25 10:09:34 +01:00

Compare commits

..

1814 Commits

Author SHA1 Message Date
Peter Boyle
e307bb7528 Reorganise to abstract kernels that know about the lattice layout.
Move these back into grid.
2018-09-04 12:30:00 +01:00
Peter Boyle
5b8b630919 Finished the four quark optimisation for Bag parameters.
To do:

   Abstract the cache blocking from the contraction with lambda functions.
   Share code between PionFieldXX with and without momentum. Share with the Meson field code somehow.
   Assemble the WWVV in a standalone routine.
   Play similar lambda function trick for the four quark operator.
   Hack it first by doing the MesonField Routine in here too.
2018-08-28 14:11:03 +01:00
Peter Boyle
81287133f3 New files 2018-08-28 11:32:41 +01:00
Peter Boyle
bd27940f78 Reorder the loop 2018-08-28 11:32:23 +01:00
Peter Boyle
d45647698d Extra test code 2018-08-28 11:31:19 +01:00
Peter Boyle
d6ac6e75cc Some query functions 2018-08-28 11:30:51 +01:00
Peter Boyle
ba34d7b206 Pion Field test module 2018-08-28 11:30:14 +01:00
Peter Boyle
80003787c9 Use MKL DGEMM 2018-08-28 11:14:27 +01:00
Peter Boyle
f523dddef0 Remove verbose 2018-08-28 11:14:07 +01:00
Peter Boyle
3791a38f7c Optimised the MesonField a bit more 2018-08-01 08:27:27 +01:00
Peter Boyle
142f7b0c86 Updated the A2A Meson Field module 2018-07-31 15:58:02 +01:00
Peter Boyle
60c43151c5 Merge branch 'feature/hadrons-a2a' of https://github.com/paboyle/Grid into feature/hadrons-a2a 2018-07-31 01:09:02 +01:00
paboyle
e036800261 Eigen fix 2018-07-31 01:08:42 +01:00
Peter Boyle
62900def36 Merge branch 'feature/hadrons-a2a' of https://github.com/paboyle/Grid into feature/hadrons-a2a 2018-07-31 00:36:26 +01:00
paboyle
e3a309a73f Eigen happiness 2018-07-31 00:35:17 +01:00
Peter Boyle
00b92a91b5 Optimising 2018-07-28 23:46:22 +01:00
paboyle
65533741f7 7 moms 2018-07-28 16:17:47 +01:00
Peter Boyle
dc0259fbda Merge pull request #173 from fionnoh/feature/hadrons-a2a
Changes to meson field benchmark. Now includes the gammas in the fina…
2018-07-27 23:03:56 +01:00
Peter Boyle
131a6785d4 Merge branch 'feature/hadrons-a2a' into feature/hadrons-a2a 2018-07-27 23:03:42 +01:00
paboyle
44f4f5c8e2 Momentum loop 2018-07-27 23:00:16 +01:00
fionnoh
2679df034f Changes to meson field benchmark. Now includes the gammas in the final part of the naive method, both methods compute
lhs^dag*Gamma*rhs (previously Gamma*lhs^dag*rhs), and checks results.
2018-07-27 18:31:10 +01:00
paboyle
71e1006ba8 Updated meson field benchmark for dirac structures 2018-07-26 09:09:29 +01:00
cce339deaf Merge pull request #172 from fionnoh/feature/hadrons
feature/hadrons -> feature/hadrons-a2a
2018-07-25 17:20:19 +00:00
fionnoh
24128ff109 Changes needed for MF benchmark to work with comms correctly 2018-07-23 15:51:37 +01:00
fionnoh
34e9d3f0ca Moved the creation and resizing of the v and w high modes from the A2A class to the A2A module and made them an output of the module. This means that they have to be inputs of the contration modules and they will freed from memory if they are no longer needed. 2018-07-22 14:40:31 +01:00
fionnoh
c995788259 Added ImportUnphysicalFermion and included appropriate logic for 5d w vectors in A2A code 2018-07-21 00:08:11 +01:00
fionnoh
94c7198001 Added ZFIMPL to A2AMeson contraction 2018-07-20 23:08:22 +01:00
fionnoh
04d86fe9f3 Removed overly verbose print statement 2018-07-20 21:38:19 +01:00
fionnoh
b78074b6a0 Removed a Dminus from high mode v and removed duplication pf D_oo code 2018-07-20 16:55:24 +01:00
fionnoh
7dfd3cdae8 Inclusion of ExportPhysicalFermionSource that fixes a bug in the low mode w vectors 2018-07-20 15:45:43 +01:00
fionnoh
cecee1ef2c Merge branch 'develop' of github.com:paboyle/Grid into feature/hadrons 2018-07-20 13:37:50 +01:00
fionnoh
355d4b58be Merge branch 'feature/hadrons' of github.com:fionnoh/Grid into feature/hadrons 2018-07-19 16:07:54 +01:00
fionnoh
2c54a536f3 Moved the meson field inner product to its own header file 2018-07-19 15:56:52 +01:00
fionnoh
d868a45120 Cleaned up some stuff that was erroneously included in a previous "trash" commit. Leaving in the mySliceInnerProdct function for now as it speeds up mesonfield creation quite a lot for 24^3 tests 2018-07-16 16:19:59 +01:00
fionnoh
9deae8c962 A2A meson field contraction code 2018-07-16 14:18:45 +01:00
fionnoh
db86cdd7bd Possible trash commit 2018-07-10 13:30:45 +01:00
paboyle
ec9939c1ba Test for faster implementation of meson field inner loop
This should be possible to cache block at outer levels, global sum across nodes not performed
and deferred to caller to block them all into a big all reduce.
Nc=3 and Fermion is hard coded in an ugly way. We might think about benchmarking whether
a product without the conjugate should be made available by Grid.

It is not clear whether the explicit unroll, or the performing of conjugate on left once
was the real source of the speed up.

Gives 70-80 GF/s on my laptop (single) half that double, and 70GB/s to cache.

This is competitive with dslash and a reasonable stopping point for the optimisation. If necessary we can revisit.
2018-07-10 12:38:51 +01:00
fionnoh
f74617c124 Added ZFIMPL to meson field module 2018-07-03 14:04:53 +01:00
fionnoh
8c6a3921ed Merge remote-tracking branch 'upstream/feature/hadrons' into feature/hadrons 2018-07-03 11:35:14 +01:00
a8a15dd9d0 Hadrons: code cleaning 2018-07-02 17:52:39 +01:00
3ce68a751a Hadrons: stout smearing module 2018-07-02 17:52:04 +01:00
fionnoh
daa0977d01 Included a print statement that indicates that the guess is being subtracted from the solve. 2018-06-28 16:34:56 +01:00
fionnoh
a2929f4384 Removed A2A contraction module and replaced it with the beginnings of a meson field module 2018-06-28 16:17:26 +01:00
fionnoh
7fe3974c0a Included eigenPacks and action as references, not inputs, of A2A module. They now now longer need to be parameters in the meson field modules. 2018-06-28 16:14:49 +01:00
fionnoh
f7e86f81a0 Changes A2A class to make use of the new Solver class 2018-06-28 16:14:16 +01:00
fionnoh
fecec803d9 Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/hadrons 2018-06-28 16:13:43 +01:00
fionnoh
8fe9a13cdd Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/hadrons 2018-06-28 16:13:07 +01:00
d2c42e6f42 Hadrons: scaled DWF action 2018-06-26 14:59:33 +01:00
049cc518f4 Hadrons: introduction message 2 2018-06-25 19:08:39 +01:00
2e1c66897f Hadrons: introduction message 2018-06-25 19:08:22 +01:00
adcef36189 Hadrons: Möbius DWF action 2018-06-25 15:58:35 +01:00
fionnoh
2f121c41c9 Commiting reation of meson field code before a merge with the upstream branch feature/hadrons 2018-06-25 12:20:46 +01:00
e0ed7e300f Hadrons: spurious Dminus removed 2018-06-22 16:33:43 +02:00
485207901b Merge branch 'develop' into feature/hadrons 2018-06-22 16:15:32 +02:00
c760f0a4c3 Hadrons: remove make_5D/4D functions and FreeProp fix 2018-06-22 16:12:46 +02:00
c84eeedec3 Hadrons: GaugeProp module for z-Wilson actions 2018-06-22 15:53:22 +02:00
fionnoh
1ac3526f33 Small changes to the A2A header and module 2018-06-22 12:29:42 +01:00
fionnoh
0de090ee74 Temporarily added in the contraction code that produced the working 2-pt function. This is commited for reference only and will be removed in the next push. 2018-06-22 12:28:41 +01:00
91405de3f7 Hadrons: new solver exposing fermion matrix and generic source/solve import/export 2018-06-22 12:14:37 +02:00
fionnoh
8fccda301a Fixed a bug where the guess was always subtracted after the solve and included appropriate weights for the sources in the one case we're looking at now. More work needs to be done to make the 5d/4d source logic less brittle. 2018-06-21 16:36:59 +01:00
fionnoh
7a0abfac89 Restructured the class that computes and returns the A2A vectors. 2018-06-21 16:36:06 +01:00
fionnoh
ae37fda699 A more elegant way to subtract guesses from solve and a bool check before verifying residual 2018-06-20 16:07:40 +01:00
fionnoh
b5fc5e2030 All to all module update that hit a promising milestone. Commiting for a reference for future changes. 2018-06-20 10:59:07 +01:00
8db0ef9736 Merge pull request #168 from jch1g10/feature/qed-fvol
Feature/qed fvol
2018-06-08 20:09:06 +02:00
Guido Cossu
95d4b46446 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-06-08 11:30:29 +01:00
paboyle
5dfd216a34 Better thread safety 2018-06-04 21:08:44 +01:00
paboyle
c2e8d0aa88 Solve g++ problem on the lanczos test 2018-06-04 18:34:15 +01:00
James Harrison
0fe5aeffbb Merge branch 'feature/hadrons' into feature/qed-fvol 2018-06-04 16:59:43 +01:00
James Harrison
7fbc469046 Merge branch 'develop' into feature/hadrons 2018-06-04 16:58:30 +01:00
paboyle
bf96a4bdbf Merge branch 'master' into develop 2018-06-04 14:03:11 +01:00
paboyle
84685c9bc3 Overflow fix 2018-06-04 13:42:07 +01:00
fionnoh
a8d4156997 Added a Hadrons module that computes the all-to-all v and w vectors 2018-05-31 17:18:58 +01:00
fionnoh
c18074869b Changes to Hadrons SchurRB solver to allow for a subtract_guess boolean to be passed 2018-05-31 17:17:16 +01:00
fionnoh
f4c6d39238 CHanges made to SchurRB solvers to allow for the subtraction of a guess after solve 2018-05-31 17:16:20 +01:00
200d35b38a Merge branch 'develop' into feature/hadrons 2018-05-28 11:52:47 +02:00
eb52e84d09 Merge branch 'feature/hadrons' of github.com:paboyle/Grid into feature/hadrons 2018-05-28 11:50:27 +02:00
72abc34764 Merge pull request #166 from guelpers/feature/hadrons
Feature/hadrons
2018-05-28 11:49:46 +02:00
e3164d4c7b Hadrons: env function to get volume in double 2018-05-28 11:39:17 +02:00
James Harrison
f5db386c55 Change MODULE_REGISTER_NS -> MODULE_REGISTER in UnitEM, ScalarVP and VPCounterTerms 2018-05-22 16:16:21 +01:00
James Harrison
294ee70a7a Merge branch 'feature/hadrons' into feature/qed-fvol
# Conflicts:
#	extras/Hadrons/modules.inc
#	lib/qcd/action/gauge/Photon.h
2018-05-21 18:02:41 +01:00
Azusa Yamaguchi
013ea4e8d1 Merge branch 'feature/staggered-comms-compute' into develop 2018-05-21 13:11:56 +01:00
Azusa Yamaguchi
7fbbb31a50 Merge branch 'develop' into feature/staggered-comms-compute
Conflicts:
	lib/qcd/action/fermion/ImprovedStaggeredFermion.cc
2018-05-21 13:07:29 +01:00
Azusa Yamaguchi
0e127b1fc7 New file single prec test 2018-05-21 12:57:13 +01:00
Azusa Yamaguchi
68c028b0a6 Comment 2018-05-21 12:54:25 +01:00
255d4992e1 Hadrons: stochastic scalar SU(N) free field fix 2018-05-18 20:49:55 +01:00
a0d399e5ce Hadrons: yet other attempts at EMT NPR 2018-05-18 20:49:26 +01:00
fd3b2e945a Hadrons: don't right result with empty stem 2018-05-18 20:48:24 +01:00
b999984501 Merge branch 'develop' into feature/hadrons 2018-05-15 13:53:57 +01:00
Guido Cossu
7836cc2d74 No checksum output on log for scidac 2018-05-15 10:10:08 +01:00
a61e0df54b Travis fix for Lime 2018-05-14 19:56:12 +01:00
9d835afa35 Attempt at solving the FP exception in the QED code 2018-05-14 19:05:54 +01:00
5e3be47117 Hadrons: scalar SU(N) various fixes 2018-05-14 18:58:39 +01:00
48de706dd5 Merge branch 'develop' into feature/hadrons 2018-05-11 18:06:40 +01:00
f871fb0c6d check file is opened correctly in the Lime reader 2018-05-11 18:06:28 +01:00
93771f3099 Hadrons: scalar SU(N) stochastic free field 2018-05-10 22:29:48 +01:00
8cb205725b Merge branch 'develop' into feature/hadrons 2018-05-09 23:56:35 +01:00
9ad580d82f Hadrons: format fix 2018-05-07 21:38:15 +01:00
899f961d0d Hadrons: eigenvalue metadata saved with 16 significant digits 2018-05-07 21:37:03 +01:00
54d789204f more general implementation of the precision interface for serialisers 2018-05-07 21:17:46 +01:00
25828746f3 XML precision scientific with 16 digits by default 2018-05-07 21:04:31 +01:00
f362c00739 Hadrons: better handling of automatic directory creation 2018-05-07 19:43:40 +01:00
Guido Cossu
25d1cadd3b Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-05-07 18:55:09 +01:00
Guido Cossu
c24d53bbd1 Further debug of RNG I/O 2018-05-07 18:55:05 +01:00
2017e4e3b4 Hadrons: more verbose directory creation error 2018-05-07 18:12:22 +01:00
27a4d4c951 Hadrons: multi-file eigenpack in separate directory 2018-05-07 17:52:54 +01:00
2f92721249 Merge branch 'develop' into feature/hadrons 2018-05-07 17:26:47 +01:00
3c7a4106ed Trap for deadly empty comm thread option 2018-05-07 17:26:39 +01:00
3252059daf Hadrons: multi-file support for eigenpacks 2018-05-07 17:25:36 +01:00
paboyle
6eed167f0c Merge branch 'release/0.8.1' 2018-05-04 17:34:11 +01:00
paboyle
4ad0df6fde Bump volume for Gerardo 2018-05-04 17:33:23 +01:00
661381e881 Merge branch 'develop' into feature/hadrons 2018-05-04 14:52:17 +01:00
Peter Boyle
68a5079f33 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-05-04 14:13:54 +01:00
Peter Boyle
8634e19f1b Update 2018-05-04 14:13:35 +01:00
Azusa Yamaguchi
9ada378e38 Add timing 2018-05-04 10:58:01 +01:00
Vera Guelpers
9d9692d439 Fix double vs float in boundary phases 2018-05-03 16:40:16 +01:00
0659ae4014 Merge branch 'develop' into feature/hadrons 2018-05-03 16:20:22 +01:00
bfbf2f1fa0 no threaded stencil benchmark if OpenMP is not supported 2018-05-03 16:20:01 +01:00
dd6b796a01 Hadrons: scalar SU(N) volume factor fix 2018-05-03 16:19:17 +01:00
Vera Guelpers
52a856b4a8 FreeProp module for Hadrons 2018-05-03 12:33:20 +01:00
Vera Guelpers
04190ee7f3 5D free propagator for DWF and boundary conditions for free propagators 2018-05-03 12:31:36 +01:00
Azusa Yamaguchi
587bfcc0f4 Add Timing 2018-05-03 12:10:31 +01:00
Vera Guelpers
2700992ef5 Merge remote-tracking branch 'upstream/feature/hadrons' into feature/hadrons 2018-05-03 10:01:52 +01:00
Peter Boyle
8c658de179 Compressor speed up (a little); streaming stores 2018-05-02 17:52:16 +01:00
Guido Cossu
ba37d51ee9 Debugging the RNG IO 2018-05-02 15:32:06 +01:00
Azusa Yamaguchi
4f4181c54a Merge branch 'feature/staggered-comms-compute' of https://github.com/paboyle/Grid into feature/staggered-comms-compute 2018-05-02 14:59:13 +01:00
Guido Cossu
4d4ac2517b Adding Scalar field theory example for Scidac format 2018-05-02 14:36:32 +01:00
Guido Cossu
e568c24d1d Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-05-02 14:29:25 +01:00
Guido Cossu
b458326744 Checkpointer module update 2018-05-02 14:29:22 +01:00
Guido Cossu
6e7d5e2243 HMC: added Scidac checkpointer and support for metadata 2018-05-02 14:28:59 +01:00
Azusa Yamaguchi
b35169f1dd MultiShift for Staggered 2018-05-02 14:22:37 +01:00
Azusa Yamaguchi
441ad7498d add Iterative counter 2018-05-02 14:21:30 +01:00
Peter Boyle
6f6c5c549a Split off gparity 2018-05-02 14:11:23 +01:00
Peter Boyle
1584e17b54 Revert to fast versoin 2018-05-02 14:10:55 +01:00
Peter Boyle
12982a4455 Hypercube optimisation 2018-05-02 14:10:21 +01:00
Peter Boyle
172f412102 shmget reintroduce 2018-05-02 14:07:41 +01:00
Peter Boyle
a64497265d TIming 2018-05-02 14:07:28 +01:00
ca639c195f Merge branch 'develop' into feature/hadrons 2018-05-01 14:07:32 +01:00
edc28dcfbf Hadrons: scalar SU(N) 2-pt fix 2018-05-01 14:02:31 +01:00
Peter Boyle
c45f24a1b5 Improvements for tesseract 2018-04-30 21:50:00 +01:00
Dr Peter Boyle
aaf37ee4d7 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-04-27 11:45:13 +01:00
Dr Peter Boyle
1dddd17e3c Benchmark improvements from tesseract 2018-04-27 11:44:46 +01:00
paboyle
661f1d3e8e Merge branch 'release/0.8.0' into develop 2018-04-27 11:22:33 +01:00
paboyle
edcf9b9293 Merge branch 'release/0.8.0' 2018-04-27 11:13:19 +01:00
paboyle
fe6860b4dd Update with LIME library guard 2018-04-27 08:57:34 +01:00
paboyle
d6406b13e1 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-04-27 07:52:56 +01:00
paboyle
e369d7306d Rename 2018-04-27 07:51:44 +01:00
paboyle
9f8d63e104 Roll over version 2018-04-27 07:51:12 +01:00
paboyle
9b0240d101 Hot start test 2018-04-27 07:50:51 +01:00
paboyle
b27f0e5a53 Control over IO 2018-04-27 07:50:15 +01:00
paboyle
75e4483407 Stronger convergence test 2018-04-27 07:49:57 +01:00
Guido Cossu
0734e9ddd4 Debugging Scatter_plane_simple 2018-04-27 14:39:01 +09:00
paboyle
809b1cdd58 Bug fix for MPI running ; introduced last night 2018-04-27 05:19:10 +01:00
paboyle
1be8089604 Clean compile 2018-04-26 23:42:45 +01:00
paboyle
3e0eff6468 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-04-26 23:00:46 +01:00
paboyle
7ecc47ac89 Quenched test compile 2018-04-26 23:00:28 +01:00
paboyle
e9f1ac09de static 2018-04-26 23:00:08 +01:00
Peter Boyle
fa0d8feff4 Performance of CovariantCshift now non-embarrassing. 2018-04-26 17:56:27 +01:00
49b8501fd4 Merge branch 'develop' into feature/hadrons 2018-04-26 17:33:50 +01:00
d47484717e Hadrons: scalar SU(N) result handling improvement 2018-04-26 17:32:37 +01:00
Peter Boyle
05b44aef6b Merge branch 'develop' of https://github.com/paboyle/Grid into develop
Conflicts:
	benchmarks/Benchmark_su3.cc
2018-04-26 15:38:49 +01:00
Peter Boyle
03e9832efa Use macros for bare openmp 2018-04-26 14:50:02 +01:00
Peter Boyle
28a375d35d Force static 2018-04-26 14:49:42 +01:00
Peter Boyle
3b06381745 Guard bare openmp statemetn with ifdef 2018-04-26 14:48:57 +01:00
Peter Boyle
91a0a3f820 Improvement 2018-04-26 14:48:35 +01:00
Peter Boyle
8f44c799a6 Saving the benchmarking tests for Cshift 2018-04-26 14:48:03 +01:00
Azusa Yamaguchi
96272f3841 Merge staggered fix linear operator and reduction 2018-04-26 10:33:19 +01:00
Azusa Yamaguchi
5c936d88a0 Merge branch 'feature/staggered-comms-compute' of https://github.com/paboyle/Grid into feature/staggered-comms-compute 2018-04-26 10:18:37 +01:00
Azusa Yamaguchi
1c64ee926e Faster staggered operator with m^2 term trivial used 2018-04-26 10:17:49 +01:00
Azusa Yamaguchi
2cbb72a81c Provide info if EE term is trivial (m^2 factor)
Better timing in staggered 4d case
2018-04-26 10:10:07 +01:00
Azusa Yamaguchi
31d83ee046 Enable special treatment of constEE cases 2018-04-26 10:08:46 +01:00
Azusa Yamaguchi
a9e8758a01 Improvements to staggered tests timings 2018-04-26 10:08:05 +01:00
Azusa Yamaguchi
3e125c5b61 Faster linalg on CG optimised against staggered
Sum overhead is bigger for staggered
2018-04-26 10:07:19 +01:00
Azusa Yamaguchi
eac6ec4b5e Faster reductions, important on single node staggered 2018-04-26 10:03:57 +01:00
Azusa Yamaguchi
213f8db6a2 Microsecond resultion 2018-04-26 10:01:39 +01:00
Guido Cossu
6358f35b7e Debug of previous commit 2018-04-26 14:18:11 +09:00
Guido Cossu
43f5a0df50 More timers in the integrator 2018-04-26 12:01:56 +09:00
Guido Cossu
c897878776 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-04-26 11:31:57 +09:00
cc6eb51e3e Hadrons: macro refactoring for library portability 2018-04-25 16:49:14 +01:00
Vera Guelpers
507009089b Merge remote-tracking branch 'upstream/feature/hadrons' into feature/hadrons 2018-04-25 09:36:39 +01:00
paboyle
2baf193031 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-04-25 00:14:03 +01:00
paboyle
362ba0443a Cshift updates 2018-04-25 00:12:11 +01:00
paboyle
276a2353df Move constructor 2018-04-25 00:11:07 +01:00
b234784c8e Hadrons: scalar SU(N) takes operator pairs now 2018-04-24 19:52:12 +01:00
6ea2a8b7ca Hadrons: scheduler shows starting value 2018-04-24 19:51:47 +01:00
c1d0359aaa Hadrons: scalar SU(N) kinetic term saves trace 2018-04-24 19:51:22 +01:00
047ee4ad0b Hadrons: scalar SU(N) cleanup 2018-04-24 19:50:58 +01:00
a13106da0c Hadrons: scalar SU(N) gradient 2018-04-24 19:50:30 +01:00
75113e6523 Hadrons: Scalar SU(N) variable name update 2018-04-24 19:49:27 +01:00
325c73d051 Hadrons: module template update 2018-04-24 19:48:54 +01:00
b25a59e95e Hadrons: mitigation of GCC/Intel compiler bug not generating defaulted destructors 2018-04-24 17:20:25 +01:00
Guido Cossu
c5b9147b53 Correction of a minor bug in the su3 benchmark 2018-04-24 08:03:57 -07:00
Guido Cossu
64ac815fd9 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-04-24 17:27:38 +09:00
Guido Cossu
a1be533329 Corrected Flop count in Benchmark su3 and expanded the Wilson flow output 2018-04-24 01:19:53 -07:00
7c4533797f Hadrons: scalar SU(N) EMT improvement term optional 2018-04-23 22:46:39 +01:00
af84fd65bb Hadrons: missing dependency message improvement 2018-04-23 22:46:17 +01:00
6764362237 Hadrons: automatic directory creation fix 2018-04-23 18:45:39 +01:00
2fa2b0e0b1 Hadrons: Application header does not include all the modules 2018-04-23 17:57:17 +01:00
b61292f735 Hadrons: recursive mkdir function 2018-04-23 17:36:43 +01:00
ce7720e221 Hadrons: copyright update 2018-04-23 17:36:20 +01:00
853a5528dc Hadrons: template modules compilation optimisation 2018-04-23 17:35:01 +01:00
169f405c9c Hadrons: tests repaired 2018-04-23 12:48:34 +01:00
c6125b01ce Hadrons: Error and Warning channels always on 2018-04-23 12:48:17 +01:00
b0b5b34bff Hadrons: custom abort with module trace 2018-04-23 12:48:00 +01:00
1c9722357d Merge branch 'develop' into feature/hadrons
# Conflicts:
#	lib/qcd/action/fermion/FermionOperator.h
2018-04-20 17:15:21 +01:00
141da3ae71 function to get tensor dimensions 2018-04-20 17:13:34 +01:00
94edf9cf8b HDF5: direct access to group for custom operations 2018-04-20 17:13:21 +01:00
c11a3ca0a7 vectorise/unvectorise in reverse order 2018-04-20 17:13:04 +01:00
paboyle
870b1a85ae Think I have the physical prop interface to CF and PF overlap right, but need a strong check/regression.
Only support Hw overlap, not Ht for now. Ht needs a new Dminus implemented.
2018-04-18 14:17:49 +01:00
paboyle
b5510427f9 physical fermion interface, cshift benchmark in SU3. 2018-04-18 01:43:29 +01:00
Guido Cossu
26ed65c8f8 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-04-17 12:03:32 +01:00
paboyle
f7f043d8cf Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-04-17 10:57:18 +01:00
paboyle
ddcaa6ad29 Master does header on Nersc 2018-04-17 10:48:33 +01:00
334da7f452 Hadrons: can trace which module is throwing an error 2018-04-13 18:45:31 +02:00
4669ecd4ba Hadrons: build improvement 2018-04-13 18:21:18 +02:00
4573b34cac Hadrons: scalar SU(N) 2-pt functions with momentum 2018-04-13 18:21:00 +02:00
17f57e85d1 Merge branch 'develop' into feature/hadrons 2018-04-06 22:53:11 +01:00
c8d4d184ee XML push fragment fix 2018-04-06 22:53:01 +01:00
17f27b1ebd Hadrons: eigenpack writer fix 2018-04-06 22:52:11 +01:00
a16bbecb8a Hadrons: more feedback 2018-04-06 19:38:20 +01:00
7c9b0dd842 Hadrons: top level name for eigenpack metadata 2018-04-06 19:32:22 +01:00
6b7228b3e6 Hadrons: better metadata for eigenpack 2018-04-06 19:29:53 +01:00
f117552334 post-merge fix 2018-04-06 18:38:46 +01:00
a21a160029 Merge branch 'develop' into feature/hadrons
# Conflicts:
#	lib/serialisation/XmlIO.cc
2018-04-06 18:34:19 +01:00
1569a374a9 XML interface polish, XML fragments can be pushed into a writer 2018-04-06 18:32:14 +01:00
eddf023b8a pugixml 1.9 update 2018-04-06 16:17:22 +01:00
6b8ffbe735 Hadrons: genetic minimum value type fix 2018-04-06 15:41:31 +01:00
81050535a5 Hadrons: truncate eigenvalues when loading partial eigenpack 2018-04-06 13:48:58 +01:00
7dcf5c90e3 Hadrons: eigenpack must be referred by solver when used 2018-04-06 13:16:28 +01:00
9ce00f26f9 not special characters in std::vector operator<< 2018-04-04 17:44:56 +01:00
85c253ed4a Test_serialisation MPI fix 2018-04-04 17:19:34 +01:00
ccfc0a5a89 Hadrons: better string representation of module parameters 2018-04-04 17:19:22 +01:00
d3f857b1c9 Hadrons: proper metadata for eigenpacks 2018-04-04 16:36:37 +01:00
fb62035aa0 Hadrons: do not create RB coarse grids 2018-04-03 19:49:11 +01:00
0260bc7705 Hadrons: eigen pack writing only for boss node 2018-04-03 18:55:46 +01:00
68e6a58f12 Hadrons: several Lanczos fixes and improvements 2018-04-03 17:42:21 +01:00
640515e3d8 Merge branch 'develop' into feature/hadrons 2018-03-30 17:43:49 +01:00
paboyle
f089bf5629 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-03-30 16:17:26 +01:00
paboyle
276f113f28 IO uses master boss node for metadata. 2018-03-30 16:17:05 +01:00
97c579f637 Merge branch 'develop' into feature/hadrons 2018-03-30 16:04:44 +01:00
a13c109111 deterministic initialisation of field metadata 2018-03-30 16:03:01 +01:00
paboyle
ab6afd18ac Still compile if no LIME 2018-03-30 13:39:20 +01:00
paboyle
5bde64d48b Barrier required in parallel when we use ftell 2018-03-30 12:41:30 +01:00
paboyle
2f5add4d5f Creation of file 2018-03-30 12:30:58 +01:00
c5a885dcd6 I/O benchmark 2018-03-29 19:57:41 +01:00
a4d8512fb8 Revert "Lattice serialisation, just HDF5 for the moment"
This reverts commit 8a0cf0194f.
2018-03-27 17:55:42 +01:00
5ec903044d Serial IO code cleaning for std:: convention 2018-03-27 17:11:50 +01:00
8a0cf0194f Lattice serialisation, just HDF5 for the moment 2018-03-26 19:16:16 +01:00
1c680d4b7a Merge branch 'develop' into feature/hadrons 2018-03-26 13:52:44 +01:00
Guido Cossu
c9c073eee4 Changes in messages in test dwf mixedprec 2018-03-23 11:27:56 +00:00
Guido Cossu
f290b2e908 Fix to pass CI tests 2018-03-23 11:14:23 +00:00
Guido Cossu
5f8225461b Fencing mixedcg test propagator write. LIME is still optional in Grid 2018-03-23 10:37:58 +00:00
e9323460c7 Merge branch 'develop' into feature/hadrons 2018-03-22 10:48:37 +00:00
20e186a1e0 Merge pull request #158 from goracle/dev-pull
Make compilation faster by moving print of git hash.
2018-03-22 10:45:17 +00:00
Peter Boyle
6ef4af989b Merge pull request #159 from goracle/dev-precsafe
Add dimension check to precisionChange.
2018-03-22 10:41:53 +00:00
Dan H
ccde8b817f Add dimension check to precisionChange. 2018-03-21 20:58:04 -04:00
Dan H
68168bf72d Revert "Add dimension match check to precisionChange."
This reverts commit 8f601d9b39.
2018-03-21 20:51:38 -04:00
Dan H
e93d0feaa7 Merge branch 'dev-pull' of github.com:goracle/Grid into dev-pull 2018-03-21 20:39:30 -04:00
Dan H
8f601d9b39 Add dimension match check to precisionChange. 2018-03-21 20:38:19 -04:00
paboyle
5436308e4a Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-03-21 14:26:29 +00:00
paboyle
07fe7d0cbe Save file in current dir; print checksums 2018-03-21 14:26:04 +00:00
Guido Cossu
60b57706c4 Small bug fix in the shm file names 2018-03-21 13:57:30 +00:00
James Harrison
58c2f60b69 Merge branch 'feature/hadrons' into feature/qed-fvol 2018-03-20 20:19:18 +00:00
James Harrison
bfa3a7b3b0 Merge branch 'feature/hadrons' into feature/qed-fvol
# Conflicts:
#	extras/Hadrons/Modules.hpp
#	extras/Hadrons/Modules/MGauge/StochEm.cc
#	extras/Hadrons/modules.inc
2018-03-20 20:17:59 +00:00
paboyle
954e38bebe Put a username in the path 2018-03-20 18:16:15 +00:00
paboyle
b1a38bde7a Extra test for Gparity with plaquette action 2018-03-20 18:01:32 +00:00
Guido Cossu
2581875edc Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-03-19 18:00:08 +00:00
Guido Cossu
f212b0a963 Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/hadrons 2018-03-19 17:57:13 +00:00
Guido Cossu
62702dbcb8 Fixing bug in the Point sink causing NaNs 2018-03-19 17:56:53 +00:00
41d6cab033 Merge branch 'develop' into feature/hadrons 2018-03-19 13:30:21 +00:00
5a31e747c9 Merge commit 'd5ce66f6ab2c44a12def7b6d26df80d6e646b1fb' into feature/hadrons 2018-03-19 13:19:09 +00:00
cbc73a3fd1 Hadrons: CG guesser fix 2018-03-19 13:11:38 +00:00
Peter Boyle
6c6d43eb4e Drop RB on coarse space ; that was a mistake 2018-03-17 09:35:01 +00:00
Peter Boyle
e1dcfd3553 typo fix 2018-03-16 23:10:47 +00:00
Peter Boyle
888838473a 4GB clean the offsets in parallel IO for multifile records 2018-03-16 21:54:56 +00:00
Peter Boyle
01568b0e62 Add a new SHM option 2018-03-16 21:54:28 +00:00
Peter Boyle
d5ce66f6ab Extra SHM option 2018-03-16 21:37:03 +00:00
Guido Cossu
d86936a3de Eliminating deprecated lex_sites 2018-03-16 12:26:39 +00:00
d516938707 Hadrons: eigen packs I/O and deflation interface 2018-03-14 14:55:47 +00:00
72344d1418 Hadrons: change default Schur convention to DiagTwo 2018-03-13 17:10:54 +00:00
7ecf6ab38b Merge branch 'develop' into feature/hadrons 2018-03-13 16:11:59 +00:00
2d4d70d3ec Hadrons: LCL fixes 2018-03-13 16:10:36 +00:00
78f8d47528 Hadrons: environment access to derived objects 2018-03-13 16:10:16 +00:00
b85f987b0b Hadrons: error message channel verbose during profiling 2018-03-13 16:09:22 +00:00
f57afe2079 Hadrons: much cleaner eigenpack implementation, to be tested 2018-03-13 13:51:09 +00:00
Dan H
0fb84fa34b Make compilation faster by moving print of git hash. 2018-03-12 17:03:48 -04:00
Vera Guelpers
8462bbfe63 Gamma input for meson contraction with round brackets 2018-03-12 18:02:12 +00:00
229977c955 Hadrons: minor memory fix for ShiftProbe module 2018-03-09 21:56:27 +00:00
e485a07133 Hadrons: garbage collector debug output 2018-03-09 21:56:01 +00:00
paboyle
0880747edb Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-03-09 20:44:42 +00:00
paboyle
b801e1fcd6 fclose should be called through a call to close() 2018-03-09 20:44:10 +00:00
70ec2faa98 Hadrons: maximum iteration specified for tests and error if 0 2018-03-09 19:53:55 +00:00
2f849ee252 declaration fix 2018-03-08 23:34:00 +00:00
bb6ed44339 Merge branch 'develop' into feature/hadrons 2018-03-08 23:09:28 +00:00
360cface33 Grid tensor serialisation fully implemented and tested 2018-03-08 19:12:03 +00:00
Azusa Yamaguchi
80302e95a8 MILC Interface 2018-03-08 15:34:03 +00:00
caf2f6b274 Merge branch 'develop' of github.com:paboyle/Grid into develop 2018-03-08 09:52:25 +00:00
c49be8988b Grid tensor serialisation 2018-03-08 09:51:22 +00:00
971c2379bd std::vector to tensor conversion + test units 2018-03-08 09:50:39 +00:00
Guido Cossu
94b0d66e4c Merge pull request #157 from goracle/dev-pull
Add print of the current git hash on Grid init.
2018-03-08 16:09:28 +09:00
Dan H
5e8af396fd Add print of the current git hash on Grid init. 2018-03-07 13:11:51 -05:00
9942723189 Merge branch 'develop' into feature/hadrons
# Conflicts:
#	lib/serialisation/BaseIO.h
2018-03-07 15:22:16 +00:00
a7d19dbb64 Merge branch 'develop' of github.com:paboyle/Grid into develop
# Conflicts:
#	lib/serialisation/BaseIO.h
2018-03-07 15:13:54 +00:00
90dbe03e17 Conversion of Grid tensors to std::vector made more elegant, also pair syntax changed to (x y) to avoid issues with JSON/XML 2018-03-07 15:12:32 +00:00
8b14096990 Conversion of Grid tensors to std::vector made more elegant, also pair syntax changed to (x y) to avoid issues with JSON/XML 2018-03-07 15:12:18 +00:00
Azusa Yamaguchi
b938202081 Overlapped Comm for Wilson DhopInternal 2018-03-07 14:08:43 +00:00
e79ef469ac Merge branch 'develop' into feature/hadrons
# Conflicts:
#	lib/serialisation/BaseIO.h
2018-03-06 19:25:51 +00:00
485c5db0fe conversion of Grid tensors to nested std::vector in preparation for tensor serialisation 2018-03-06 19:22:03 +00:00
James Harrison
c793947209 Add overloaded Photon constructors, with default parameters for IR improvements and infinite-volume G(x=0). 2018-03-06 16:27:26 +00:00
3e9ee053a1 Merge branch 'develop' into feature/hadrons 2018-03-05 20:01:38 +00:00
dda6c69d5b Hadrons: scalar SU(N) shift probes 2018-03-05 20:00:29 +00:00
cd51b9af99 Torture yourself with namespace lookup 101 2018-03-05 19:58:13 +00:00
paboyle
c399c2b44d Guido broke the charge conjugate plaquette action with premature optimisation.
This sector of the code does not matter for anything other than Guido's quenched HMC
studies, and any plaq specific optimisations should be retained in a private branch
instead of destroying the code simplicity.
2018-03-05 12:55:41 +00:00
paboyle
af7de7a294 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-03-05 12:22:41 +00:00
paboyle
1dc86efd26 Finalize protection 2018-03-05 12:22:18 +00:00
f32555dcc5 Merge branch 'develop' into feature/hadrons 2018-03-03 15:31:52 +00:00
30391cb2eb Merge pull request #155 from fionnoh/develop
Some changes needed for deflation interface
2018-03-03 13:43:59 +00:00
e93c883470 Hadrons: basic GraphViz visualisation 2018-03-03 13:42:36 +00:00
Fionn O hOgain
2e88408f5c Some changes needed for deflation interface 2018-03-02 22:27:41 +00:00
fcac5c0772 Hadrons: scalar SU(N) fixes 2018-03-02 19:20:23 +00:00
90f4000935 Hadrons: scheduler debug less verbose 2018-03-02 19:20:01 +00:00
480708b9a0 Hadrons: safer error handling for HadronsXmlRun 2018-03-02 19:19:37 +00:00
c4baf876d4 Hadrons: graph consistency check 2018-03-02 18:40:18 +00:00
2f4dac3531 Hadrons: legal update 2018-03-02 18:10:58 +00:00
3ec6890850 Merge branch 'feature/hadrons' of github.com:paboyle/Grid into feature/hadrons 2018-03-02 17:56:08 +00:00
018801d973 Hadrons: legal update 2018-03-02 17:56:00 +00:00
1d83521daa Hadrons: scalar SU(N) EMT 2018-03-02 17:55:18 +00:00
fc5670c6a4 Merge pull request #151 from guelpers/feature/hadrons
Feature/hadrons
2018-03-02 17:54:43 +00:00
d9c435e282 Hadrons: Scalar SU(N) transverse projection module 2018-03-02 17:35:12 +00:00
614a0e8277 Hadrons: Scalar SU(N) utility functions 2018-03-02 17:34:23 +00:00
Vera Guelpers
aaf39222c3 update my fork and fixed conflicts 2018-03-02 17:08:08 +00:00
550142bd6a Hadrons: more code cleaning 2018-03-02 14:30:45 +00:00
c0a929aef7 Hadrons: code cleaning 2018-03-02 14:29:54 +00:00
37fe944224 Hadrons: scalar kinetic term 2018-03-02 14:14:11 +00:00
Vera Guelpers
315a42843f changes requested for the pull request 2018-03-02 11:47:38 +00:00
83a101db83 Hadrons: more LCL fixes 2018-03-02 11:05:02 +00:00
c4274e1660 Hadrons: LCL cleaning 2018-03-02 10:18:33 +00:00
ba6db55cb0 Hadrons: reverse last commit 2018-03-01 23:30:58 +00:00
e5ea84d531 Hadrons: LCL: orthogonalise coarse evec 2018-03-01 19:33:11 +00:00
15767a1491 Hadrons: LCL fine convergence test 2018-03-01 18:04:08 +00:00
4d2a32ae7a Hadrons: z-Mobius message fix 2018-03-01 18:03:44 +00:00
5b937e3644 Hadrons: VM memory profiling fix 2018-03-01 17:28:38 +00:00
e418b044f7 Hadrons: code cleaning 2018-03-01 12:57:28 +00:00
b8b05f143f Hadrons: Lanczos more conservative type names 2018-03-01 12:53:16 +00:00
6ec42b4b82 LCL: external storage fix 2018-03-01 12:27:29 +00:00
abb7d4d2f5 Hadrons: z-Mobius action 2018-02-27 19:32:19 +00:00
16ebbfff29 Hadrons: Schur convention globally defined through a macro 2018-02-27 18:45:23 +00:00
4828226095 Hadrons: prettier log 2018-02-27 14:43:51 +00:00
8a049f27b8 Hadrons: Lanczos code improvement 2018-02-27 13:46:59 +00:00
43578a3eb4 Hadrons: copyright update 2018-02-26 19:24:19 +00:00
fdbd42e542 Hadrons: first implementation of local coherence Lanczos 2018-02-26 19:22:43 +00:00
e7e4cee4f3 Merge branch 'develop' into feature/hadrons 2018-02-26 15:05:05 +00:00
James Harrison
ec3954ff5f QedFVol: Add input parameter G(x=0) for infinite-volume photon 2018-02-23 14:53:05 +00:00
Azusa Yamaguchi
0f468e2179 OverlappedComm for Staggered 5D and 4D. 2018-02-22 12:50:09 +00:00
James Harrison
8e61286741 Merge branch 'develop' into feature/qed-fvol 2018-02-20 15:33:35 +00:00
paboyle
4790e99817 Extra communicator free that I had missed.
Hard to audit them all as this is complex
2018-02-20 15:12:31 +00:00
paboyle
2dd63aa7a4 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-02-20 14:29:26 +00:00
paboyle
559a501140 Deflation interface for solvers 2018-02-20 14:29:08 +00:00
paboyle
945684c470 updates for deflation in the RB solver 2018-02-20 14:28:38 +00:00
Christopher Kelly
e30a80a234 Relaxed constraints on MPI thread mode when not using multiple comms threads 2018-02-15 17:13:36 +00:00
James Harrison
69e4ecc1d2 QedFVol: Fix single precision build error 2018-02-14 17:37:18 +00:00
James Harrison
5f483df16b Merge branch 'develop' into feature/qed-fvol 2018-02-14 16:35:04 +00:00
James Harrison
4680a977c3 QedFVol: set infinite-volume photon propagator to 1 at x=0,
so that momentum-spage photon propagator is non-negative.
Need to check whether this is sufficient for all volumes.
2018-02-14 16:30:09 +00:00
Vera Guelpers
de42456171 updated my fork and conflicts fixed 2018-02-14 13:57:56 +00:00
Vera Guelpers
d55212c998 restructure SeqConservedCurrent for DWF to need less memory 2018-02-14 10:45:18 +00:00
paboyle
c96483e3bd Whitespace only change 2018-02-13 11:39:07 +00:00
Vera Guelpers
c6e1f64573 Test for QED 2018-02-13 09:30:23 +00:00
paboyle
ae31a6a760 Move deflate to right class 2018-02-13 02:11:37 +00:00
paboyle
dd8f2a64fe INterface to suit hadrons on Lanczos 2018-02-13 02:08:49 +00:00
James Harrison
724cf02d4a QedFVol: Implement infinite-volume photon 2018-02-12 17:18:10 +00:00
paboyle
7b8b2731e7 Conj error for complex coeffs 2018-02-12 16:06:31 +00:00
paboyle
237a8ec918 Communicator leak fixed (I think) 2018-02-12 13:27:20 +00:00
Vera Guelpers
49a0ae73eb Insertion of photon field in seqential conserved current 2018-02-12 09:36:08 +00:00
James Harrison
315f1146cd QedFVol: Fix output of VPCounterTerms module. 2018-02-08 20:40:45 +00:00
James Harrison
9f202782c5 QedFVol: Change format of scalar VP output files, and save diagrams without charge factors for consistency with ChargedProp module. 2018-02-07 20:31:50 +00:00
James Harrison
594a262dcc QedFVol: Remove redundant file Communicator_mpi.cc 2018-02-07 11:37:01 +00:00
James Harrison
7f8ca54285 Merge branch 'develop' into feature/qed-fvol 2018-02-07 10:11:00 +00:00
James Harrison
c5b23c367e QedFVol: Fix segmentation fault when multiple propagator modules are used. 2018-02-05 11:46:33 +00:00
Vera Guelpers
b6fe03eb26 BugFix: Now the stochatic EM potential weight is generated when calling for the first time 2018-02-02 15:29:38 +00:00
James Harrison
f37ed4958b Implement IR improvement, with coefficients set in input file. 2018-02-02 11:56:51 +00:00
Peter Boyle
896f3a8002 Fix to MPI for Hokusai system 2018-02-01 18:51:51 +00:00
James Harrison
5f85473d6b QedFVol: Move Projection class into Result class 2018-02-01 16:16:13 +00:00
James Harrison
ac3b0ebc58 QedFVol: New structure for ChargedProp output files 2018-02-01 12:31:32 +00:00
Guido Cossu
f0fcdf75b5 Update README.md 2018-01-30 12:44:20 +01:00
Guido Cossu
53bffb83d4 Updating README with new SKL target 2018-01-30 12:42:36 +01:00
Guido Cossu
cd44e851f1 Fixing compilation error in FundtoHirep 2018-01-30 06:04:30 +01:00
Guido Cossu
fb24e3a7d2 Adding utilities for perf profiling 2018-01-29 11:11:45 +01:00
Guido Cossu
655a69259a Added support for GCC compilation for Skylake AVX512 2018-01-28 17:02:46 +01:00
James Harrison
4e0cf0cc28 QedFVol: Fix bug in ScalarVP.cc due to double use of temporary object. Still getting mpi3 errors when configured with enable-comms=mpi[-auto]. 2018-01-27 15:15:25 +00:00
Guido Cossu
507c4e9efc Correcting an missing semicolumn in avx512 2018-01-27 10:59:55 +01:00
James Harrison
cdf550845f QedFVol: Fix bugs in StochEm.cc and ChargedProp.cc (still only works without MPI). 2018-01-26 21:25:20 +00:00
James Harrison
3db7a5387b BROKEN: Adapted scalarVP, UnitEm and VPCounterTerms modules to new Hadrons. Currently getting an assertion error from Communicator_mpi3.cc when I try to run. 2018-01-26 16:33:48 +00:00
Guido Cossu
f8a5194c70 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-01-25 13:46:37 +01:00
Guido Cossu
cff3bae155 Adding support for general Nc in the benchmark outputs 2018-01-25 13:46:31 +01:00
James Harrison
90dffc73c8 Merge branch 'feature/hadrons' into feature/qed-fvol
# Conflicts:
#	extras/Hadrons/Modules.hpp
#	extras/Hadrons/Modules/MGauge/StochEm.cc
#	extras/Hadrons/Modules/MScalar/ChargedProp.cc
#	extras/Hadrons/Modules/MScalar/ChargedProp.hpp
#	extras/Hadrons/modules.inc
#	lib/communicator/Communicator_mpi.cc
2018-01-24 16:41:44 +00:00
a1151fc734 Hadrons: MPI-safe serial IO 2018-01-23 17:26:50 +00:00
James Harrison
ab3baeb38f Implement contractions and data output in functions; calculate diagrams S, X and 4C separately; output 2E and 2T instead of sunset_shifted, sunset_unshifted, tadpole_shifted, tadpole_unshifted; add comments. 2018-01-23 17:07:45 +00:00
Vera Guelpers
389731d373 changed SeqConservedSummed.hpp to work with new hadrons interface 2018-01-23 10:11:33 +00:00
6e3ce7423e Hadrons: don't display module list at startup (too long) 2018-01-22 20:04:05 +00:00
15f15a7cfd Merge branch 'develop' into feature/hadrons
# Conflicts:
#	extras/Hadrons/Modules.hpp
#	extras/Hadrons/modules.inc
2018-01-22 20:03:36 +00:00
0e5f626226 Hadrons: module for scalar operator divergence 2018-01-22 19:38:19 +00:00
Azusa Yamaguchi
97b9c6f03d No option for interior/exterior split of asm kernels since different directions get interleaved 2018-01-22 11:04:19 +00:00
Azusa Yamaguchi
63982819c6 No option to overlap comms and compute for asm implementation since different directions are interleaved
in the kernels, introducing if else structure would be too painful
2018-01-22 11:03:39 +00:00
Vera Guelpers
6fec507bef merged new hadrons interface 2018-01-22 10:09:20 +00:00
James Harrison
219b3bd34f Remove freeVpTensor object 2018-01-19 17:14:11 +00:00
Guido Cossu
b00d2d2c39 Correction of Representations compilation and small compilation error for Intel 17 2018-01-17 13:46:12 +00:00
Guido Cossu
f1b3e21830 Merge branch 'feature/clover' into develop 2018-01-17 10:07:42 +00:00
Guido Cossu
b7f8c5b823 Modify test to merge with the new Lanczos interface 2018-01-12 14:38:27 +00:00
Guido Cossu
3923683e9b Updating the feature/clover branch with the newest Hadron package 2018-01-12 13:35:51 +00:00
Guido Cossu
e199fda9dc Merge pull request #136 from pretidav/feature/clover
Feature/clover
2018-01-12 11:57:08 +00:00
7bb405e790 Merge branch 'develop' into feature/hadrons
# Conflicts:
#	lib/communicator/Communicator_mpi3_leader.cc
#	lib/communicator/Communicator_shmem.cc
2018-01-11 18:50:15 +00:00
ec16eacc6a Hadrons: scalar SU(N) 2-pt function 2018-01-10 22:12:21 +00:00
pretidav
cf858deb16 Lanczos with 2 reps fixed (tobe tested) 2018-01-10 18:43:02 +01:00
David Preti
a3affac963 SU3 restored + output filename for mesons and baryons fixed. 2018-01-10 14:56:54 +01:00
d9d1f43ba2 Hadrons: code cleaning 2018-01-10 11:31:24 +00:00
b7cd721308 Hadrons: scalar SU(N) tr(mag^n) 2018-01-10 11:25:59 +00:00
29f026c375 Hadrons: scalar SU(N) tr(phi^n) 1-pt function 2018-01-10 11:01:03 +00:00
58c7a13d54 Hadrons: result file macro with trajectory number 2018-01-10 10:59:58 +00:00
Azusa Yamaguchi
24162c9ead Staggered overlap comms comput 2018-01-09 13:02:52 +00:00
paboyle
e564d11687 Allow resize of the shared memory buffers 2018-01-08 15:20:26 +00:00
paboyle
0b2162f375 Clean up 2018-01-08 14:06:53 +00:00
paboyle
5610570182 Synthetic test of lanczos 2018-01-08 11:36:39 +00:00
paboyle
44f65526e0 Simplify communicators 2018-01-08 11:35:43 +00:00
paboyle
43e48542ab Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2018-01-08 11:34:45 +00:00
paboyle
0b85f1bfc8 Simplify the communicator proliferation: mpi and none. 2018-01-08 11:33:47 +00:00
paboyle
9947cfbf14 Simplify number of communicator cases 2018-01-08 11:33:01 +00:00
paboyle
357badce5e Simplify communicator case proliferation 2018-01-08 11:32:16 +00:00
paboyle
0091eec23a Simplify communicator cases 2018-01-08 11:31:32 +00:00
paboyle
9e9c2962df Simplify comms layer proliferation 2018-01-08 11:30:22 +00:00
paboyle
bda97212a9 Simplify proliferation of comms layers 2018-01-08 11:29:20 +00:00
paboyle
b91282ad46 Simplify comms layer proliferation 2018-01-08 11:28:52 +00:00
paboyle
0a68470f9a Simplify comms layers 2018-01-08 11:28:30 +00:00
paboyle
6ecf280723 Simplify comms layer proliferation 2018-01-08 11:28:04 +00:00
paboyle
7eeab7f995 Simplify comms layers 2018-01-08 11:27:43 +00:00
paboyle
9b32d51cd1 Simplify comms layer proliferatoin 2018-01-08 11:27:14 +00:00
paboyle
7b3ed160aa Rationalise MPI options 2018-01-08 11:26:48 +00:00
paboyle
1a0163f45c Updated to do list 2018-01-08 11:26:11 +00:00
David Preti
9028e278e4 Trying to fix a bug with SU4 mesons (still under investigation) 2018-01-06 15:57:38 +01:00
dd62f2f371 Hadrons: log message fix 2017-12-29 16:58:44 +01:00
0d612039ed Hadrons: prettier Grid logging (non-intrusive) 2017-12-29 16:58:23 +01:00
e8ac75055c Hadrons: binary configuration loader 2017-12-27 14:24:29 +01:00
8b30c5956c Hadrons: copyright update 2017-12-26 14:16:47 +01:00
185da83454 Hadrons: new MIO module namespace, NERSC loader moved there 2017-12-26 14:05:17 +01:00
6718fa8c4f Merge branch 'feature/scalar_adjointFT' into feature/hadrons 2017-12-26 12:59:33 +01:00
pretidav
4ce63af7d5 Working on Hadrons with Hirep. (QCD is set for SU4) 2017-12-22 19:02:07 +01:00
Vera Guelpers
935cd1e173 conserved current insertion summed over Lorentzindex 2017-12-22 11:38:45 +00:00
Vera Guelpers
55e39df30f tadpole insertion for DWF 2017-12-22 11:36:31 +00:00
67c3fa0f5f Hadrons: all modules are now ported, more tests need to be done 2017-12-21 11:39:07 +00:00
65d4f17976 Hadrons: no errors when trying to recreate a cache 2017-12-19 20:28:32 +00:00
e2fe97277b Hadrons: getReference use is rare, empty by default 2017-12-19 20:28:04 +00:00
Guido Cossu
84f9c37ed4 Merge branch 'feature/scalar_adjointFT' of https://github.com/paboyle/Grid into feature/scalar_adjointFT 2017-12-19 15:43:55 +00:00
bcf6f3890c Hadrons: more fixes after test 2017-12-14 21:14:10 +00:00
591a38c487 Hadrons: VM fixes 2017-12-14 19:42:16 +00:00
James Harrison
581be32ed2 Implement infrared improvement for v=0 on-shell self-energy 2017-12-14 13:42:41 +00:00
842754bea9 Hadrons: most modules ported to the new interface, compiles but untested 2017-12-13 19:41:41 +00:00
James Harrison
6bc136b1d0 Add module for calculating diagrams required for HVP counter-terms 2017-12-13 17:31:01 +00:00
0887566134 Hadrons: scheduler back! 2017-12-13 16:36:15 +00:00
61fc50d616 Hadrons: better organisation of the VM 2017-12-13 13:44:23 +00:00
a9c8d7dad0 Hadrons: code cleaning 2017-12-13 12:13:40 +00:00
259d504ef0 Hadrons: first full implementation of the module memory profiler 2017-12-12 19:32:58 +00:00
f3a77f4b7f Merge branch 'feature/hadrons' into feature/hadrons-new-memory-model 2017-12-12 14:05:23 +00:00
26d7b829a0 Hadrons: error managed through expections 2017-12-12 14:04:28 +00:00
64161a8743 Hadrons: much simpler reference dependency 2017-12-12 13:08:01 +00:00
2401360784 Merge pull request #138 from guelpers/feature/hadrons
bug fix in sequential insertion of conserved vector current
2017-12-11 18:53:41 +01:00
Vera Guelpers
2cfb50cbe5 bug fix in sequential insertion of conserved vector current 2017-12-08 11:13:39 +00:00
f9aa39e1c4 global memory debug through command line flag 2017-12-07 14:40:58 +01:00
0fbf445edd Hadrons: object creation that get properly captured by the memory profiler 2017-12-06 16:51:48 +01:00
e78794688a memory profiler improvement 2017-12-06 16:50:25 +01:00
9e31307963 Merge branch 'feature/hadrons' into feature/hadrons-new-memory-model 2017-12-06 16:49:32 +01:00
29e2eddea8 Merge branch 'develop' into feature/hadrons-new-memory-model 2017-12-06 16:49:21 +01:00
0a038ea15a Merge branch 'develop' into feature/hadrons 2017-12-06 16:49:10 +01:00
62eb1f0e59 FermionOperator virtual destructor needed for polymorphism 2017-12-06 16:48:17 +01:00
5422251959 Hadrons: execution part moved in a new virtual machine class 2017-12-05 15:31:59 +01:00
paboyle
9579c9c327 Threading improvement 2017-12-05 14:12:22 +00:00
paboyle
3729c7a7a6 Clean up of test 2017-12-05 13:07:31 +00:00
paboyle
c24d4c8d0e Improved parallel RNG init 2017-12-05 13:01:10 +00:00
paboyle
a14038051f Improved AllToAll asserts 2017-12-05 11:43:25 +00:00
paboyle
3e560b9462 Faster RNG init 2017-12-05 11:42:05 +00:00
paboyle
d93c6760ec Faster code for split unsplit 2017-12-05 11:39:26 +00:00
paboyle
ae3b7713a9 Cold start doesnt need RNG 2017-12-05 11:36:31 +00:00
cbd8fbe771 Merge branch 'feature/hadrons' into feature/hadrons-new-memory-model 2017-12-03 19:48:56 +01:00
d391f05cb7 Merge branch 'develop' into feature/hadrons 2017-12-03 19:48:46 +01:00
3127b52c90 bootstrap script does not destroy Eigen is working offline 2017-12-03 19:48:34 +01:00
01f00385a4 Hadrons: genetic pair selection based on exponential probability 2017-12-03 19:47:40 +01:00
59aae5f5ec Hadrons: garbage collector clean temporaries 2017-12-03 19:47:11 +01:00
624246409c Hadrons: module setup/execute protected to forbid user to bypass execution control 2017-12-03 19:46:18 +01:00
2a9ebddad5 Hadrons: scheduler offline, minimal code working again 2017-12-03 19:45:15 +01:00
ff7afe6e17 Merge branch 'feature/hadrons' into feature/hadrons-new-memory-model 2017-12-01 19:45:44 +00:00
33cb509d4b Merge branch 'develop' into feature/hadrons 2017-12-01 19:45:32 +00:00
456c78c233 Merge branch 'develop' into feature/hadrons-new-memory-model 2017-12-01 19:45:12 +00:00
2fd4989029 Merge branch 'develop' of github.com:paboyle/Grid into develop 2017-12-01 19:44:31 +00:00
2427a21428 minor serial IO fixes, XML now issues warning when trying to read absent nodes, these becomes 2017-12-01 19:44:07 +00:00
514993ed17 Hadrons: progress on the interface, genetic algorithm freezing 2017-12-01 19:38:23 +00:00
paboyle
28ceacec45 Split/Unsplit working 2017-11-27 15:13:29 +00:00
paboyle
e6a3e375cf Debug 2017-11-27 15:10:22 +00:00
paboyle
4987edbd44 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-11-27 12:34:56 +00:00
paboyle
ad140bb6e7 Clean on multinode target after split 1 1 2 4 -> 1 1 2 2 2017-11-27 12:34:25 +00:00
paboyle
1f04e56038 Believe split/unsplit works, but need to make pretty 2017-11-27 12:33:08 +00:00
paboyle
4bfc8c85c3 Clean up verbose communicator create 2017-11-27 12:32:37 +00:00
azusayamaguchi
e55397bc13 Staggerd cg 2017-11-24 14:18:30 +00:00
a3fe874a5b Hadrons: everything is broken, repairing while implementing the new memory model 2017-11-22 23:27:19 +00:00
f403ab0133 gitignore update 2017-11-22 17:13:09 +00:00
paboyle
94b8fb5686 Debug in progress 2017-11-19 01:39:04 +00:00
Guido Cossu
1f1d77b01a Performance metrics for the Scalar Action force term 2017-11-14 10:01:48 +00:00
pretidav
6a15e2e8ef Added WilsonTwoIndexAntiSymmImpl instantiation in WilsonKernelsHand.cc (shoud not be necessary) 2017-11-12 14:16:19 +01:00
074d17429f Merge branch 'develop' into feature/scalar_adjointFT
# Conflicts:
#	lib/communicator/Communicator_mpi3.cc
2017-11-11 18:09:55 +00:00
Peter Boyle
25f73018f4 Merge pull request #135 from fionnoh/develop
Declaring virtual functions as pure virtual functions.
2017-11-09 23:19:08 +00:00
fionnoh
1d7ccc6b2c Declaring virtual functions as pure virtual functions. 2017-11-09 19:46:57 +00:00
pretidav
59d9ccf70c restored WilsonKernelsHand.cc and added Qtop to production codes 2017-11-08 22:02:32 +01:00
Azusa Yamaguchi
1860b1698c Fixed the bag on MPI_T at Cam 2017-11-08 09:03:01 +00:00
Azusa Yamaguchi
9b8d1cc3da Staggered Schur decomposed matrix norm changed to not be the Schur anymore :(
Carleton wanted this for multimass / multishift
2017-11-07 14:48:45 +00:00
James Harrison
0c668bf46a QedFVol: Write to output files from one process only. 2017-11-07 14:46:39 +00:00
Guido Cossu
149c3f9e9c Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-11-07 14:01:13 +00:00
Guido Cossu
c519aab19d Fixing the MPI memory leak in the communicators 2017-11-07 13:55:37 +00:00
paboyle
69929f20bb Destructor fix. Split Grid and MPI3 will not yet work without more effort from me. 2017-11-06 23:45:00 +00:00
James Harrison
840814c776 QedFVol: Patch to fix MPI communicators error 2017-11-06 16:34:55 +00:00
pretidav
a493429218 added Production tests for MixedRep, Adj, 2S, 2AS. Still missing QObs. The HMC is not printing correctly all the actions and forces. 2017-11-04 18:16:54 +01:00
pretidav
915f610da0 clover 2indexSymm hmc production test created. clover 2indexAsymm and clover mixed to be filled. 2017-11-04 01:17:06 +01:00
pretidav
c79606a5dc Test production code wilson clover. Still missing QObs measurement on-the-fly. 2017-11-03 22:46:32 +01:00
James Harrison
95af55128e QedFVol: Redo optimisation of scalar VP (extra memory requirements were not the problem), and undo optimisation of charged propagator (which seemed to be causing HDF5 errors, although I don’t know why). 2017-11-03 18:46:16 +00:00
James Harrison
9f2a57e334 QedFVol: Undo optimisation of scalar VP, to reduce memory requirements 2017-11-03 13:10:11 +00:00
James Harrison
c645d33db5 QedFVol: Redo optimisation of charged propagator, and fix I/O bug 2017-11-03 10:59:26 +00:00
James Harrison
e0f1349524 QedFVol: Undo optimisation of charged propagator 2017-11-03 09:22:41 +00:00
paboyle
360efd0088 Improved treatment of reverse asked for by chris.
Truncate the basis.
Power method renormalises
2017-11-02 22:05:31 +00:00
pretidav
7b42ac9982 added polyakov loop observable to the hmc 2017-11-02 21:58:16 +01:00
paboyle
c5c647e35e Merge branch 'feature/lanczos-reorg' into develop 2017-11-02 15:23:11 +00:00
a4e5fd1000 Merge branch 'feature/hadrons' into feature/hadrons-new-memory-model 2017-11-01 19:24:51 +00:00
682e7d7839 Merge branch 'develop' into feature/hadrons 2017-11-01 19:24:38 +00:00
Guido Cossu
8e057721a9 Anisotropic Clover term written and tested 2017-11-01 12:50:54 +00:00
Guido Cossu
fa5e4add47 Added support for anisotropy to the WilsonFermion class 2017-10-31 18:20:38 +00:00
James Harrison
79b761f923 Merge branch 'develop' into feature/qed-fvol
# Conflicts:
#	lib/communicator/Communicator_base.cc
2017-10-30 15:53:18 +00:00
James Harrison
0d4e31ca58 QedFVol: Calculate phase factors for momentum projections once per configuration only. 2017-10-30 15:46:50 +00:00
James Harrison
b07a354a33 QedFVol: output scalar propagator before FFT in spatial directions. 2017-10-30 14:20:44 +00:00
paboyle
27ea2afe86 No compile on comms == none fix 2017-10-30 01:14:11 +00:00
paboyle
78e8704eac Shaking out 2017-10-30 00:25:31 +00:00
paboyle
67131d82f2 Get subrank info from communicator constructor 2017-10-30 00:24:11 +00:00
paboyle
615a9448b9 Extended sub comm supported 2017-10-30 00:23:34 +00:00
paboyle
00164f5ce5 : 2017-10-30 00:22:52 +00:00
paboyle
a7f72eb994 SHaking out 2017-10-30 00:22:06 +00:00
paboyle
501fa1614a Communicator updates for split grid 2017-10-30 00:16:12 +00:00
paboyle
5bf42e1e15 Update 2017-10-30 00:05:21 +00:00
paboyle
fe4d9b003c More digits 2017-10-30 00:04:47 +00:00
paboyle
4a699b4da3 New rank can be found out 2017-10-30 00:04:14 +00:00
paboyle
689323f4ee Reverse dim ordering lexico support 2017-10-30 00:03:15 +00:00
Guido Cossu
749189fd72 Full clover force correct 2017-10-29 12:03:08 +00:00
Guido Cossu
f941c4ee18 Clover term force ok 2017-10-29 11:43:33 +00:00
paboyle
84b441800f Merge branch 'develop' into feature/lanczos-reorg 2017-10-27 14:21:38 +01:00
paboyle
1ef424b139 Split grid Y2K bug fix attempt 2017-10-27 14:20:35 +01:00
paboyle
aa66f41c69 Bug fix in the coarse restore...
Think this is nearly there
2017-10-27 10:29:34 +01:00
paboyle
f96c800d25 Passes reload of coarse basis 2017-10-27 09:43:22 +01:00
paboyle
32a52d7583 Move the local coherence lanczos into algorithms.
Keep the I/O in the tester. Other people can copy this method to write other I/O formats.
2017-10-27 09:04:31 +01:00
paboyle
fa04b6d3c2 Finished ? Verifying coarse evec restore 2017-10-27 08:18:29 +01:00
paboyle
7fab183c0e Better read test 2017-10-27 08:17:49 +01:00
paboyle
9ec9850bdb 64bit ftello update 2017-10-26 23:34:31 +01:00
paboyle
0c4ddaea0b Cleaning up 2017-10-26 23:31:46 +01:00
paboyle
00ebc150ad Mistake in string parse; interface is ambiguous and must fix. Is char * a file, or a XML buffer ? 2017-10-26 23:30:37 +01:00
paboyle
0f3e9ae57d Gsites error. Only appeared (so far) in I/O code for even odd fields 2017-10-26 23:29:59 +01:00
Azusa Yamaguchi
034de160bf Staggered updates : Schur fixed and added a unit test for Test_staggered_cg_schur.cc giving stronger check 2017-10-26 20:58:46 +01:00
Guido Cossu
76bcf6cd8c Deleting vscode settings file 2017-10-26 18:45:41 +01:00
Guido Cossu
91b8bf0613 Debugging force term 2017-10-26 18:23:55 +01:00
paboyle
14507fd6e4 Final? candidate for push back on the lanczos reorg feature 2017-10-26 16:25:01 +01:00
paboyle
2db05ac214 Test for split/unsplit in isolation 2017-10-26 07:48:03 +01:00
paboyle
31f99574fa Moving these out of algorithms 2017-10-26 07:47:42 +01:00
paboyle
a34c8a2961 Update to IRL; getting close to the structure I would like. 2017-10-26 07:45:56 +01:00
paboyle
ccd20df827 Better IRL interface 2017-10-26 01:59:59 +01:00
paboyle
e9be293444 Better messaging 2017-10-26 01:59:30 +01:00
paboyle
d577211cc3 Relax stoppign condition 2017-10-25 23:57:54 +01:00
paboyle
f4336e480a Faster converge time 2017-10-25 23:53:44 +01:00
paboyle
e4d461cb03 Messagign 2017-10-25 23:53:19 +01:00
paboyle
3d63b4894e Use existing functionality where possible 2017-10-25 23:52:47 +01:00
paboyle
08583afaff Red black friendly coarsening 2017-10-25 23:51:18 +01:00
paboyle
b395a312af Better error messaging 2017-10-25 23:50:37 +01:00
paboyle
66295b99aa Bit less verbose SciDAC IO 2017-10-25 23:50:05 +01:00
paboyle
b8654be0ef 64 bit safe offsets 2017-10-25 23:49:23 +01:00
paboyle
a479325349 Rewrite of local coherence lanczos 2017-10-25 23:48:47 +01:00
paboyle
f6c3f6bf2d XML serialisation of parms and initialise from parms object 2017-10-25 23:47:59 +01:00
paboyle
d83868fdbb Identity linear op added -- useful in circumstances where a linear op may or may not be needed.
Supply a trivial one if not needed
2017-10-25 23:47:10 +01:00
paboyle
303e0b927d Improvements for coarse grid compressed lanczos 2017-10-25 23:46:33 +01:00
paboyle
28ba8a0f48 Force spacing more nicely 2017-10-25 23:45:57 +01:00
Azusa Yamaguchi
f9e28577f3 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-10-25 21:07:56 +01:00
Guido Cossu
e0cae833da Merge branch 'develop' into feature/scalar_adjointFT 2017-10-25 10:49:50 +01:00
Guido Cossu
8a3aae98f6 Solving minor bug in compilation 2017-10-25 10:34:49 +01:00
Guido Cossu
8309f2364b Solving again the MPI comm bug with FFTs 2017-10-25 10:24:14 +01:00
Azusa Yamaguchi
cac1750078 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-10-24 23:30:36 +01:00
Guido Cossu
e17cd35151 Merge branch 'develop' into feature/scalar_adjointFT 2017-10-24 17:31:22 +01:00
Guido Cossu
ccdec7a7ab Merge branch 'develop' into feature/clover 2017-10-24 16:51:14 +01:00
Guido Cossu
93642d813d Merging 2017-10-24 16:48:05 +01:00
Guido Cossu
0bc381f982 Merge pull request #133 from pretidav/feature/clover
Feature/clover
2017-10-24 15:15:21 +01:00
Guido Cossu
2986aa76f8 Restoring Perfcounts 2017-10-24 13:32:02 +01:00
Guido Cossu
657779374b Adding vscode to gitignore 2017-10-24 13:27:17 +01:00
Guido Cossu
ec8cd11c1f Cleanup and prepare for pull request 2017-10-24 13:21:17 +01:00
Guido Cossu
cbda4f66e0 Debug of the field strength 2017-10-24 10:20:13 +01:00
Guido Cossu
6579dd30ff More debug test 2017-10-23 18:47:00 +01:00
Guido Cossu
031c94e02e Debugging process for the clover term 2017-10-23 18:27:34 +01:00
Guido Cossu
6391b2a1d0 Added test for Wilson and Clover fermions 2017-10-23 14:42:35 +01:00
Guido Cossu
2e50b55ae4 Changes in the Makefile to compile against Chroma on Linux 2017-10-23 13:32:26 +01:00
James Harrison
c433939795 QedFVol: Temporarily remove incomplete implementation of infinite-volume photon 2017-10-20 16:27:58 +01:00
James Harrison
b6a4c31b48 Merge branch 'feature/qed-fvol' of https://github.com/jch1g10/Grid into feature/qed-fvol 2017-10-20 16:25:07 +01:00
James Harrison
98b1439ff9 QedFVol: pass arbitrary input values to photon constructor in UnitEm 2017-10-20 16:24:09 +01:00
Guido Cossu
27936900e6 Putting the FG verbosity in the Integrator level 2017-10-18 13:08:09 +01:00
James Harrison
564738b1ff Add module for unit EM field 2017-10-17 14:03:57 +01:00
Guido Cossu
cd3e810d25 Merge branch 'develop' into feature/scalar_adjointFT 2017-10-17 11:31:14 +01:00
pretidav
317ddfedee updated test clover + first attempt derivative clove term (still missing spin part) 2017-10-16 02:47:33 +02:00
paboyle
e325929851 ALl codes compile against the new Lanczos call signature 2017-10-13 14:02:43 +01:00
paboyle
47af3565f4 Logging improvement; reunified the Lanczos codes 2017-10-13 13:23:07 +01:00
paboyle
4b4d187935 Reunified the Lanczos implementations 2017-10-13 13:22:44 +01:00
paboyle
9aff354ab5 Final version prior to reunification 2017-10-13 13:22:26 +01:00
paboyle
cb9ff20249 Approx tests and lanczos improvement 2017-10-13 11:30:50 +01:00
James Harrison
a80e43dbcf Added infinite-volume photon in Photon.h (not checked yet) 2017-10-11 16:44:51 -04:00
paboyle
9fe6ac71ea Starting reorg of Blocked lanczos 2017-10-11 10:12:07 +01:00
5c392a6ecc Merge commit 'bf58557fb1ec710c766e19c9a8809b0a352de239' into feature/scalar_adjointFT 2017-10-10 17:14:56 +01:00
Azusa Yamaguchi
f1fa00b71b Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-10-10 14:26:44 +01:00
paboyle
bf58557fb1 Block compressed Lanczos 2017-10-10 14:15:11 +01:00
paboyle
10cb37f504 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-10-10 14:09:44 +01:00
Azusa Yamaguchi
1374c943d4 Correct Schur operator called 2017-10-10 13:59:50 +01:00
paboyle
a1d80282ec cb factorise 2017-10-10 13:49:31 +01:00
paboyle
4eb8bbbebe Christop mods 2017-10-10 13:48:51 +01:00
paboyle
d1c6288c5f Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-10-10 13:38:40 +01:00
Azusa Yamaguchi
dd949bc428 Merge branch 'feature/staggering' into develop 2017-10-10 13:02:51 +01:00
Azusa Yamaguchi
bb7378cfc3 Schur for staggered 2017-10-10 12:02:18 +01:00
Azusa Yamaguchi
f0e084a88c Schur staggered 2017-10-10 10:00:43 +01:00
paboyle
153672d8ec Split CG testing 2017-10-09 23:20:58 +01:00
paboyle
08ca338875 Split grid communication 2017-10-09 23:19:45 +01:00
paboyle
f7cbf82c04 Better stdout/err debug 2017-10-09 23:18:48 +01:00
paboyle
07009c569a Comms splitting improvements 2017-10-09 23:16:51 +01:00
Guido Cossu
15d690e9b9 Adding the cartesian communicator destructor 2017-10-09 09:59:58 +01:00
63b2bc1936 Merge branch 'develop' into feature/hadrons
# Conflicts:
#	lib/qcd/action/fermion/FermionOperatorImpl.h
2017-10-05 14:16:23 +01:00
David Preti
d810e8c8fb first attempt to write C terms in clover derivative. Some shifts to be fixed 2017-10-05 10:13:53 +02:00
Azusa Yamaguchi
09f4cdb11e Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2017-10-04 10:51:16 +01:00
Azusa Yamaguchi
1e54882f71 Stagger 2017-10-04 10:51:06 +01:00
Guido Cossu
27caff92c6 Merge branch 'feature/scalar_adjointFT' of https://github.com/paboyle/Grid into feature/scalar_adjointFT 2017-10-04 09:44:27 +01:00
d38cee73bf Scalar: easier Fourier acceleration parametrisation through -D flags 2017-10-03 17:29:34 +01:00
8784f2a88d post-merge fix 2017-10-03 14:38:10 +01:00
c497864b5d Merge commit 'd54807b8c0cd1a7658ff8563bb00d1137b987e3e' into feature/scalar_adjointFT
# Conflicts:
#	lib/communicator/Communicator_base.h
#	lib/communicator/Communicator_mpi.cc
#	lib/communicator/Communicator_mpit.cc
2017-10-03 14:27:54 +01:00
05c1c88440 Scalar: more action generalisation 2017-10-03 14:26:20 +01:00
paboyle
d54807b8c0 MPIT works with split grid now 2017-10-02 23:14:56 +01:00
Guido Cossu
f6ba2b95ce Merge branch 'develop' into feature/scalar_adjointFT 2017-10-02 15:19:20 +01:00
paboyle
5625b47c7d Merge branch 'feature/dwf-multirhs' into develop 2017-10-02 12:42:32 +01:00
paboyle
1edcf902b7 Macos ANON 2017-10-02 12:41:02 +01:00
paboyle
e5c19e1fd7 RB constructor change 2017-10-02 12:25:52 +01:00
paboyle
a11d0a33d1 Merge branch 'feature/dwf-multirhs' of https://github.com/paboyle/Grid into feature/dwf-multirhs 2017-10-02 11:42:07 +01:00
paboyle
4f8b6f26b4 Merge branch 'develop' into feature/dwf-multirhs 2017-10-02 11:41:49 +01:00
paboyle
073525c5b3 Small patch from cori 2017-10-02 03:38:21 -07:00
Azusa Yamaguchi
eb6153080a Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2017-10-02 08:56:33 +01:00
Guido Cossu
f7072d1ac2 Solving an annoying compilation error in json 2017-10-02 07:13:40 +01:00
a021933002 Scalar: SU(N) action change to t'Hooft scaling 2017-09-29 16:09:34 +01:00
James Harrison
b99622d9fb QedFVol: fix problem with JSON wanting gcc 4.9 2017-09-28 13:34:33 -04:00
937c77ead2 Merge branch 'develop' into feature/qed-fvol 2017-09-28 16:25:20 +01:00
95e5a2ade3 Merge pull request #116 from jch1g10/feature/qed-fvol
Feature/qed fvol
2017-09-25 15:08:33 +01:00
David Preti
56478d63a5 clover + test (valence) 2017-09-24 19:32:15 +02:00
df21668f2c memory profiler update 2017-09-22 14:21:18 +01:00
Guido Cossu
482368e9de Merge branch 'develop' into feature/scalar_adjointFT 2017-09-21 13:44:08 +01:00
paboyle
fddeb29d6b Bug fix with spreadout FFT 2017-09-21 11:10:08 +01:00
paboyle
a9ec5cf564 Christoph bug report integrate 2017-09-21 10:32:41 +01:00
Peter Boyle
946a8671b9 Merge pull request #129 from djm2131/feature/eofa
Add support for DWF with the exact one flavor algorithm
2017-09-21 10:15:21 +01:00
Azusa Yamaguchi
a6eeea777b Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-09-21 10:12:41 +01:00
Peter Boyle
771a1b8e79 Merge pull request #128 from paboyle/feature/CG-reliable-update
Feature/cg reliable update
2017-09-21 10:12:03 +01:00
Peter Boyle
bfb68e6f02 Merge pull request #130 from giltirn/gparity-handunroll
Gparity handunroll
2017-09-21 10:11:00 +01:00
Azusa Yamaguchi
77f7737ccc Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-09-19 14:28:01 +01:00
Guido Cossu
9a827d0242 Fixing a compilation error 2017-09-18 14:55:51 +01:00
Guido Cossu
999c623590 Solving a memory leak in Communicator_mpi 2017-09-18 14:39:04 +01:00
paboyle
18c335198a Merge branch 'hotfix/dirac-ITT-fix1' into develop 2017-09-16 18:19:02 +01:00
paboyle
f9df685cde Merge branch 'hotfix/dirac-ITT-fix1' 2017-09-16 18:18:48 +01:00
paboyle
17c5b0f152 Patching comparison point 2017-09-16 18:18:07 +01:00
paboyle
5918769f97 Subtle Naik term bug updated in Stencil; less on logical && with a function call on right 2017-09-16 12:51:26 +01:00
Guido Cossu
b542d349b8 Minor cosmetic changes 2017-09-15 11:48:36 +01:00
Guido Cossu
91eaace19d Added support for FFT accelerated updates 2017-09-15 11:33:45 +01:00
Guido Cossu
bbaf1ada91 Merge branch 'feature/json-fix' into develop 2017-09-08 16:02:08 +01:00
Guido Cossu
1950ac9294 Fixed the Intel compiler problem with the JSON classes 2017-09-08 15:18:59 +01:00
Guido Cossu
13fa70ac1a Merge branch 'develop' into feature/json-fix 2017-09-08 13:42:20 +01:00
Guido Cossu
7cb2b11f26 Fixing Intel compiler error for the JSON parser 2017-09-08 13:41:53 +01:00
Guido Cossu
1184ed29ae Merge pull request #124 from nmeyer-ur/feature/arm-neon
Added integer reduce functionality
2017-09-08 10:54:35 +02:00
paboyle
203c7bf6fa Merge branch 'hotfix/dirac-ITT-fix' into develop 2017-09-05 15:08:51 +01:00
paboyle
c709883f3f Merge branch 'hotfix/dirac-ITT-fix' 2017-09-05 15:08:16 +01:00
paboyle
aed5de4d50 Patching macos compile 2017-09-05 15:07:07 +01:00
paboyle
ba27cc6571 Mac os happiness 2017-09-05 15:00:16 +01:00
paboyle
d856327250 Merge branch 'release/dirac-ITT' into develop 2017-09-05 14:56:12 +01:00
paboyle
d75369cb56 Merge branch 'release/dirac-ITT' 2017-09-05 14:55:54 +01:00
Peter Boyle
bf973d0d56 SHM complete 2017-09-05 14:30:29 +01:00
Peter Boyle
837bf8a5be Updating to control the SHM allocation scheme under configure time options 2017-09-05 12:51:02 +01:00
Peter Boyle
c05b2199f6 Improvements to huge memory 2017-09-04 10:41:21 -04:00
Azusa Yamaguchi
a5fe07c077 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-09-04 14:10:15 +01:00
Azusa Yamaguchi
b83b2b1415 Stability improvement to BCG. Force m_rr hermitian beyond rounding. 2017-09-04 14:09:47 +01:00
James Harrison
91676d1dda Fix “MAP_ANONYMOUS undefined” error on OSX. 2017-09-01 15:48:30 +01:00
Peter Boyle
b331be9101 Better reporting 2017-08-31 11:32:57 +01:00
Peter Boyle
49c20a9fa8 Patch to reporting 2017-08-31 11:32:21 +01:00
paboyle
7359df3501 Full reporting for benchmark; save robustness factor 2017-08-31 10:42:35 +01:00
Christopher Kelly
59bd1fe21b Fix for 'perm' and 'local' not being set for hand-unrolled external-site Dslash, which caused incorrect behavior of G-parity kernel 2017-08-29 13:07:37 -07:00
a56e3b40c4 Merge branch 'develop' into feature/hadrons 2017-08-29 11:03:53 -06:00
Nils Meyer
4e907fef2c Merge remote-tracking branch 'grid/develop' into feature/arm-neon 2017-08-29 17:47:36 +02:00
Christopher Kelly
67888b657f Merge branch 'gparity-handunroll' of https://github.com/giltirn/Grid into gparity-handunroll 2017-08-29 09:52:05 -04:00
Christopher Kelly
74af885d4e Removed some no-longer-needed associated with G-parity hand unrolled kernel 2017-08-29 09:50:37 -04:00
James Harrison
ac3611bb19 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/qed-fvol 2017-08-29 11:53:37 +01:00
Christopher Kelly
d36d2fb40d Added ability to override default Ls in Benchmark_dwf 2017-08-28 06:53:56 -07:00
Peter Boyle
5b9267e88d Cleaner comms benchmark treatment for one node runs 2017-08-27 18:24:48 -04:00
paboyle
15fd4003ef Improving presentation of results 2017-08-27 13:46:02 +01:00
paboyle
4b4c2a715b fcntl.h needed 2017-08-26 11:38:04 +01:00
paboyle
54a5e6c1d0 Check if we get huge pages on linux. Larry Meadows piece of magic. 2017-08-25 22:36:08 +01:00
paboyle
73aeca7dea Merge branch 'feature/multi-communicator' into develop 2017-08-25 21:55:09 +01:00
paboyle
ad89abb018 Fix 2017-08-25 20:43:37 +01:00
paboyle
80c5bce5bb Merge branch 'develop' into feature/multi-communicator 2017-08-25 20:21:26 +01:00
paboyle
f68b5de9c8 No compile fix on Clang 2017-08-25 19:35:21 +01:00
Peter Boyle
d0f3d525d5 Optimal block size for KNL 2017-08-25 19:33:54 +01:00
Christopher Kelly
f365a83fae In G-parity unrolled kernel, replaced calls to permute and exchange with run-time-evaluated permute type with explicit calls to appropriate underlying functions 2017-08-25 14:24:11 -04:00
Peter Boyle
3a58217405 Updated 2017-08-25 14:29:53 +01:00
Peter Boyle
c289699d9a updated from cambridge mpi3 shakeout 2017-08-25 11:41:01 +01:00
Peter Boyle
c3b1263e75 Benchmark prep 2017-08-25 09:25:54 +01:00
Christopher Kelly
34a9aeb331 Reduced number of if-statement evaluations in G-parity unrolled kernel 2017-08-24 13:53:50 -07:00
5846566728 Merge branch 'develop' into feature/hadrons 2017-08-24 18:20:52 +01:00
102ea9ae66 CI update 2017-08-24 18:17:09 +01:00
James Harrison
cc4afb978d Fix bug in non-zero momentum projection 2017-08-24 17:31:44 +01:00
21b02760c3 Merge branch 'develop' into feature/hadrons 2017-08-24 17:05:45 +01:00
Peter Boyle
2bcb704af2 Merge pull request #121 from Lanny91/feature/hadrons
Feature/hadrons
2017-08-24 12:59:08 +01:00
paboyle
5fa386ddc9 FFT test compile fixed 2017-08-24 10:17:52 +01:00
Christopher Kelly
edabb3577f Imported Benchmark_gparity 2017-08-23 16:54:06 -04:00
Christopher Kelly
ce5df177ee Removed superfluous implementation of G-parity twist for hand-unrolled kernel from GparityWilsonImpl 2017-08-23 15:05:22 -04:00
Christopher Kelly
a0bb8e5b46 Added hand-unrolled kernel implementations of all the other dslash precision / comms precision combinations with G-parity 2017-08-23 14:44:40 -04:00
Christopher Kelly
46f88e6d72 G-parity hand-unrolled intrinsics twist now uses one less permute and one less temporary 2017-08-23 13:21:10 -04:00
David Murphy
dd8f1ea189 Vectorized Mobius EOFA Dperp + shift operation 2017-08-23 13:17:26 -04:00
Christopher Kelly
b61835c1a5 Added inplace version of intrinsic G-parity twist to hand-unrolled kernel 2017-08-23 12:33:48 -04:00
Azusa Yamaguchi
d9cd4f0273 Staggered multinode block cg debugged. Missing global sum.
Code stalls and resumes on KNL at cambridge. Curious.

CG iterations 23ms each, then 3200 ms pauses. Mean bandwidth reports
as 200MB/s. Comms dominant in the report. However, the time behaviour suggests it
is *bursty*.... Could be swap to disk?
2017-08-23 15:07:18 +01:00
David Murphy
459f70e8d4 Check-in of working Mobius EOFA class and tests 2017-08-22 22:38:30 -04:00
Christopher Kelly
061e48fd73 Replaced slow unpack-repack in G-parity BC twist with intrinsics version 2017-08-22 18:12:12 -04:00
Christopher Kelly
ab50145001 Implemented first, unoptimized version of hand-unrolled G-parity kernels
Improved Test_gparity
2017-08-22 17:12:25 -04:00
paboyle
b49bec0cec MAP_HUGETLB portability fix 2017-08-20 03:08:54 +01:00
paboyle
ae56e556c6 finalise issue on new OPA revert 2017-08-20 02:53:12 +01:00
paboyle
1cdf999668 Moving multicommunicator into mpi3 also for threading 2017-08-20 02:39:10 +01:00
paboyle
11062fb686 Comms none fail fix 2017-08-20 01:37:07 +01:00
paboyle
383ca7d392 Switch off comms for now until feature/multi-communicator is merged 2017-08-20 01:27:48 +01:00
paboyle
a446d95c33 Trying to pass TeamCity and Travis 2017-08-20 01:10:50 +01:00
paboyle
be66e7dd95 Merge branch 'develop' into feature/multi-communicator 2017-08-19 23:12:38 +01:00
paboyle
6d0d064a6c Update TODO 2017-08-19 23:11:30 +01:00
paboyle
bfef525ed2 New benchmark prep 2017-08-19 23:10:12 +01:00
Peter Boyle
0b0cf62193 Fix mpi 3 interface change 2017-08-19 13:18:50 -04:00
Peter Boyle
7d88198387 Merge branch 'develop' into feature/multi-communicator 2017-08-19 13:03:35 -04:00
Peter Boyle
2f619482b8 Enable blocking stencil send 2017-08-19 12:53:59 -04:00
Peter Boyle
d6472eda8d Use mmap 2017-08-19 12:53:18 -04:00
Peter Boyle
9e658de238 Use Vector 2017-08-19 12:52:44 -04:00
Peter Boyle
bcefdd7c4e Align both allocator calls to 2MB 2017-08-19 12:49:02 -04:00
David Murphy
9d45fca8bc Implement MobiusEOFAFermioncache.cc 2017-08-17 23:45:36 -04:00
David Murphy
ac9e6b63c0 More re-import of Mobius EOFA 2017-08-17 19:28:53 -04:00
David Murphy
e140b3f802 Beginning to re-import Mobius EOFA 2017-08-16 23:36:23 -04:00
David Murphy
d9d3d30cc7 Minor clean-up 2017-08-16 20:57:51 -04:00
David Murphy
47a12ec7b5 Implement EOFA pseudofermion force and Shamir tests for G-parity and non G-parity cases 2017-08-16 19:50:08 -04:00
David Murphy
ec1e2f7a40 Add (mostly implemented) ExactOneFlavourRatio pseudofermion class and tests of Shamir heatbath and action 2017-08-16 12:38:59 -04:00
David Murphy
41f73ec083 Add ChronoForecast class for forecasting solutions across poles in the EOFA heatbath 2017-08-16 12:37:38 -04:00
Guido Cossu
fd367d8bfd Debugging the PointerCache 2017-08-16 09:42:57 +01:00
David Murphy
6d0786ff9d Typo fixes and check-in of G-parity action test for DWF 2017-08-15 22:47:00 -04:00
David Murphy
b7f93aeb4d Change CayleyFermion5D::SetCoefficientsInternal to virtual to allow overriding in derived EOFA classes 2017-08-15 14:18:51 -04:00
David Murphy
202a7fe900 Re-import DWF and abstract base EOFA fermion classes and tests 2017-08-15 13:36:08 -04:00
Guido Cossu
8d168ded4a Correction of the dagger version of the Clover 2017-08-15 10:50:44 +01:00
Guido Cossu
8a3fe60a27 Added more asserts at grid creation time 2017-08-08 11:36:20 +01:00
Guido Cossu
44051aecd1 Checking for integer divisions in cartesian full 2017-08-08 10:31:12 +01:00
Guido Cossu
06e6f8de00 Check that the reduced dim is an integer 2017-08-08 10:22:12 +01:00
Guido Cossu
dbe4d7850c Make a test file compatible with all architectures 2017-08-06 10:49:45 +01:00
Guido Cossu
4fe182e5a7 Added high level HMC support for overriding default SIMD lane decomposition 2017-08-06 10:46:19 +01:00
Guido Cossu
75ee6cfc86 Debugging the Clover term 2017-08-04 16:08:07 +01:00
Guido Cossu
fde71c3c52 Merge branch 'develop' into feature/clover 2017-08-04 12:19:57 +01:00
Guido Cossu
175f393f9d Binary IO error checking 2017-08-04 12:14:10 +01:00
Christopher Kelly
7d867a8134 Merge branch 'develop' into feature/CG-reliable-update 2017-08-02 09:48:04 -04:00
Christopher Kelly
9939b267d2 Added switching to fallback linear operator in reliable update CG, and added recalculation of b parameter on update. 2017-07-31 13:39:44 -04:00
Lanny91
323e9c439a Hadrons: Legal banner fixes 2017-07-31 12:26:34 +01:00
Lanny91
28396f1048 Merge branch 'feature/rare_kaon' of https://github.com/Lanny91/Grid into feature/hadrons 2017-07-31 12:19:54 +01:00
Lanny91
67b34e5789 Modified conserved current 5th dimension loop for compatibility with 5D vectorisation. 2017-07-31 11:35:01 +01:00
Peter Boyle
14d53e1c9e Threaded MPI calls patches 2017-07-29 13:08:10 -04:00
Guido Cossu
8bd869da37 Correcting a bug in the IO routines 2017-07-27 15:12:50 +01:00
Guido Cossu
c7036f6717 Adding checks for libm and libstdc++ 2017-07-27 11:15:09 +01:00
Guido Cossu
c0485d799d Explicit parameter declaration in the WilsonGauge test 2017-07-26 16:26:04 +01:00
Guido Cossu
7abc5613bd Added smearing to the topological charge observable 2017-07-26 16:21:17 +01:00
Guido Cossu
237cfd11ab Solving the spurious O2 flags 2017-07-26 12:08:51 +01:00
Guido Cossu
a4b7dddb67 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-07-26 12:07:38 +01:00
Guido Cossu
5696781862 Debug error in Tensor mult 2017-07-26 12:07:34 +01:00
Christopher Kelly
8f4b3049cd Merge branch 'feature/CG-reliable-update' into ckelly_develop 2017-07-25 11:55:26 -04:00
Christopher Kelly
2a6e673a91 Merge branch 'develop' into feature/CG-reliable-update 2017-07-25 11:54:43 -04:00
Christopher Kelly
9b6cde173f Merge branch 'feature/CG-reliable-update' into ckelly_develop 2017-07-25 11:51:08 -04:00
Christopher Kelly
9f280b82c4 Added mixed-precision CG with reliable updates 2017-07-25 11:30:41 -04:00
c3f0889eda Merge pull request #123 from giltirn/develop
Fix for 'using namespace' in lib/qcd/utils/GaugeFix.h
2017-07-25 11:32:02 -03:00
Nils Meyer
7a53dc3715 Added integer reduce functionality 2017-07-24 11:12:59 +02:00
Christopher Kelly
0f214ad427 Moved FourierAcceleratedGaugeFixer into Grid::QCD namespace and removed 'using namespace' directives from header 2017-07-21 11:13:51 -04:00
Peter Boyle
fe4912880d Update README.md 2017-07-17 09:53:07 +01:00
Lanny91
875e1a841f Hadrons: updated Quark -> MFermion/GaugeProp module name in test. 2017-07-16 13:47:00 +01:00
Lanny91
0366288b1c Hadrons: added tests for 3pt contractions. 2017-07-16 13:45:55 +01:00
Lanny91
6293d438cd Hadrons: sink smearing compatibility for 3pt contraction modules. 2017-07-16 13:43:25 +01:00
Lanny91
852ade029a Hadrons: Added module to sink a propagator 2017-07-16 13:41:47 +01:00
Peter Boyle
f038c6babe Update README.md 2017-07-14 22:59:16 +01:00
Peter Boyle
169f4b2711 Update README.md 2017-07-14 22:56:06 +01:00
Peter Boyle
2d8aff36fe Update README.md 2017-07-14 22:52:16 +01:00
Guido Cossu
9fa07eecde Merge branch 'develop' into feature/json-fix 2017-07-12 15:47:22 +01:00
azusayamaguchi
659d7d1a40 For test/solver
Fixed
2017-07-12 15:01:48 +01:00
Guido Cossu
f64fb7bd77 Fix gcc error on JSON compilation 2017-07-12 14:55:42 +01:00
Guido Cossu
2a35449b91 Merge branch 'develop' into feature/json-fix 2017-07-12 14:47:00 +01:00
Guido Cossu
184af5bd05 Added support for std::pair in the JSON serialiser 2017-07-12 14:44:53 +01:00
Guido Cossu
097c9637ee Fixed the JSON parsing error 2017-07-11 14:31:57 +01:00
azusayamaguchi
dc6f078246 fixed the header file for mpi3 2017-07-11 14:15:08 +01:00
Peter Boyle
8a4714a4a6 Update README.md 2017-07-09 00:11:54 +01:00
Peter Boyle
40e119c61c NUMA improvements worth preserving from AMD EPYC tests 2017-07-08 22:27:11 -04:00
Guido Cossu
d9593c4b81 Merge branch 'develop' into feature/json-fix 2017-07-07 14:17:50 +01:00
paboyle
ac740f73ce Works on Cori 2017-07-02 16:47:58 -07:00
paboyle
75dc7794b9 Working on Cori 2017-07-02 16:47:42 -07:00
paboyle
dee68fc728 IO working multiple nodes again. Strategy of all nodes writing metadata is unsafe.
Only one rank should do this. must identify this rank. Means pass communicator to the
Objects.
2017-07-02 23:33:48 +01:00
paboyle
a2d3643634 Merge branch 'feature/dwf-multirhs' of https://github.com/paboyle/Grid into feature/dwf-multirhs 2017-07-02 14:59:22 -07:00
paboyle
57002924bc NERSC shakeout of this 2017-07-02 14:58:30 -07:00
Peter Boyle
7b0237b081 Update README.md 2017-07-01 10:24:41 +01:00
Peter Boyle
b68ad0cc0b Update README.md 2017-07-01 10:20:07 +01:00
Peter Boyle
37263fd9b1 Update README.md 2017-07-01 10:06:24 +01:00
Peter Boyle
3d09e3e9e0 Update README.md 2017-07-01 10:05:46 +01:00
Peter Boyle
1354b46338 Update README.md 2017-07-01 10:04:32 +01:00
Peter Boyle
251a97fe1b Update README.md 2017-07-01 09:55:36 +01:00
Peter Boyle
e18929eaa0 Update README.md 2017-07-01 09:53:15 +01:00
Peter Boyle
f3b0a92e71 Update README.md 2017-07-01 09:48:00 +01:00
Peter Boyle
a0be3f7330 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-06-30 10:53:50 +01:00
Peter Boyle
b5a6e4f1fd Best option for Xeon cache blocking set 2017-06-30 10:53:22 +01:00
Peter Boyle
7a788db3dc Guard first touch 2017-06-30 10:49:08 +01:00
Peter Boyle
f20eceb6cd First touch once per page in a threaded loop 2017-06-30 10:48:27 +01:00
Peter Boyle
38325ebbc6 Interleave code path; not enabled 2017-06-30 10:23:51 +01:00
Peter Boyle
b73bd151bb Switch off counters by default 2017-06-30 10:16:35 +01:00
Peter Boyle
694b305cab Update to reporting 2017-06-30 10:16:13 +01:00
Peter Boyle
2d3737a133 O3, KNL 2017-06-30 10:15:59 +01:00
Peter Boyle
ac1f1838bc KNL only 2017-06-30 10:15:32 +01:00
Guido Cossu
09d09d0fe5 Update README.md 2017-06-29 11:48:11 +01:00
Guido Cossu
bf630a6821 README file update 2017-06-29 11:42:25 +01:00
Guido Cossu
8859a151cc Small corrections to the NEON port 2017-06-29 11:30:29 +01:00
Guido Cossu
688a39cfd9 Merge pull request #114 from nmeyer-ur/feature/arm-neon
ARM neon intrinsics support
Guido: checked and approved
2017-06-29 09:57:17 +01:00
paboyle
6f5a5cd9b3 Improved threaded comms benchmark 2017-06-28 23:27:02 +01:00
Nils Meyer
0933aeefd4 corrected Grid_neon.h 2017-06-28 20:22:22 +02:00
Peter Boyle
322f61acee Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-06-28 15:30:35 +01:00
Peter Boyle
08e04b9676 Better benchmarks 2017-06-28 15:30:06 +01:00
feaa2ac947 Merge branch 'feature/scalar-hmc-update' into develop 2017-06-28 12:46:18 +01:00
07de925127 minor scalar action fixes 2017-06-28 12:45:44 +01:00
Nils Meyer
a9c816a268 moved file to correct folder 2017-06-27 21:39:15 +02:00
Nils Meyer
e43a8b6b8a removed comments 2017-06-27 20:58:48 +02:00
Nils Meyer
bf729766dd removed collision with QPX implementation 2017-06-27 20:32:24 +02:00
Guido Cossu
dafb351d38 Merge pull request #120 from paboyle/feature/scalar-hmc-update
Scalar HMC update. 
I agree with the changes.
2017-06-27 16:23:14 +01:00
0b707b861c Merge branch 'develop' into feature/scalar-hmc-update 2017-06-27 14:40:05 +01:00
15e87a4607 HDF5 IO fix 2017-06-27 14:39:27 +01:00
7d7220cbd7 scalar: lambda/4! convention 2017-06-27 14:38:45 +01:00
Lanny91
7d2d5e8d3d Merge branch 'develop' of https://github.com/paboyle/Grid into feature/hadrons 2017-06-26 15:19:46 +01:00
paboyle
54e94360ad Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit 2017-06-24 23:10:24 +01:00
0af740dc15 minor scalar HMC code improvement 2017-06-24 23:04:05 +01:00
d2e8372df3 SU(N) algebra fix (was not working) 2017-06-24 23:03:39 +01:00
paboyle
869b99ec1e Threaded calls to multiple communicators 2017-06-24 10:55:54 +01:00
paboyle
4a29ab0d0a Merge branch 'feature/dwf-multirhs' of https://github.com/paboyle/Grid into feature/dwf-multirhs 2017-06-23 23:10:43 +01:00
paboyle
0165bcb58e Added an update to TODO list 2017-06-23 23:10:24 +01:00
Lanny91
deca1ecc50 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/rare_kaon 2017-06-23 19:35:19 +02:00
4372d04ad4 Merge pull request #118 from Lanny91/hotfix/bgq
Hotfix/bgq
2017-06-23 16:59:27 +01:00
paboyle
349d75e483 Precision fix 2017-06-23 02:57:59 -07:00
Lanny91
56abbdf4c2 AVX512 integer reduce fix (for non-intel compiler) 2017-06-23 11:09:14 +02:00
Lanny91
af71c63f4c AVX2 fix 2017-06-23 11:03:12 +02:00
paboyle
e51475703a Ticking off lots on the TODO list 2017-06-23 09:42:21 +01:00
paboyle
1feddf4ba6 const fixes 2017-06-22 19:32:41 +01:00
paboyle
600d7ddc2e Proof of concept : Multi RHS solver, running independent solves on different ranks 2017-06-22 18:54:34 +01:00
paboyle
e504260f3d Able to run a test job splitting into multiple MPI subdomains. 2017-06-22 18:53:11 +01:00
Lanny91
0440d4ce66 Merge branch 'develop' of https://github.com/paboyle/Grid into hotfix/bgq 2017-06-22 17:09:42 +02:00
Lanny91
08b0e472aa Fixed hadrons tests after merge 2017-06-22 16:34:33 +02:00
Lanny91
c11d69787e Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/rare_kaon
# Conflicts:
#	extras/Hadrons/Modules.hpp
#	extras/Hadrons/Modules/MFermion/GaugeProp.hpp
#	extras/Hadrons/modules.inc
#	tests/hadrons/Test_hadrons.hpp
#	tests/hadrons/Test_hadrons_meson_3pt.cc
2017-06-22 16:26:31 +02:00
Lanny91
dc6b2d30d2 Documentation fix 2017-06-22 16:09:45 +02:00
Lanny91
7a3bd5c66c Hadrons: new conserved current contraction test (for regression testing) 2017-06-22 16:06:15 +02:00
Lanny91
18211eb5b1 Hadrons: Fixed test to use new implementation of meson module. 2017-06-22 16:03:59 +02:00
Lanny91
863bb2ad10 Moving overly-specialised code out of Grid 2017-06-22 16:02:15 +02:00
paboyle
5e4bea8f20 Benchmark DWF works 2017-06-22 08:38:54 +01:00
paboyle
6ebf9f15b7 Splitting communicators first cut 2017-06-22 08:14:34 +01:00
paboyle
1d7aa673a4 Include BlockCG by default 2017-06-21 21:08:53 +01:00
paboyle
b9104f3072 Block CG 2017-06-21 21:08:03 +01:00
b22eab8c8b Merge commit 'a7d56523abee6c9030fdd9303c79954897b1086f' into feature/hadrons 2017-06-21 18:32:48 +01:00
paboyle
a7d56523ab Merge branch 'feature/lanczos-simplify' into develop 2017-06-21 14:03:20 +01:00
paboyle
9e56c65730 Updated TODO list 2017-06-21 14:02:58 +01:00
paboyle
ef4f2b8c41 todo update 2017-06-21 09:22:20 +01:00
paboyle
e8b95bd35b Clean up finished. Could shrink Lanczos to around 400 lines at a push 2017-06-21 02:50:09 +01:00
paboyle
7e35286860 Simplified lanczos, added Eigen diagonalisation.
Curious if we can deprecate dependencly on BLAS.
Will see when we get 48^3 running on our BG/Q port
2017-06-21 02:26:03 +01:00
paboyle
0486ff8e79 Improved the lancos 2017-06-20 18:46:01 +01:00
1e8a2e1621 various compatibility fixes after merge 2017-06-20 17:24:55 +01:00
7587df831a Merge branch 'develop' into feature/hadrons
# Conflicts:
#	lib/qcd/action/scalar/ScalarImpl.h
2017-06-20 15:50:39 +01:00
Azusa Yamaguchi
e9cc21900f Block solver complete for staggered. Now stable on mass 0.003 and
gives 8x (!) speed up on Haswell laptop vs. standard CG for 8 RHS solves.

166 iterations vs. 537 iterations so algorithmic gain + 2x in flop rate gain.

Better than a slap in the face with a wet kipper.
2017-06-20 12:37:41 +01:00
Azusa Yamaguchi
0a8faac271 Fix make tests compile 2017-06-19 22:54:18 +01:00
Azusa Yamaguchi
abc4de0fd2 No compile make tests fix 2017-06-19 22:03:03 +01:00
b672717096 Test_serialiation update for JSON 2017-06-19 14:38:39 +01:00
284ee194b1 JSON update 2017-06-19 14:38:15 +01:00
Azusa Yamaguchi
cfe3cd76d1 Block solver improvements 2017-06-19 14:04:21 +01:00
Azusa Yamaguchi
3fa5e3109f Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-06-19 14:01:44 +01:00
paboyle
8b7049f737 Improved detectino of usqcdInfo for plaq/linktr 2017-06-19 08:46:07 +01:00
paboyle
c85024683e Merge branch 'feature/parallelio' into develop 2017-06-19 01:39:48 +01:00
paboyle
1300b0b04b Update to enable multiple records per file more consistent with SciDAC.
open, close, write records...
2017-06-19 01:01:48 +01:00
paboyle
e6d984b484 ILDG tests 2017-06-18 00:13:22 +01:00
paboyle
1d18d95d4f Class name return 2017-06-18 00:13:03 +01:00
paboyle
ae39ec85a3 ComplexField defined 2017-06-18 00:12:48 +01:00
paboyle
b96daf53a0 Query tensor structures 2017-06-18 00:12:15 +01:00
paboyle
46879e1658 Complex defined in Impl even for gauge. 2017-06-18 00:11:45 +01:00
paboyle
ae4de94798 SciDAC I/O support 2017-06-18 00:11:23 +01:00
paboyle
0ab555b4f5 SciDAC I/O and ILDG improvements 2017-06-18 00:11:02 +01:00
paboyle
8e9be9f84f Updates for SciDAC IO 2017-06-18 00:10:42 +01:00
paboyle
d572170170 Update for SciDAC 2017-06-18 00:10:20 +01:00
81b18f843a Merge branch 'feature/scalar_adjointFT' into feature/hadrons
# Conflicts:
#	lib/qcd/action/scalar/ScalarImpl.h
2017-06-16 17:59:55 +01:00
Lanny91
1bd311ba9c Faster sequential conserved current implementation, now compatible with 5D vectorisation & G-parity. 2017-06-16 16:43:15 +01:00
Lanny91
41af8c12d7 Code cleaning for conserved current contractions. Will now be easier to implement mobius conserved current. 2017-06-16 16:38:59 +01:00
Lanny91
a833f88c32 Added missing SIMD integer reduction implementation for AVX, AVX-512, SSE4, IMCI 2017-06-16 15:58:47 +01:00
Lanny91
07b2c1b253 Placeholder precision change functions to allow Grid to compile with QPX (warning: no actual functionality) 2017-06-16 15:04:26 +01:00
Lanny91
735cbdb983 QPX Integer reduction (+ integer reduction test) 2017-06-14 10:55:10 +01:00
Lanny91
2ad54c5a02 QPX exchange support 2017-06-14 10:53:39 +01:00
paboyle
12ccc73cf5 Serialisation no compile fix 2017-06-14 05:19:17 +01:00
Nils Meyer
3d04dc33c6 ARM neon intrinsics support 2017-06-13 13:26:59 +02:00
paboyle
e7564f8330 Starting a test for reading an ILDG file. 2017-06-13 12:22:50 +01:00
paboyle
91199a8ea0 openmpi is not const safe 2017-06-13 12:21:29 +01:00
paboyle
0494feec98 Libz dependency 2017-06-13 12:00:23 +01:00
paboyle
a16b1e134e gcc 4.9 fix 2017-06-13 10:48:43 +01:00
James Harrison
20e92a7009 QedVFol: Allow output of scalar propagator and vacuum polarisation projected to arbitrary lattice momentum, not just zero-momentum. 2017-06-12 18:27:32 +01:00
Lanny91
5633a2db20 Faster implementation of conserved current site contraction. Added 5D vectorised support, but not G-parity. 2017-06-12 10:41:02 +01:00
Lanny91
2d433ba307 Changed header include guards to match new convention 2017-06-12 10:32:14 +01:00
paboyle
769ad578f5 Odd new error on G++ 49 on travis 2017-06-12 00:41:21 +01:00
paboyle
eaac0044b5 Compile fixes 2017-06-12 00:20:49 +01:00
paboyle
56042f002c New files 2017-06-11 23:19:20 +01:00
paboyle
3bfd1f13e6 I/O improvements 2017-06-11 23:14:10 +01:00
James Harrison
42f0afcbfa QedFVol: Output all scalar VP diagrams separately 2017-06-09 18:08:40 +01:00
Azusa Yamaguchi
70ab598c96 Move gfix into utils 2017-06-08 22:22:23 +01:00
Azusa Yamaguchi
1d0ca65e28 Move Gfix into utils 2017-06-08 22:21:50 +01:00
Azusa Yamaguchi
2bc4d0a20e Move code into utils 2017-06-08 22:21:25 +01:00
James Harrison
20ac13fdf3 QedFVol: add ChargedProp as an input to ScalarVP module, instead of calculating scalar propagator within ScalarVP. 2017-06-08 17:43:39 +01:00
2490816297 Hadrons: rare kaon program removed 2017-06-07 20:11:02 -05:00
5f55bca378 Hadrons: Quark module renamed MFermion::GaugeProp 2017-06-07 20:10:48 -05:00
James Harrison
e38612e6fa QedFVol: Update ScalarVP module for compatibility with new scalar action 2017-06-07 17:42:00 +01:00
James Harrison
c2b2b71c5d Merge branch 'feature/qed-fvol' of https://github.com/paboyle/Grid into feature/qed-fvol
# Conflicts:
#	extras/Hadrons/Modules.hpp
#	extras/Hadrons/modules.inc
2017-06-07 16:59:47 +01:00
James Harrison
009f48a904 QedFVol: Add missing factor of 2 in free vacuum polarisation 2017-06-07 16:34:09 +01:00
Lanny91
b8e45ae490 Fixed remaining fermion type aliases after merge. 2017-06-07 16:26:22 +01:00
Lanny91
b35fc4e7f9 Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/rare_kaon
# Conflicts:
#	extras/Hadrons/Global.hpp
#	tests/hadrons/Test_hadrons_rarekaon.cc
2017-06-07 14:38:51 +01:00
Lanny91
60f11bfd72 Removed redundant test module 2017-06-07 12:34:47 +01:00
f6aa82b7f2 Merge branch 'develop' into feature/hadrons 2017-06-06 11:46:33 -05:00
22749699a3 Fixes after merge and point sink module 2017-06-06 11:45:30 -05:00
Lanny91
8d442b502d Sequential current fix for spacial indices. 2017-06-06 17:06:40 +01:00
Lanny91
e5c8b7369e Boundary condition option in quark actions for hadrons tests. 2017-06-06 14:19:10 +01:00
0503c028be Merge branch 'feature/qed-fvol' into feature/hadrons (non-trivial conflicts on scalar Impl)
# Conflicts:
#	configure.ac
#	lib/qcd/action/scalar/Scalar.h
2017-06-05 16:37:47 -05:00
Lanny91
c504b4dbad Code cleaning 2017-06-05 15:56:43 +01:00
Lanny91
622a21bec6 Improvements to sequential conserved current test and small bugfix. 2017-06-05 15:55:32 +01:00
Lanny91
eec79e0a1e Ward Identity test improvements and conserved current bug fixes 2017-06-05 11:55:41 +01:00
paboyle
092dcd4e04 MPI I/O only if MPI compiled 2017-06-02 22:50:25 +01:00
Guido Cossu
4a8c4ccfba Test wilson flow, added maxTau for adaptive flow 2017-06-02 17:02:29 +01:00
Guido Cossu
9b44189d5a Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-06-02 16:56:00 +01:00
Guido Cossu
7da4856e8e Wilson flow with adaptive steps 2017-06-02 16:55:53 +01:00
Guido Cossu
aaf1e33a77 Adding adaptive integration in the WilsonFlow 2017-06-02 16:32:35 +01:00
paboyle
094c3d091a Improved and RNG's now survive checkpoint 2017-06-02 00:38:58 +01:00
Peter Boyle
4b98e524a0 Roll over to MPI version of I/O 2017-06-01 17:38:18 -04:00
Peter Boyle
1a1f6d55f9 Roll over to MPI IO for parallel IO 2017-06-01 17:37:26 -04:00
Peter Boyle
21421656ab Big changes improving the code to use MPI IO 2017-06-01 17:36:53 -04:00
Peter Boyle
6f687a67cd As local vols increase, use 64 bits for safety 2017-06-01 17:36:18 -04:00
paboyle
b30754e762 Merge branch 'feature/parallelio' of https://github.com/paboyle/Grid into feature/parallelio 2017-05-30 23:41:28 +01:00
paboyle
1e429a0d57 Added MPI version 2017-05-30 23:41:07 +01:00
paboyle
d38a4de36c Beginning move to MPI IO 2017-05-30 23:40:39 +01:00
paboyle
ef1b7db374 Diff comparison check 2017-05-30 23:40:11 +01:00
paboyle
53a9aeb965 Cosmetic only 2017-05-30 23:39:53 +01:00
paboyle
e30fa9f4b8 RankCount; need to clean up ambigious ProcessCount 2017-05-30 23:39:16 +01:00
paboyle
58e8d0a10d reverse direction lexico mapping 2017-05-30 23:38:30 +01:00
paboyle
62cf9cf638 Cleaner code 2017-05-30 23:38:02 +01:00
paboyle
0fb458879d Precision safe compile 2017-05-30 23:37:02 +01:00
Peter Boyle
725c513d94 Better MPI3 benchmarking 2017-05-29 16:47:32 -04:00
d8648307ff Merge branch 'develop' into feature/hadrons 2017-05-29 12:58:08 +01:00
064315c00b Hadrons: mesons gamma list fix 2017-05-29 12:57:33 +01:00
Guido Cossu
7c6cc85df6 Updating WilsonFlow test 2017-05-27 18:03:49 +01:00
Guido Cossu
a6691ef87c Merge pull request #110 from Lanny91/feature/hadrons
Hadrons: Fermion boundary conditions can now be set in measurement code.
2017-05-26 16:43:22 +01:00
Lanny91
23135aa58a Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/rare_kaon 2017-05-26 16:00:50 +01:00
Lanny91
8e0ced627a Hadrons: Fermion boundary conditions can now be set in measurement code. 2017-05-26 15:59:15 +01:00
Guido Cossu
0de314870d Faster derivative for WilsonGauge 2017-05-26 14:31:49 +01:00
Guido Cossu
ffb91e53d2 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-05-26 12:46:02 +01:00
Guido Cossu
f4e8bf2858 Fixing the topological charge. Wilson Flow tested, ok 2017-05-26 12:45:59 +01:00
a74c34315c Bootstrap script fix 2017-05-25 14:27:49 +01:00
paboyle
69470ccc10 Update to do list 2017-05-25 13:41:26 +01:00
paboyle
b8b5934193 Attempts to speed up the parallel IO 2017-05-25 13:32:24 +01:00
Guido Cossu
75856f2945 Compilation fix in the Tensor_exp 2017-05-25 12:44:56 +01:00
Guido Cossu
3c112a7a25 Small correction to the general exp definition 2017-05-25 12:09:00 +01:00
Guido Cossu
ab3596d4d3 Using Cayley-Hamilton form for the exponential of SU(3) matrices 2017-05-25 12:07:47 +01:00
paboyle
a8c10b1933 Use a global-X x Local-Y chunksize for parallel binary I/O.
Gives O(32 x 8 x 18*8*8) chunk size on configuration I/O.

At 150KB should be getting close to packet sizes and 4MB filesystem
block sizes that are reasonably (!?) performant. We shall see once I move
this off my laptop and over to BNL and time it.
2017-05-25 11:43:33 +01:00
Guido Cossu
15e801af3f Fixing a compilation error for generic SIMD 2017-05-19 16:39:36 +01:00
Guido Cossu
0ffc235741 Adding more statistics to the Benchmark_comms. Min and max 2017-05-19 10:55:04 +01:00
Guido Cossu
8e19c99c7d Adding more statistical info in the Benchmark_comms 2017-05-18 19:07:35 +01:00
Guido Cossu
a0bc0ad06f Reverting change in Bechmark_comms. Keeping 300 iterations 2017-05-18 17:48:11 +01:00
Guido Cossu
a8fb2835ca Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-05-18 14:45:00 +01:00
Guido Cossu
bc862ce3ab Fixing an allocation issue in Benchmark_comms 2017-05-18 14:44:56 +01:00
Lanny91
08b314fd0f Hadrons: conserved current test fixes. Axial current tests now also optional. 2017-05-18 13:16:14 +01:00
22f4feee7b Merge branch 'develop' into feature/scalar_adjointFT 2017-05-17 13:27:13 +02:00
3f858d6755 Scalar: phi^2 observable 2017-05-17 13:25:14 +02:00
paboyle
3267683e22 Union workaround for g++ 2017-05-17 11:26:18 +01:00
Azusa Yamaguchi
f46a67ffb3 No compile issue on clang on mac fixed.
Compiler version was clang++-3.9 under mpicxx
2017-05-17 10:51:01 +01:00
paboyle
f7b8383ef5 Half precisoin comms mixed prec test 2017-05-16 14:52:51 +01:00
Guido Cossu
10f2872aae Faster exponentiation for lattice fields 2017-05-15 15:51:16 +01:00
Lanny91
34332fe393 Improvement to sequential conserved current insertion tests 2017-05-12 16:30:43 +01:00
Lanny91
c2010f21ab Added sequential propagator test for gamma matrix insertion 2017-05-12 16:23:01 +01:00
Lanny91
98f610ce53 Reduced code duplication in hadron tests 2017-05-12 16:15:26 +01:00
Lanny91
d44cc204d1 Added test module for sequential gamma matrix insertion 2017-05-12 14:58:17 +01:00
35fa3d1dfd Merge branch 'master' into feature/scalar_adjointFT 2017-05-12 10:41:39 +01:00
paboyle
cd73897b8d Merge branch 'release/v0.7.0' into develop 2017-05-12 01:16:02 +01:00
paboyle
c4435e6beb Merge branch 'release/v0.7.0' 2017-05-12 01:15:59 +01:00
paboyle
7a8f6af5f8 Drop verbose compiler predefine check 2017-05-11 12:48:40 +01:00
paboyle
49a5d9bac7 Clang major, minor trailing underscore 2017-05-11 12:25:02 +01:00
paboyle
2b3fdd4a58 Print CXX predefines 2017-05-11 12:05:50 +01:00
paboyle
34502ec471 4.8 dropped as buggy. 2017-05-11 11:43:39 +01:00
paboyle
8a43e88b4f Compiler check early in build 2017-05-11 11:43:06 +01:00
d1ece74137 HMC scalar test: magnetisation measurement 2017-05-11 11:40:44 +01:00
paboyle
238df20370 Still working on the compiler compat checks 2017-05-11 11:30:14 +01:00
paboyle
97a32a6145 Add 4.8 test 2017-05-11 11:24:21 +01:00
paboyle
655492a443 Compiler detection 2017-05-11 11:21:11 +01:00
paboyle
1cab06f6bd Compat checks for compilers 2017-05-11 10:20:24 +01:00
43c817cc67 Scalar action: const fix 2017-05-11 00:07:17 +01:00
paboyle
f8024c262b Update Eigen 2017-05-10 13:30:09 +01:00
Guido Cossu
4cc5f01f4a Small change in the readme about the intel compiler 2017-05-09 15:38:59 +01:00
James Harrison
5cfc0180aa QedFVol: Output free VP along with charged VP. 2017-05-09 12:46:57 +01:00
James Harrison
914f180fa3 QedFVol: Implement exact O(alpha) vacuum polarisation. 2017-05-09 11:46:25 +01:00
Guido Cossu
9c12c37aaf Confirming the fix on the complex boundary conditions 2017-05-09 08:41:29 +01:00
Guido Cossu
806eaa0530 Adding back the IO tests in the list 2017-05-08 22:26:44 +01:00
Guido Cossu
01d0e54594 Merge branch 'release/v0.7.0' into develop 2017-05-08 22:02:51 +01:00
Guido Cossu
5aafa335fe Fixing JSON error for complex numbers 2017-05-08 21:56:44 +01:00
Guido Cossu
8ba0494485 Fixing JSON for complex numbers 2017-05-08 21:41:39 +01:00
Peter Boyle
d99d98d9fd Merge branch 'release/v0.7.0' of https://github.com/paboyle/Grid into release/v0.7.0 2017-05-08 15:08:20 -04:00
Peter Boyle
95a017a4ae Relax force constraints to pass in single precision. 2017-05-08 15:06:41 -04:00
paboyle
92f92379e6 Adding olivers test version 2017-05-08 18:42:19 +01:00
paboyle
529e78d43f Restart the v0.7.0 release 2017-05-08 18:20:04 +01:00
paboyle
4ec746d262 Merge branch 'release/v0.7.0' into develop 2017-05-06 18:43:03 +01:00
paboyle
51bf1501fc Merge branch 'release/v0.7.0' 2017-05-06 18:42:50 +01:00
paboyle
66d819c054 More info on gcc bug 2017-05-06 18:42:11 +01:00
paboyle
3f3686f869 formattign 2017-05-06 18:41:27 +01:00
paboyle
26bb829f8c Formatting 2017-05-06 18:40:55 +01:00
paboyle
67cb04fc66 README update 2017-05-06 18:39:54 +01:00
paboyle
a40bd68aed Version update 2017-05-06 17:00:14 +01:00
paboyle
36495e0fd2 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-05-06 16:39:27 +01:00
paboyle
93f6c15772 Warning squash 2017-05-06 16:38:58 +01:00
Peter Boyle
cb93eeff21 Update README 2017-05-06 16:28:12 +01:00
paboyle
c7cc7e6101 Fix 2017-05-06 16:10:09 +01:00
paboyle
c349aa6511 DEFINE warning elimination 2017-05-06 16:08:35 +01:00
paboyle
3bae0a2d5c Drop a gcc warning 2017-05-06 15:51:42 +01:00
paboyle
c1c7566089 GCC bug work around in 5.0 through 6.2 inclusive. 2017-05-06 15:20:25 +01:00
paboyle
2439999ec8 Warning elimination; drop to -O2 on G++ bad versions 2017-05-06 14:44:49 +01:00
paboyle
1d96f662e3 Fixed 4d fermion gparity force. Put strong tests on make check force tests 2017-05-06 00:46:31 +01:00
paboyle
41d1889941 trusty ubuntu 2017-05-05 21:25:35 +01:00
paboyle
0c3981e0c3 Trying to force recent automake 2017-05-05 21:15:22 +01:00
paboyle
c727bd4609 Trying to work around automake version 2017-05-05 21:00:00 +01:00
paboyle
db23749b67 Adding travis to make check 2017-05-05 20:42:08 +01:00
paboyle
751f2b9703 Better check and benchmark driving 2017-05-05 19:54:38 +01:00
Guido Cossu
741bc836f6 Exposing support for Ncolours and Ndimensions and JSON input file for the ScalarAction 2017-05-05 17:36:43 +01:00
James Harrison
6cb563a40c QedFVol: Access HVP tensor using a vector<vector<ScalarField>> instead of vector<vector<ScalarField*>> 2017-05-05 17:12:41 +01:00
paboyle
697c0603ce SITMO I/O for NERSC working now bit repro 2017-05-05 16:54:44 +01:00
paboyle
14bedebb11 Source pointed to 2017-05-05 16:17:27 +01:00
Guido Cossu
8546d01a4c Merge branch 'develop' into feature/scalar_adjointFT 2017-05-05 15:47:33 +01:00
paboyle
47b5c07ffb Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-05-05 14:27:02 +01:00
Guido Cossu
da86a2bf54 Merge branch 'feature/hmc_generalise' into develop 2017-05-05 14:23:02 +01:00
paboyle
c1cb60a0b3 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-05-05 14:22:37 +01:00
Guido Cossu
5ed5b4bfbf Merge branch 'develop' into feature/hmc_generalise 2017-05-05 14:22:33 +01:00
Guido Cossu
de84aacdfd Fixing a configure error for the smearing tests 2017-05-05 13:59:10 +01:00
paboyle
2888003765 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-05-05 13:02:24 +01:00
paboyle
da06bf5b95 Zmobius force test added 2017-05-05 12:52:45 +01:00
Guido Cossu
20999c1370 Merge branch 'develop' into feature/hmc_generalise 2017-05-05 12:47:17 +01:00
Lanny91
77e0af9c2e Compilation fix after merge - conserved current code not yet operational for vectorised 5D or Gparity Impl. 2017-05-05 12:27:50 +01:00
paboyle
33f0ed1a33 No compile fix 2017-05-05 11:04:30 +01:00
paboyle
50be56433b Delete old and defunct tests 2017-05-04 23:41:16 +01:00
paboyle
43924007db Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-05-04 19:53:41 +01:00
paboyle
78ef10e60f Mobius force improvement 2017-05-04 19:53:21 +01:00
Lanny91
ca1077c560 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/rare_kaon
# Conflicts:
#	lib/qcd/action/fermion/WilsonFermion5D.cc
#	tests/hadrons/Test_hadrons_rarekaon.cc
2017-05-04 16:22:33 +01:00
679ae98b14 Merge branch 'feature/better-external-library' into develop 2017-05-04 15:42:12 +01:00
paboyle
90f6bc16bb No compile clang fix 2017-05-04 12:15:06 +01:00
Peter Boyle
9b5b639546 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-05-03 20:51:40 -04:00
Peter Boyle
945767c6d8 More info 2017-05-03 20:26:35 -04:00
Peter Boyle
422cdf4979 Some checks 2017-05-03 18:37:38 -04:00
Peter Boyle
38db174f3b Print statement 2017-05-03 18:25:26 -04:00
Peter Boyle
92e364a35f Better reporting in benchmark for MPI3 2017-05-03 15:43:36 -04:00
James Harrison
db3837be22 QedFVol: Change “double” to “Real” in ScalarVP.cc 2017-05-03 13:26:49 +01:00
James Harrison
2f0dd83016 Calculate HVP using a single contraction of O(alpha) charged propagators. 2017-05-03 12:53:41 +01:00
58299b8ba2 Git info separated from version in git-config 2017-05-02 20:04:41 +01:00
124bf4d829 git ref in config summary 2017-05-02 19:41:01 +01:00
e8e56b3414 Config summary saved in git-config 2017-05-02 19:40:47 +01:00
89c430136d grid-config program 2017-05-02 19:13:13 +01:00
ea9aef7baa New header for standard headers (was an issue with Remez.h and external compilation) 2017-05-02 18:26:11 +01:00
c9e9e8061d Merge branch 'feature/hadrons' into develop 2017-05-02 18:23:47 +01:00
Guido Cossu
453cf2a1c6 Moving the topological charge outside the HMC related routines 2017-05-02 14:40:12 +01:00
Guido Cossu
de7bbfa5f9 Adding ParameterFile option for the HMC 2017-05-02 12:16:16 +01:00
dda8d77c87 Merge branch 'feature/hadrons' into feature/rare_kaon 2017-05-01 17:50:57 +01:00
aa29f4346a Hadrons: weird bus error with recent macOS clang 2017-05-01 17:49:08 +01:00
Guido Cossu
86116dbed6 Adding boundary condition switch (compile time) for the Mobius HMC example 2017-05-01 16:33:11 +01:00
Guido Cossu
7bd31e3f7c Adding external file support in the Mobius example (JSON) 2017-05-01 16:30:24 +01:00
Guido Cossu
74f451715f Fix for Mac compilation on the size_t uint64_t types 2017-05-01 15:12:07 +01:00
Guido Cossu
655be8ed76 Adding tests for the mobius operator 2017-05-01 14:42:16 +01:00
Guido Cossu
4063238943 Adding HMC test file example for Mobius + smearing 2017-05-01 13:44:00 +01:00
Guido Cossu
3344788fa1 Merge branch 'develop' into feature/hmc_generalise 2017-05-01 12:13:56 +01:00
Guido Cossu
62a64d9108 EO support, wip 2017-05-01 11:06:21 +01:00
Lanny91
49331a3e72 Minor improvements to Ward Identity checks 2017-04-28 16:50:17 +01:00
Lanny91
51d84ec057 Bugfixes in Wilson 5D sequential conserved current insertion 2017-04-28 16:49:14 +01:00
Lanny91
db14fb30df Hadrons: overhaul of conserved current test 2017-04-28 16:48:00 +01:00
Lanny91
b9356d3866 Added more complete test of sequential insertion of conserved current. 2017-04-28 16:46:40 +01:00
Guido Cossu
99a73f4287 Correcting the M and Mdag in the clover term 2017-04-28 15:51:05 +01:00
Lanny91
f302eea91e SitePropagator redefined to be a scalar object in TYPE_ALIASES. 2017-04-28 15:27:49 +01:00
Guido Cossu
5553b8d2b8 Clover term compiles, not tested 2017-04-28 15:23:34 +01:00
Lanny91
a6ccbbe108 Conserved current sequential source now registered properly and fixed module inputs. 2017-04-28 10:43:47 +01:00
James Harrison
3ac27e5596 QedFVol: remove unnecessary copies of free propagator from shifted sources in ScalarVP 2017-04-27 14:17:50 +01:00
Peter Boyle
99220f6531 Fixes and better timing 2017-04-26 17:24:11 -04:00
Lanny91
d2003f24f4 Corrected incorrect usage of ExtractSlice for conserved current code. 2017-04-26 17:25:28 +01:00
Lanny91
6299dd35f5 Hadrons: Added test of conserved current code. Tests Ward identities for conserved vector and partially conserved axial currents. 2017-04-26 12:41:39 +01:00
Lanny91
a39daecb62 Removed make_5D const declaration to avoid compilation error 2017-04-26 12:39:07 +01:00
Lanny91
159770e21b Legal Banners added 2017-04-26 09:32:57 +01:00
paboyle
2a6d093749 move the sudo: required to match locatoin on Guido's branch 2017-04-26 09:15:34 +01:00
paboyle
c947947fad sudo required suggested by guido 2017-04-26 08:45:36 +01:00
paboyle
f555b50547 Merge branch 'feature/half-prec-comms' into develop 2017-04-26 08:43:40 +01:00
paboyle
738c1a11c2 longer nloop 2017-04-26 08:43:20 +01:00
Peter Boyle
f8797e1e3e bug fix. works now and great face performance 2017-04-26 03:14:02 -04:00
Peter Boyle
fd1eb7de13 Clean implementation of the exterior faces listing only those points on the boudary 2017-04-26 02:34:52 -04:00
Peter Boyle
2ce898efa3 Pretty code 2017-04-26 02:34:25 -04:00
Lanny91
dc5a6404ea Hadrons: modules for testing conserved current contractions and sequential insertion. 2017-04-25 22:08:33 +01:00
Lanny91
44260643f6 First conserved current implementation for Wilson fermions only. Not implemented for Gparity or 5D-vectorised Wilson fermions. 2017-04-25 18:00:24 +01:00
Lanny91
1425afc72f Rare Kaon test fix 2017-04-25 17:26:56 +01:00
James Harrison
bd466a55a8 QedFVol: remove charge dependence in chargedProp function of ScalarVP 2017-04-25 10:04:03 +01:00
paboyle
ab66bac4e6 Think I'm getting on top of the reduced cost exterior precomputed list of links 2017-04-25 08:50:26 +01:00
paboyle
56277a11c8 Build a list of whats on the surface 2017-04-24 17:06:15 +01:00
Guido Cossu
752048f410 Merge branch 'develop' into feature/clover 2017-04-24 14:41:20 +01:00
paboyle
916e9e1d3e Merge branch 'feature/half-prec-comms' of https://github.com/paboyle/Grid into feature/half-prec-comms 2017-04-24 10:39:19 +01:00
Peter Boyle
5b55867a7a Slightly cheaper Ext assembly 2017-04-24 05:36:11 -04:00
Peter Boyle
3accb1ef89 Debugged assemply split phase with interior suppression 2017-04-23 19:30:19 -04:00
Peter Boyle
e3d0e31525 Debugged assemply split phase with interior suppression 2017-04-23 19:29:27 -04:00
Peter Boyle
5812eb8a8c Partially fixed. But the comms-overlap does not work yet. 2017-04-22 18:50:25 -04:00
paboyle
4dd3763294 Use OMP as much as possible 2017-04-22 20:35:20 +01:00
paboyle
c429ace748 Cleaner OpenMP use 2017-04-22 20:28:42 +01:00
paboyle
ac58565d0a Dangerous rewrite of the assembly. If I make a mistake the debug will be painful. 2017-04-22 19:31:04 +01:00
paboyle
3703b718aa Mark up a table if a given site only receives from itself; including MPI3 splitting info. 2017-04-22 19:28:37 +01:00
paboyle
b722889234 Try a better load balancing loop 2017-04-22 19:27:41 +01:00
paboyle
abba44a837 Hand unrolled for overlapped comms 2017-04-22 17:45:17 +01:00
paboyle
f301be94ce Fixed 2017-04-22 17:42:31 +01:00
Peter Boyle
1d1b225497 Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node). 2017-04-22 09:05:28 -04:00
Peter Boyle
53a785a3dd Fixing the KNL compile 2017-04-22 08:11:51 -04:00
paboyle
736bf3c866 Major rework of stencil. Half precision and MPI3 now working. 2017-04-22 11:33:50 +01:00
paboyle
b9bbe5d188 L1p config bg/q 2017-04-22 11:33:09 +01:00
paboyle
3844bcf800 If no f16c instructions supported must use software half precision conversion.
This will also become useful on BG/Q, so will move out from SSE4 into a general area.
Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed
against the intrinsics implementation yet.
2017-04-20 15:30:52 +01:00
paboyle
e1a2319d01 Simple compressor moved out of cshift into stencil 2017-04-20 13:18:15 +01:00
paboyle
180c732b4c Move compressors out of Cshift.
Slice iterators would help
2017-04-20 13:17:55 +01:00
paboyle
957a706d0b Useful script 2017-04-20 13:17:44 +01:00
paboyle
d2312e9874 Drop compressor entirely from Cshift to only Stencil. 2017-04-20 13:16:55 +01:00
paboyle
fc4ab9ccd5 Working half precision comms 2017-04-20 11:20:26 +01:00
paboyle
4a340aa5ca Massive compressor rework to support reduced precision comms 2017-04-20 09:28:27 +01:00
paboyle
3b7de792d5 Type comparison in the traits work 2017-04-18 13:28:04 +01:00
paboyle
557c3fa109 Pretty change 2017-04-18 13:27:38 +01:00
paboyle
ec18e9f7f6 Merge branch 'develop' into feature/half-prec-comms 2017-04-18 11:39:39 +01:00
paboyle
a839d5bc55 Updated todo list 2017-04-18 11:22:17 +01:00
paboyle
de41b84c5c Merge branch 'feature/normHP' into develop 2017-04-18 10:57:21 +01:00
paboyle
8e161152e4 MultiRHS solver improvements with slice operations moved into lattice and sped up.
Block solver requires a lot of performance work.
2017-04-18 10:51:55 +01:00
paboyle
3141ebac10 MultiRHS working, starting to optimise. Block doesn't and I thought it already was; puzzled. 2017-04-17 10:50:19 +01:00
paboyle
7ede696126 Non compile of tests fixed 2017-04-16 23:40:00 +01:00
paboyle
bf516c3b81 higher precision reduction variables in norm and inner product 2017-04-15 12:27:28 +01:00
paboyle
441a52ee5d First cut at higher precision reduction 2017-04-15 10:57:21 +01:00
paboyle
a8db024c92 Cleaning up the dense matrix and lanczos sector 2017-04-15 08:54:11 +01:00
paboyle
a9c22d5f43 Verbose removal 2017-04-14 14:38:49 +01:00
paboyle
3ca41458a3 Fix to no USE_FP16 case 2017-04-14 14:20:54 +01:00
paboyle
9e2d29c644 USE_FP16 macro 2017-04-14 14:17:14 +01:00
Guido Cossu
b694996302 adding comments 2017-04-14 13:30:14 +01:00
Peter Boyle
951be75292 Half precision conversion working on AVX512 now too 2017-04-13 17:35:11 +01:00
James Harrison
c8e6f58e24 Fix typos in ScalarVP 2017-04-13 17:04:37 +01:00
Peter Boyle
b9113ed310 Patches for knl 2017-04-13 12:02:12 -04:00
James Harrison
888988ad37 Merge branch 'feature/qed-fvol' of https://github.com/paboyle/Grid into feature/qed-fvol
# Conflicts:
#	lib/qcd/action/fermion/Fermion.h
2017-04-13 15:54:40 +01:00
1407418755 Old qed-fvol program build disabled 2017-04-13 15:32:30 +01:00
a6a0da873f Merge branch 'feature/hadrons' into feature/qed-fvol 2017-04-13 15:31:06 +01:00
paboyle
42fb49d3fd Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-04-13 14:12:47 +01:00
paboyle
2a54c9aaab Merge branch 'feature/block-cg' into develop 2017-04-13 14:12:24 +01:00
paboyle
0957378679 Fixing conditional ugly way 2017-04-13 13:47:56 +01:00
paboyle
2ed6c76fc5 Getting multiline if then fi working 2017-04-13 13:43:13 +01:00
paboyle
d3b9a7fa14 F16c apparently requires AVX, even if the 128 bit are used.
Seems odd.
2017-04-13 13:19:11 +01:00
paboyle
75ea306ce9 Another try at travis 2017-04-13 13:05:32 +01:00
paboyle
4226c633c4 Default to FP16 off again 2017-04-13 12:51:39 +01:00
paboyle
5a4eafbf7e .travis 2017-04-13 12:50:43 +01:00
paboyle
eb8e26018b Travis update for macos 2017-04-13 12:35:11 +01:00
paboyle
db5ea001a3 Update to use Xcode 8.3 since -mfp16 causes SIGILL 2017-04-13 12:22:40 +01:00
paboyle
2846f079e5 Predicate tests on fp16 being enabled 2017-04-13 12:08:05 +01:00
paboyle
1d502e4ed6 FP16 optional compile time 2017-04-13 11:55:24 +01:00
paboyle
73cdf0fffe Drop f16c from SSE because of a macos compile error on travis 2017-04-13 11:23:41 +01:00
paboyle
1c25773319 Trap illegal instructions 2017-04-13 10:51:40 +01:00
paboyle
c38400b26f Trap signals 2017-04-13 10:35:20 +01:00
paboyle
9c3065b860 Debug flags off again 2017-04-13 10:01:32 +01:00
paboyle
94eb829d08 Align cast fixed for __mm128i gcc complained 2017-04-13 08:40:44 +01:00
paboyle
68392ddb5b Exchange in generic
Precision change in AVX, SSE, AVX512, Generic. QPX still to do.
2017-04-13 08:38:12 +01:00
paboyle
cb6b81ae82 Half precision conversion 2017-04-12 19:32:37 +01:00
Lanny91
c382c351a5 Quark test output correction. 2017-04-12 14:36:15 +01:00
Lanny91
af2d6ce2e0 Encapsulated 4D->5D and 5D->4D conversions in separate functions & added corresponding tests. 2017-04-12 14:36:02 +01:00
90ec6eda0c Rare K test solver name fix 2017-04-10 17:48:58 +01:00
Lanny91
ac1253bb76 Corrected solver in rare kaon test 2017-04-10 17:42:55 +01:00
fe8d625694 Merge commit '5e477ec553aa48d7d19b5a7c45d41acbb3392bcb' into feature/rare_kaon 2017-04-10 17:23:37 +01:00
53e76b41d2 Merge branch 'develop' into feature/hadrons 2017-04-10 17:00:53 +01:00
8ef4300412 spurious .dirstamp files removed 2017-04-10 17:00:22 +01:00
98a24ebf31 The macro “magics” is very intensive for the preprocessor in the measurement code which has numerous serialisable classes. Reducing the number of serialisable fields to 64 (instead of 1024) helps a lot, this is enough for now and can be extended trivially if needed in the future. 2017-04-10 16:58:54 +01:00
James Harrison
e4a105a30b Merge branch 'feature/qed-fvol' of https://github.com/paboyle/Grid into feature/qed-fvol 2017-04-10 16:35:01 +01:00
James Harrison
26ebe41fef QedFVol: Implement charged propagator calculation within ScalarVP module 2017-04-10 16:33:54 +01:00
paboyle
b12dc89d26 Commenting and clean up 2017-04-10 20:38:20 +09:00
paboyle
d80d802f9d MultiRHS solver test 2017-04-10 00:12:12 +09:00
paboyle
3d99b09dba Start of blockCG 2017-04-09 23:42:10 +09:00
paboyle
db5f6d3ae3 Verbose fix 2017-04-09 23:41:30 +09:00
paboyle
683550f116 Const args improvement 2017-04-09 23:41:04 +09:00
Lanny91
5e477ec553 Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/rare_kaon 2017-04-07 11:51:09 +01:00
paboyle
55d0329624 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-04-07 11:08:14 +09:00
paboyle
86aaa35294 Christoph needs SchurDiagTwoKappa which is mobius specific. 2017-04-07 11:07:40 +09:00
Guido Cossu
363611ae21 Merge branch 'develop' into feature/clover 2017-04-05 16:26:04 +01:00
Guido Cossu
172d3dc93a Correcting names in tests 2017-04-05 16:24:04 +01:00
Guido Cossu
3b8a791e28 Merge branch 'develop' into feature/clover 2017-04-05 16:20:28 +01:00
Guido Cossu
7b03d8d087 Fixing the remaining merge conflicts 2017-04-05 16:17:46 +01:00
Guido Cossu
4b759b8f2a Merge branch 'feature/hmc_generalise' into feature/scalar_adjointFT 2017-04-05 14:50:28 +01:00
Guido Cossu
8c540333d5 Merge branch 'develop' into feature/hmc_generalise 2017-04-05 14:41:04 +01:00
Guido Cossu
6fd82228bf Working on the derivative 2017-04-05 10:51:44 +01:00
paboyle
5592f7b8c1 Creation mode better implementation 2017-04-05 02:35:34 +09:00
paboyle
35da4ece0b UID fix 2017-04-05 02:18:15 +09:00
paboyle
061b15b9e9 Merge branch 'feature/sitmo-skipahead' into develop 2017-04-05 01:24:49 +09:00
Guido Cossu
ca6efc685e Merge branch 'develop' into feature/clover 2017-04-04 10:19:02 +01:00
1e496fee74 Merge branch 'develop' into feature/qed-fvol
# Conflicts:
#	lib/qcd/action/fermion/Fermion.h
2017-04-03 19:02:57 +01:00
ff4e54ef80 Merge branch 'develop' into feature/hadrons 2017-04-03 18:56:21 +01:00
paboyle
561426f6eb Clean up 2017-04-02 23:13:48 +09:00
paboyle
83f6fab8fa Big/Small crush test, and fast SITMO rng init, faster but not ideal
MT and Ranlux init.
2017-04-02 12:10:51 +09:00
paboyle
0fade84ab2 No random device 2017-04-02 00:29:40 +09:00
paboyle
9dc7ca4c3b Sitmo fast init 2017-04-02 00:28:22 +09:00
paboyle
935d82f5b1 sanity checks 2017-04-02 00:27:28 +09:00
paboyle
9cbcdd65d7 No random device seed 2017-04-02 00:26:57 +09:00
paboyle
f18f5ed926 Drop random device 2017-04-02 00:26:26 +09:00
paboyle
d1d63a4f2d sitmo default 2017-04-02 00:26:05 +09:00
paboyle
7e5faa0f34 Multiple RNGs 2017-04-02 00:25:44 +09:00
paboyle
6af459cae4 Christoph's coefficients. 2017-03-31 17:07:43 +09:00
paboyle
1c4bc7ed38 Debugged staggered conventions 2017-03-31 14:41:48 +09:00
Lanny91
cd1bd921bd Reduced code duplication for Weak Hamiltonian contraction modules 2017-03-30 18:02:14 +01:00
Guido Cossu
b8ae787b5e Correcting a simple typo 2017-03-30 11:33:15 +01:00
Guido Cossu
fbe2c3b5f9 ]Merge branch 'develop' into feature/clover 2017-03-30 11:18:31 +01:00
Guido Cossu
1ed69816b9 First steps for the force term 2017-03-30 11:14:27 +01:00
Lanny91
fff5751b1a HADRONS: Updated rare kaon test program, including all contractions. Sink smearing still to be implemented. 2017-03-30 10:57:01 +01:00
Lanny91
2c81696fdd HADRONS: 4pt Weak + current disconnected topology (e.g. for rare neutral kaon decays) 2017-03-30 10:37:17 +01:00
Lanny91
c9dc22efa1 HADRONS: Standalone disconnected loop contraction. 2017-03-30 10:33:18 +01:00
Lanny91
0ab04a000f HADRONS: 3pt contraction with gamma insertion between two propagators. 2017-03-30 10:30:58 +01:00
paboyle
93ea5d9468 Pretty code 2017-03-30 15:00:03 +09:00
paboyle
1ec5d32369 Chulwoo's test to zmobius helped me shake out 2017-03-30 13:45:13 +09:00
paboyle
9fd23faadf Pretty layout 2017-03-30 13:44:45 +09:00
paboyle
10e4fa0dc8 Template instantiation improvements 2017-03-30 13:44:25 +09:00
paboyle
c4aca1dde4 Conjugate coefficients on adjoint 2017-03-30 13:44:05 +09:00
paboyle
b9e8ea3aaa conjugate coefficient on the dagger 2017-03-30 13:43:13 +09:00
paboyle
077aa728b9 Fix the ZMobius (I think) 2017-03-30 13:42:09 +09:00
paboyle
a8d83d886e Macro controls 2017-03-30 13:31:34 +09:00
paboyle
7fd46eeec4 Trailing whitespace removal 2017-03-30 13:31:10 +09:00
paboyle
e0c4eeb3ec Compiles again 2017-03-30 13:30:45 +09:00
paboyle
cb9a297a0a Chulwoo's Zmobius test 2017-03-30 13:30:25 +09:00
paboyle
2b115929dc Small AVX512 asm ifdef patch 2017-03-29 18:51:23 +09:00
paboyle
5c6571dab1 Merge branch 'feature/bgq-asm' into develop 2017-03-29 18:48:55 +09:00
paboyle
417ec56cca Release candidate 2017-03-29 05:45:33 -04:00
paboyle
756bc25008 Verbose header print by default 2017-03-29 04:44:17 -04:00
paboyle
35695ba57a Bug fix in MPI3 2017-03-29 04:43:55 -04:00
paboyle
81ead48850 Log any errors to a file 2017-03-29 04:39:52 -04:00
paboyle
d805867e02 Better init 2017-03-28 13:25:05 -04:00
paboyle
e55a751e23 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2017-03-28 12:20:12 -04:00
paboyle
358eb75995 Shorten loop 2017-03-28 12:20:02 -04:00
paboyle
98f9318279 Build on AVX2 and MPI passing with clang++ 2017-03-28 23:16:04 +09:00
paboyle
4b17e8eba8 Merge branch 'develop' into feature/bgq-asm
Conflicts:
	lib/qcd/action/fermion/Fermion.h
	lib/qcd/action/fermion/WilsonFermion.cc
	lib/util/Init.cc
	tests/Test_cayley_even_odd_vec.cc
2017-03-28 04:49:30 -04:00
paboyle
75112a632a IO improvements to fail on IO error 2017-03-28 02:28:04 -04:00
paboyle
18bde08d1b Merge branch 'feature/staggering' into develop 2017-03-28 15:25:55 +09:00
James Harrison
9f755e0379 Add functions momD1 and momD2 to ScalarVP 2017-03-27 16:49:18 +01:00
James Harrison
4512dbdf58 Rename module ScalarFV to ScalarVP 2017-03-27 15:02:16 +01:00
James Harrison
483fd3cfa1 Add propagator expansion terms as inputs to ScalarFV 2017-03-27 13:24:51 +01:00
Guido Cossu
3750b9ffee Deleting MPI test for OSX in Travis 2017-03-27 16:53:32 +09:00
Guido Cossu
5e549ebd8b Adding force terms 2017-03-27 16:43:15 +09:00
Guido Cossu
fff484eca5 Populating Clover fermions methods 2017-03-27 15:12:57 +09:00
Guido Cossu
5fdc05782b More in the clover fermion class 2017-03-27 10:54:16 +09:00
paboyle
d45cd7e677 Adding a simple read of NERSC test 2017-03-26 09:24:26 -04:00
paboyle
4e96679797 Added a bnl log 2017-03-25 09:25:46 -04:00
James Harrison
85516e9c7c Output all terms of scalar propagator separately 2017-03-24 17:13:55 +00:00
James Harrison
0c006fbfaa Add ScalarFV inputs to ScalarFV.hpp 2017-03-24 11:59:09 +00:00
James Harrison
54c10a42cc Add source and emField inputs to ScalarFV module 2017-03-24 11:42:32 +00:00
Guido Cossu
a04eb7df5d Starting Clover term 2017-03-24 12:43:28 +09:00
Guido Cossu
4c1ea8677e Small cosmetic changes and vscode gitignore 2017-03-23 14:09:35 +09:00
paboyle
fc93f0b2ec Save some code for static huge tlb's. It is ifdef'ed out but an interesting root only experiment.
No gain from it.
2017-03-21 22:30:29 -04:00
paboyle
8c8473998d Average over whole cluster the comm time. 2017-03-21 22:29:51 -04:00
James Harrison
ef0fe2bcc1 Added empty ScalarFV module 2017-03-21 11:39:46 +00:00
Guido Cossu
120fb59978 Adding tests for WilsonFlow classes 2017-03-21 16:11:35 +09:00
Guido Cossu
fd56b3ff38 Merge branch 'develop' into feature/hmc_generalise 2017-03-21 13:33:41 +09:00
Guido Cossu
0ec6829edc Fixing compilation errors for the WilsonFlow 2017-03-21 13:06:32 +09:00
Guido Cossu
18b7845b7b Adding WilsonFlow smearing 2017-03-21 11:52:05 +09:00
Guido Cossu
3d0fe15374 Added topological charge measurement 2017-03-17 16:14:57 +09:00
Guido Cossu
91886068fe Fixed seg fault for observable modules 2017-03-17 13:59:31 +09:00
Guido Cossu
6d1e9e5f92 Small cleanup of the observables 2017-03-17 11:42:55 +09:00
Guido Cossu
b640230b1e Moving hmc observables in a different directory 2017-03-17 11:40:17 +09:00
paboyle
e7c36771ed ZMobius prep for asm 2017-03-15 14:23:33 -04:00
Guido Cossu
038b6ee9cd Fixing JSON compilation error 2017-03-16 01:09:24 +09:00
Guido Cossu
38806343a8 Improving efficiency of the force term 2017-03-15 15:16:16 +09:00
Guido Cossu
831ca4e3bf Added Scalar action for fields in the adjoint representation 2017-03-14 14:55:18 +09:00
paboyle
8dc57a1e25 Layout change 2017-03-13 11:11:46 +00:00
paboyle
f57bd770b0 Merge branch 'bugfix/dminus' into feature/bgq-asm 2017-03-13 11:11:03 +00:00
paboyle
4ed10a3d06 Merge branch 'develop' into feature/bgq-asm 2017-03-13 11:10:10 +00:00
Peter Boyle
dfefc70b57 Merge pull request #93 from Lanny91/hotfix/qpx
Some fixes for QPX and generic SIMD types.
2017-03-13 09:31:26 +00:00
Chulwoo Jung
0b61f75c9e Adding ZMobius CG test 2017-03-13 00:12:43 -04:00
Chulwoo Jung
33edde245d Changing Dminus(Dag) to use full vectors to work correctly 2017-03-12 23:02:42 -04:00
paboyle
b64e004555 MPI run fail on macos 2017-03-13 01:59:01 +00:00
paboyle
447c5e6cd7 Z mobius hermiticity correction 2017-03-13 01:30:43 +00:00
paboyle
8b99d80d8c Merge branch 'bgq-asm-shmemfixes' into feature/bgq-asm 2017-03-12 23:30:09 +00:00
Guido Cossu
b3dede4dd3 Merge branch 'develop' into feature/hmc_generalise 2017-03-10 23:57:37 +09:00
Guido Cossu
4e34132f4d Correcting modules use in test files 2017-03-10 23:54:53 +09:00
Guido Cossu
c07cb10247 Merge branch 'feature/hmc_generalise' of https://github.com/paboyle/Grid into feature/hmc_generalise 2017-03-10 22:37:25 +09:00
Guido Cossu
d7767a2a62 Few more tests 2017-03-10 22:33:48 +09:00
Guido Cossu
ec035983fd Fixing the implicit integration 2017-03-01 11:56:35 +00:00
paboyle
3901b17ade timeings from BNL 2017-02-28 17:06:45 -05:00
paboyle
af230a1fb8 Average the time across the whole machine for outliers 2017-02-28 17:05:22 -05:00
Christopher Kelly
06a132e3f9 Fixes to SHMEM comms 2017-02-28 13:31:54 -08:00
Guido Cossu
596dcd85b2 Auxiliary fields 2017-02-27 13:16:38 +00:00
paboyle
96d44d5c55 Header fix 2017-02-24 19:12:11 -05:00
Guido Cossu
7270c6a150 Integrator works now 2017-02-24 17:03:42 +00:00
Lanny91
7fe797daf8 SIMD vector length sanity checks 2017-02-23 16:49:44 +00:00
Lanny91
486a01294a Corrected QPX SIMD width 2017-02-23 16:47:56 +00:00
paboyle
586a7c90b7 Merge branch 'develop' into feature/bgq-asm 2017-02-23 00:26:59 +00:00
paboyle
e099dcdae7 Merge branch 'develop' into feature/bgq-asm 2017-02-23 00:25:29 +00:00
paboyle
4e7ab3166f Refactoring header layout 2017-02-22 18:09:33 +00:00
paboyle
aac80cbb44 Bug fix from Chris K 2017-02-22 12:19:09 -05:00
Lanny91
c80948411b Added tRotate function and MaddRealPart struct for generic SIMD, bugfix in MultRealPart and minor cosmetic changes. 2017-02-22 14:57:10 +00:00
Lanny91
95625a7bd1 Use Grid Integer type 2017-02-22 13:09:32 +00:00
Lanny91
0796696733 Emulated integer vector type for QPX and generic SIMD instruction sets. 2017-02-22 12:01:36 +00:00
Peter Boyle
f8b9ad7d50 Merge pull request #91 from sunpho84/public_modules_memebers
making public same serializable parameters in HMC Module
2017-02-22 00:53:20 +00:00
Peter Boyle
04a1959895 Merge pull request #90 from sunpho84/liming
adding --with switch to pass lime path
2017-02-22 00:52:53 +00:00
Peter Boyle
cc773ae70c Merge pull request #89 from sunpho84/prepend_package_with_grid
Prepending PACKAGE_ with GRID_ in Config.h
2017-02-22 00:52:10 +00:00
Peter Boyle
d21c51b9be Merge pull request #88 from sunpho84/pickpoketting
now it is possible to pass {coords list} to a peek or poke
2017-02-22 00:51:33 +00:00
Peter Boyle
597a7b4b3a Merge pull request #81 from edbennett/develop
Fix misleading message: "doxygen-pdf requires doxygen-pdf"
2017-02-22 00:50:59 +00:00
azusayamaguchi
1c30e9a961 Verified 2017-02-21 23:01:25 +00:00
Francesco Sanfilippo
93cc270016 making public same serializable parameters in HMC Module
RNGModuleParameters
GridModuleParameters
2017-02-21 23:11:56 +01:00
Francesco Sanfilippo
29b60f7e1a adding --with switch to pass lime path 2017-02-21 23:09:39 +01:00
Francesco Sanfilippo
041884acf0 Prepending PACKAGE_ with GRID_ in Config.h
Avoid polluting linking progr
2017-02-21 22:51:36 +01:00
Francesco Sanfilippo
15e668eef1 now it is possible to pass {coords list} to a peek or poke 2017-02-21 22:48:38 +01:00
azusayamaguchi
bf7e3f20d4 Staggaered fermion optimised version 2017-02-21 14:35:42 +00:00
Guido Cossu
902afcfbaf Adding metric and the implicit steps 2017-02-21 11:30:57 +00:00
paboyle
3ae92fa2e6 Global changes to parallel_for structure.
Move the comms flags to more sensible names
2017-02-21 05:24:27 -05:00
paboyle
3906cd2149 Stencil fix on BNL KNL system 2017-02-20 17:51:31 -05:00
paboyle
5a1fb29db7 Useful debug code info to preserve 2017-02-20 17:49:23 -05:00
paboyle
661fc4d3d1 Debug AVX512 exchange code paths 2017-02-20 17:48:36 -05:00
paboyle
41009cc142 Move excange into the stencil only; keep Cshift fully general 2017-02-20 17:48:04 -05:00
paboyle
37720c4db7 Count bytes off node only 2017-02-20 17:47:40 -05:00
paboyle
1a30455a10 1000 iters on bmark for more accurate timing 2017-02-20 17:47:01 -05:00
Guido Cossu
97a6b61551 Covariant laplacian and implicit integration 2017-02-20 11:17:27 +00:00
paboyle
cd0da81196 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2017-02-16 18:52:30 -05:00
paboyle
f246fe3304 Improvements to avx for invertible to avoid latent bug 2017-02-16 23:52:44 +00:00
paboyle
8a29c16bde Faster gather exchange 2017-02-16 23:52:22 +00:00
paboyle
d68907fc3e Debug temp 2017-02-16 18:51:35 -05:00
paboyle
5c0adf7bf2 Make clang happy with parenthesis 2017-02-16 23:51:33 +00:00
paboyle
be3a8249c6 Faster gather 2017-02-16 23:51:15 +00:00
paboyle
bd600702cf Vectorise the XYZT face gathering better.
Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent
with efficiency.
2017-02-15 11:11:04 +00:00
Lanny91
f011bdb869 Fixed overwrite of pminus projection in construction of 4d propagator from 5d. 2017-02-14 14:07:17 +00:00
Guido Cossu
bafb101e4f Testing different versions of the Laplacian 2017-02-13 15:38:11 +00:00
Guido Cossu
08fdf05528 Added and tested the covariant laplacian + CG solver 2017-02-13 15:05:01 +00:00
paboyle
aca7a3ef0a Optimisation control improvements 2017-02-10 18:22:31 -05:00
Guido Cossu
9e72a6b22e Reverting to Xcode 7.3 2017-02-10 12:57:03 +00:00
Guido Cossu
1c12c5612c Xcode 8.2 for travis 2017-02-10 12:12:01 +00:00
Guido Cossu
a8193c4bcb Correcting travis compilation on gcc 2017-02-10 10:59:30 +00:00
Guido Cossu
c3d7ec65fa All tests compile. 2017-02-10 10:27:51 +00:00
Guido Cossu
8b6a6c8236 Resolving small merge conflict 2017-02-09 16:20:24 +00:00
Guido Cossu
e0571c872b Merge branch 'develop' into feature/hmc_generalise 2017-02-09 16:12:00 +00:00
Guido Cossu
c67f41887b Reverting parameters to original 2017-02-09 15:59:56 +00:00
Guido Cossu
84687ccf1f Handling an Intel compiler warning for Json class 2017-02-09 15:33:33 +00:00
Guido Cossu
3274561cf8 Cleanup 2017-02-09 15:18:38 +00:00
e08fbb3771 Merge pull request #84 from Lanny91/feature/rare_kaon
Rare Kaon decay contraction code
2017-02-08 08:23:34 -08:00
Lanny91
d7464aa0fe Switched from XmlWriter to CorrWriter in contraction code 2017-02-08 16:13:44 +00:00
Lanny91
00d29153f0 Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/rare_kaon 2017-02-08 16:11:15 +00:00
2ce989f220 Hadrons: default I/O to HDF5 2017-02-08 07:50:05 -08:00
Lanny91
d7a1dc85be Revert "Hadrons: test for rare kaon contraction code."
This reverts commit 1e257a1251.
2017-02-08 13:23:05 +00:00
Lanny91
fc19503673 Removed MSink namespace. 2017-02-08 13:17:39 +00:00
Lanny91
beba824136 Make use of GammaL class in Weak Hamiltonian contractions 2017-02-08 12:45:39 +00:00
Lanny91
6ebf8b12b6 Removed unnecessary repeat of write in Weak Hamiltonian contractions 2017-02-08 12:43:13 +00:00
Lanny91
e5a7ed4362 Moved write outside of loop, some physics corrections 2017-02-08 12:29:33 +00:00
Lanny91
b9f7ea47c3 Access hasModule function directly from Environment instance. 2017-02-08 10:10:06 +00:00
Lanny91
06f7ee202e Revert "Add function to say whether or not a module exists in application class"
This reverts commit 522f6bf91a.
2017-02-08 10:08:18 +00:00
Lanny91
2b2fc6453f Fixed single precision compatibility issues 2017-02-07 13:59:29 +00:00
Lanny91
bdd2765461 Added missing allocation of Weak Hamiltonian result vector 2017-02-07 13:06:42 +00:00
paboyle
2c246551d0 Overlap comms and compute options in wilson kernels 2017-02-07 01:37:10 -05:00
paboyle
71ac2e7940 Faster RNG init 2017-02-07 01:33:23 -05:00
paboyle
2bf4688e83 Running on BNL KNL 2017-02-07 01:32:10 -05:00
paboyle
a48ee6f0f2 Don't use MPI3_leader any more. No real gain and complex 2017-02-07 01:31:24 -05:00
paboyle
73547cca66 MPI3 working i think 2017-02-07 01:30:02 -05:00
paboyle
123c673db7 Policy to control async or sync SendRecv 2017-02-07 01:24:54 -05:00
paboyle
61f82216e2 Communicator Policy, NodeCount distinct from Rank count 2017-02-07 01:22:53 -05:00
paboyle
8e7ca92278 Debugged cshift case 2017-02-07 01:21:32 -05:00
paboyle
485ad6fde0 Stencil working in SHM MPI3 2017-02-07 01:20:39 -05:00
paboyle
6ea2184e18 OMP define change 2017-02-07 01:17:16 -05:00
paboyle
fdc170b8a3 Parallel fors in lattice transfer 2017-02-07 01:16:39 -05:00
paboyle
060da786e9 Comms benchmark improvements 2017-02-07 01:07:39 -05:00
paboyle
85c7bc4321 Bug fixes for cases that physics code couldn't hit but latent
and discovered on KNL (long vector, y SIMD dir) and checker dir set to y.
Remove the assertions on these code paths now they are tested.
2017-02-07 01:01:15 -05:00
paboyle
0883d6a7ce Overlap comms compute support; make reg naming consistent with bgq aasm 2017-02-07 00:59:32 -05:00
paboyle
9ff97b4711 Improved stencil tests passing all on KNL multinode 2017-02-07 00:58:34 -05:00
paboyle
b5e9c900a4 Better printing and signal handling options 2017-02-07 00:57:55 -05:00
paboyle
4bbdfb434c Overlap comms compute modifications 2017-02-07 00:57:01 -05:00
Lanny91
4a45c06dd7 Code cleaning and addition of Weak Hamiltonian contraction log message 2017-02-06 20:12:30 +00:00
Lanny91
d6a7d7d1e0 Hadrons: added missing momentum parameter in rare kaon contraction test 2017-02-06 18:15:49 +00:00
Lanny91
1a122a0dd8 Hadrons: corrected gamma matrix inputs in rare kaon test 2017-02-06 17:35:41 +00:00
Lanny91
20e20733e8 Merge branch 'feature/hadrons' into feature/rare_kaon 2017-02-06 14:12:21 +00:00
Lanny91
b7cd1a19e3 Utilities for reading and writing "pair" objects. 2017-02-06 14:08:59 +00:00
Lanny91
f510002a62 Merge remote-tracking branch 'paboyle/feature/hadrons' into feature/hadrons 2017-02-03 14:37:34 +00:00
Christopher Kelly
c94133af49 Added iteration reporting to CG and mixed CG
Added ability to manually change the initial CG inner tolerance in mixed CG
Added .hpp files to filelist script
2017-02-02 17:04:42 -05:00
eedcaf6470 Merge branch 'feature/hadrons' into feature/qed-fvol 2017-02-01 15:53:10 -08:00
e7d8030a64 operator>> for serialisable enums 2017-02-01 15:51:08 -08:00
d775fbb2f9 Gammas: code cleaning and gamma_L implementation & test 2017-02-01 15:45:05 -08:00
863855f46f header fix 2017-02-01 11:59:44 -08:00
419af7610d New gamma matrices tidying: generated code is confined to Gamma.* for readability 2017-02-01 11:23:12 -08:00
Lanny91
1e257a1251 Hadrons: test for rare kaon contraction code. 2017-02-01 16:36:40 +00:00
Lanny91
522f6bf91a Add function to say whether or not a module exists in application class 2017-02-01 16:36:08 +00:00
Lanny91
d35d87d2c2 Weak Hamiltonian Eye-type contraction execution 2017-02-01 16:33:24 +00:00
Lanny91
74a5cda84b Removed unnecessary "3pt" labels 2017-02-01 15:03:49 +00:00
Lanny91
5be05d85b8 Fixed collision of Wall source and sink header ifdefs 2017-02-01 13:56:22 +00:00
Lanny91
35ac85aea8 Updated Weak Hamiltonian contractions to use zero-flop gamma matrices 2017-02-01 12:57:34 +00:00
Lanny91
fa237401ff Consistent variable name in macro 2017-02-01 12:56:55 +00:00
Lanny91
97053adcb5 Merge branch 'feature/hadrons' into feature/rare_kaon 2017-02-01 10:13:29 +00:00
Lanny91
f8fbe4d7a3 Merge remote-tracking branch 'paboyle/feature/hadrons' into feature/hadrons
# Conflicts:
#	extras/Hadrons/Modules/MContraction/Meson.hpp
#	tests/hadrons/Test_hadrons_meson_3pt.cc

Updated Meson.hpp to utilise zero-flop gamma matrices.
2017-02-01 09:27:00 +00:00
Lanny91
ef31c012bf Merge remote-tracking branch 'paboyle/develop' into feature/hadrons 2017-01-31 17:36:10 +00:00
7da7d263c4 typo 2017-01-30 10:53:13 -08:00
1140573027 Gamma adj fix: now in Grid namespace to avoid collisions 2017-01-30 10:53:04 -08:00
Lanny91
9e9f621d5d Hadrons: added Weak Hamiltonian module dependencies, some reformatting. 2017-01-30 17:54:21 +00:00
Lanny91
651e1a7cbc Hadrons: Momentum inserted as multiples of 2*pi/L 2017-01-30 17:14:33 +00:00
a0cfbb6e88 Merge branch 'feature/gammas' into feature/hadrons
# Conflicts:
#	.gitignore
#	lib/qcd/spin/Dirac.cc
#	scripts/filelist
2017-01-30 09:10:49 -08:00
Lanny91
c4d3672720 Hadrons: Momentum projection in meson module. 2017-01-30 17:09:04 +00:00
515a26b3c6 gammas: copyright update 2017-01-30 09:07:09 -08:00
Guido Cossu
16be6d378c Now action factory support different Fields (templated) 2017-01-30 14:22:41 +00:00
Guido Cossu
f05d0565aa Adding ScalarField theory 2017-01-30 10:59:28 +00:00
b39f0d1fb6 Hadrons: default I/O to HDF5 if possible, XML otherwise 2017-01-27 18:12:35 -08:00
9f1267dfe6 Merge branch 'feature/qed-fvol' of github.com:paboyle/Grid into feature/qed-fvol 2017-01-27 17:06:34 -08:00
2e90285232 Merge pull request #80 from jch1g10/feature/qed-fvol
ChargedProp: remove ScalarField fs
2017-01-27 17:06:13 -08:00
e254de982e Merge branch 'feature/qed-fvol' of github.com:paboyle/Grid into feature/qed-fvol 2017-01-27 17:02:35 -08:00
28d99b5297 Merge branch 'develop' into feature/qed-fvol 2017-01-27 16:59:53 -08:00
c946d3bf3f Merge branch 'develop' of github.com:edbennett/Grid into develop 2017-01-27 22:12:11 +00:00
1c68098780 fix misleading message: "doxygen-pdf requires doxygen-pdf" 2017-01-27 22:04:26 +00:00
Lanny91
9bf4108d1f Weak Hamiltonian contraction modules, for Eye and NonEye contraction topologies. Execution for NonEye type diagrams has been implemented, but not yet for Eye type. 2017-01-27 16:58:11 +00:00
Guido Cossu
899e685627 Merge branch 'feature/sitmo_rng' into develop 2017-01-27 14:15:56 +00:00
James Harrison
ee93f0218b ChargedProp: remove ScalarField fs 2017-01-27 12:22:48 +00:00
Guido Cossu
6929a84c70 Reformatting files 2017-01-27 11:54:44 +00:00
Guido Cossu
5c779a789b Moving registrations in an independent file 2017-01-27 11:23:51 +00:00
161ed102a5 Merge pull request #79 from jch1g10/feature/qed-fvol
Fixed bug in ChargedProp
2017-01-26 19:49:14 -08:00
3bf993d81a gitignore update 2017-01-26 17:00:59 -08:00
fad743fbb1 Build system sanity check: corrected several headers not in the <Grid/*> format 2017-01-26 17:00:41 -08:00
Guido Cossu
e863a948e3 Cleaning up files and directories 2017-01-26 15:24:49 +00:00
James Harrison
f65a585236 ChargedProp: Switch to HDF5 output 2017-01-26 15:02:30 +00:00
Lanny91
977f34dca6 Added missing typename 2017-01-26 13:18:33 +00:00
Lanny91
90ad956340 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/rare_kaon 2017-01-26 12:08:41 +00:00
Guido Cossu
7996f06335 Commented out registrations.
Move to an independent file that is linked only for the factory managed HMC
2017-01-25 18:27:45 +00:00
Guido Cossu
ef8d3831eb Temporary patch the threading error in InsertSlice and ExtractSlice
Find source and fix the error
2017-01-25 18:12:04 +00:00
Guido Cossu
70ed9fc40c Updating the engine to the last version 2017-01-25 18:10:41 +00:00
Guido Cossu
7b40a3e3e5 Reorganizing files 2017-01-25 18:09:46 +00:00
4d3787db65 Hadrons fixed for new gammas, Meson only does one contraction but this’ll change in the future 2017-01-25 09:59:00 -08:00
Guido Cossu
677757cfeb Added and tested SITMO PRNG 2017-01-25 12:47:22 +00:00
Guido Cossu
f7fbbaaca3 Compiles after merging 2017-01-25 12:11:58 +00:00
Guido Cossu
17629b8d9e Merge branch 'develop' into feature/hmc_generalise 2017-01-25 11:33:53 +00:00
Guido Cossu
0baa20d292 Againg fixing compilation on Travis, no LIME lib present 2017-01-25 11:18:44 +00:00
Guido Cossu
4571c918a4 Fixing compilation error when compiling without LIME 2017-01-25 11:14:43 +00:00
Guido Cossu
5251ea4d30 Adding more fermion action modules, generalised DWF 2017-01-25 11:10:44 +00:00
05cb6d318a gammas: adjoint implemented as a symbolic operation 2017-01-24 18:07:43 -08:00
0432e30256 Gamma right multiply code fix (now passes consistency check) 2017-01-24 17:36:23 -08:00
2c3ebc1e07 .gitignore update 2017-01-24 17:35:42 -08:00
068b28af2d Extensive gamma test program 2017-01-24 17:35:29 -08:00
f7db342f49 Serialisable enums can be converted to int 2017-01-24 17:33:26 -08:00
d65e81518f Merge branch 'feature/hadrons' into develop 2017-01-24 09:21:44 -08:00
Guido Cossu
7f456b4173 👷 Added all pseudofermion actions to the serialiser 2017-01-24 13:57:32 +00:00
a37e71f362 New automatic implementation of gamma matrices, Meson and SeqGamma are broken 2017-01-23 19:13:43 -08:00
James Harrison
ae99e99da2 Fixed bug in ChargedProp 2017-01-23 17:27:50 +00:00
Lanny91
c291ef77b5 Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/hadrons 2017-01-23 15:24:47 +00:00
Lanny91
7dd2764bb2 Wall sink smearing 2017-01-23 15:17:54 +00:00
Guido Cossu
244f8fb6dc Added JSON parser (without NextElement) 2017-01-23 14:57:38 +00:00
azusayamaguchi
05c1924819 Timing loop change 2017-01-23 10:43:45 +00:00
f3ca29af6c Merge branch 'feature/hadrons' into feature/qed-fvol 2017-01-21 13:41:05 -08:00
b7da264b0a Hadrons: Application is not storing the environment ref but calling getInstance() each time, solving a very nasty set fault on Linux/KNL 2017-01-21 13:40:23 -08:00
37988221a8 Merge branch 'feature/serialisation-hdf5' into feature/qed-fvol 2017-01-20 14:04:20 -08:00
74ac2aa676 Merge branch 'feature/serialisation-hdf5' into feature/hadrons 2017-01-20 14:03:51 -08:00
4c75095c61 HDF5: header fix 2017-01-20 12:14:01 -08:00
afa095d33d HDF5: better complex number support 2017-01-20 12:10:41 -08:00
6b5259cc10 HDF5 detects if a name is a dataset or not without using exception catching 2017-01-20 11:03:19 -08:00
Guido Cossu
27dfe816fa Added TwoFlavorsEO
Had to remove a conformability check in the Derivative of SchurDiff,
see the comments in the file
2017-01-20 16:59:31 +00:00
Lanny91
af29be2c90 Simplified operation of meson module. Result has been modified to output one contraction at a time for each pair of gamma insertions at source and sink. 2017-01-20 16:38:50 +00:00
Guido Cossu
f96fac0aee All functionalities ready.
Todo: add all the fermion action modules
2017-01-20 12:56:20 +00:00
7423a352c5 HDF5: typos 2017-01-19 18:33:04 -08:00
81e66d6631 HDF5: revert back to native types 2017-01-19 18:24:53 -08:00
ade1058e5f Hdf5Type does not need to be a pointer anymore 2017-01-19 18:23:55 -08:00
6eea9e4da7 HDF5 types static initialisation is mysteriously buggy on BG/Q, changing strategy 2017-01-19 18:02:53 -08:00
2c673666da Standardisation of HDF5 types 2017-01-19 17:19:12 -08:00
7a327a3f28 Merge branch 'develop' into feature/qed-fvol 2017-01-19 14:22:36 -08:00
Lanny91
07f2ebea1b Meson module now takes list of gamma matrices to insert at source and sink. 2017-01-19 22:18:42 +00:00
d6401e6d2c Merge branch 'feature/hadrons' into develop 2017-01-19 14:10:01 -08:00
24d3d31b01 Genetic scheduler: uses insert instead of emplace for better compiler compatibility 2017-01-19 14:08:22 -08:00
Guido Cossu
851f2ad8ef Adding fermions actions support in the factories 2017-01-19 10:00:02 +00:00
5405526424 Code typo 2017-01-18 22:42:19 -08:00
f3f0b6fef9 serious rewriting of Test_serialisation, now crashes if IO inconsistent 2017-01-18 17:41:05 -08:00
654e0b0fd0 Serialisable object are now comparable with == 2017-01-18 17:40:32 -08:00
4be08ebccc debug code cleaning 2017-01-18 17:39:59 -08:00
f599cb5b17 HDF5 serial IO implemented and tested 2017-01-18 16:50:21 -08:00
Guido Cossu
23e0561dd6 Added all required functionalities, time for cleaning
All actions to be added
2017-01-18 16:31:51 +00:00
a4a509497a Merge branch 'develop' of github.com:paboyle/Grid into develop 2017-01-17 16:22:22 -08:00
5803933aea First implementation of HDF5 serial IO writer, reader is still empty 2017-01-17 16:21:18 -08:00
Lanny91
8ae1a95ec6 Legal banners and module descriptions 2017-01-17 18:14:20 +00:00
Lanny91
82b7d4eaf0 Added noise loop dependencies 2017-01-17 15:58:32 +00:00
Lanny91
78774fbdc0 Construct loop propagator 2017-01-17 15:29:45 +00:00
Guido Cossu
924130833e Moved more parameters to serialization 2017-01-17 13:22:18 +00:00
Guido Cossu
7cf833dfe9 Fixed compilation error in tests hadrons (capital letter in dir name) 2017-01-17 11:00:54 +00:00
Guido Cossu
0157274762 HMC factories 2017-01-17 10:46:49 +00:00
Guido Cossu
87e8aad5a0 Added support for input file HMC modules (missing the actions yet) 2017-01-16 16:07:12 +00:00
Guido Cossu
c6f59c2933 Adding factories 2017-01-16 10:18:09 +00:00
91a3534054 Lattice slice utilities now thread safe 2017-01-16 06:32:25 +00:00
16a8e3d0d4 gitignore update for ST3 2017-01-16 06:32:05 +00:00
Lanny91
b7f90aa011 Added momentum choice for wall source 2017-01-13 15:54:19 +00:00
92f8950a56 Charged scalar prop: cleaning and output 2017-01-13 13:30:56 +00:00
65987a8a58 First implementation of the scalar QED propagator, runs but absolutely not checked 2017-01-12 20:44:23 +00:00
889d828bc2 Code cleaning 2017-01-12 18:17:44 +00:00
Lanny91
f22b79da8f Added missing type aliases 2017-01-12 12:52:12 +00:00
Lanny91
3855673ebf Added header for wall source 2017-01-12 11:42:37 +00:00
Lanny91
4db82da0db Wall sources 2017-01-12 11:41:10 +00:00
Lanny91
0cdc3d2fa5 Merge remote-tracking branch 'refs/remotes/paboyle/feature/hadrons' into feature/hadrons 2017-01-12 11:26:55 +00:00
ad98b6193d creating the necessary caches for the FFT EM scalar propagator 2017-01-11 18:40:43 +00:00
fc760016b3 More uniform cache name for scalar momentum propagators 2017-01-11 18:39:58 +00:00
2da86f7dae Merge branch 'feature/hadrons' into feature/qed-fvol 2017-01-11 18:38:05 +00:00
41df1db811 Hadrons: number of dimensions entirely determined by the initial grid 2017-01-11 18:37:49 +00:00
Guido Cossu
0dfda4bb90 Working on the RNGModule 2017-01-09 11:06:18 +00:00
Guido Cossu
1189ebc8b5 Cleaning up the checkpointers interface 2017-01-05 15:52:52 +00:00
97843e2b58 Hadrons: free scalar buffer fix and output 2017-01-05 14:58:55 +00:00
82b3f54697 scalar free propagator fix 2017-01-05 14:58:07 +00:00
Guido Cossu
1bb8578173 Added module for checkpointers 2017-01-05 13:09:32 +00:00
Peter Boyle
c3b6d573b9 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2016-12-30 22:42:17 +00:00
673994b281 Hadrons: modules for scalar propagators 2016-12-29 22:44:58 +01:00
bbc0eff078 Hadrons: scalar sources 2016-12-29 22:44:22 +01:00
4c60e31070 Hadrons: code cleaning 2016-12-29 22:44:08 +01:00
afbf7d4c37 QED Gimpl moved in Photon.h 2016-12-29 22:43:38 +01:00
8c3cc32364 Scalar action 2016-12-29 22:42:58 +01:00
Peter Boyle
1e179c903d Worried about integer; suspect where statements are broken 2016-12-27 17:46:38 +00:00
Peter Boyle
669cfca9b7 No inline 2016-12-27 17:45:40 +00:00
Peter Boyle
ff2f559a57 Remove inline on gather optimised path 2016-12-27 17:45:19 +00:00
Peter Boyle
03c81bd902 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2016-12-27 11:25:35 +00:00
Peter Boyle
a869addef1 Stats switch off 2016-12-27 11:25:22 +00:00
Peter Boyle
1caa3fbc2d LOCK UNLOCK only 2016-12-27 11:24:45 +00:00
Peter Boyle
3d21297bbb Call the fast path compressor for wilson kernels to avoid if else on projector 2016-12-27 11:23:13 +00:00
Peter Boyle
25efefc5b4 Back to original thread policy post test 2016-12-23 09:49:04 +00:00
Peter Boyle
eabf316ed9 BGQ performance ASM 2016-12-22 21:56:08 +00:00
Peter Boyle
04ae7929a3 BGQ or KNL assembler now 2016-12-22 17:53:22 +00:00
Peter Boyle
caba0d42a5 L1p controls 2016-12-22 17:52:55 +00:00
Peter Boyle
9ae81c06d2 L1p controls for BG/Q 2016-12-22 17:52:21 +00:00
Peter Boyle
0903c48caa Hot start SU3 2016-12-22 17:51:45 +00:00
Peter Boyle
7dc36628a1 QPX finishing 2016-12-22 17:50:48 +00:00
Peter Boyle
b8cdb3e90a Debug hack; raises from 62GF/s to 72 GF/s per node on BG/Q 2016-12-22 17:50:14 +00:00
Peter Boyle
5241245534 Default to static scheduling 2016-12-22 17:49:21 +00:00
Dr Peter Boyle
960316e207 type conversion in printf 2016-12-22 17:27:01 +00:00
Guido Cossu
5214846341 Adding a resource manager 2016-12-22 12:41:56 +00:00
4c3fd9fa3f stochastic QED field module in Hadrons 2016-12-22 00:29:41 +01:00
17b3a10d46 stochastic QED: function to cache 1/sqrt(khat^2) 2016-12-22 00:29:19 +01:00
149a46b92c Merge branch 'feature/hadrons' into feature/qed-fvol 2016-12-22 00:26:43 +01:00
3215ae6b7e Hadrons: genetic scheduler crashes in multi-thread with 1 module, multi-threading deactivated for now 2016-12-22 00:26:30 +01:00
7a85fddc7e Hadrons: modification of registration mechanism to allow for persistent caches 2016-12-22 00:25:36 +01:00
Guido Cossu
ce1a115e0b Removing redundant arguments for integrator functions, step 1 2016-12-20 17:51:30 +00:00
db9c28a773 qed-fvol: Photon parameter name fix 2016-12-20 12:41:39 +01:00
9ac3ac41df serialisable Photon parameters 2016-12-20 12:41:01 +01:00
2af9ab9034 old Makefile cleaning 2016-12-20 12:40:26 +01:00
6f1ea96293 Merge branch 'develop' into feature/qed-fvol 2016-12-20 12:33:02 +01:00
f8d11ff673 better serialisable enums (can be encapsulated into classes) 2016-12-20 12:31:49 +01:00
paboyle
3f2d53a994 BGQ assembler beginning 2016-12-20 10:21:26 +00:00
paboyle
8a337f3070 Move cayley into mainstream tests 2016-12-18 02:35:31 +00:00
paboyle
a59f5374d7 Evade warning 2016-12-18 02:23:55 +00:00
paboyle
4b220972ac Warning fix 2016-12-18 02:14:17 +00:00
paboyle
629f43e36c Return statement needed 2016-12-18 02:09:37 +00:00
paboyle
a3172b3455 Precision error 2016-12-18 02:07:45 +00:00
paboyle
3e6945cd65 Fixing AVX Z-mobius 2016-12-18 02:05:11 +00:00
paboyle
87be03006a AVX 512 code broke other compiles; fixing 2016-12-18 01:45:09 +00:00
paboyle
f17436fec2 Bad commit fixed 2016-12-18 01:27:34 +00:00
Peter Boyle
4d8b01b7ed Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2016-12-18 00:56:57 +00:00
Peter Boyle
fa6acccf55 Zmobius asm 2016-12-18 00:56:19 +00:00
Peter Boyle
55cb22ad67 Z mobius bmark 2016-12-18 00:55:37 +00:00
azusayamaguchi
df9108154d Debugged 2 versions of assembler; ls vectorised, xyzt vectorised 2016-12-17 23:47:51 +00:00
azusayamaguchi
b3e7f600da Partial implementation of 4d vectorisation assembler 2016-12-16 23:50:30 +00:00
azusayamaguchi
d4071daf2a Template specialise 2016-12-16 22:28:29 +00:00
azusayamaguchi
a2a6329094 AVX512 only for ASM compilation 2016-12-16 22:03:29 +00:00
azusayamaguchi
eabc577940 Assembler possibly working 2016-12-16 16:55:36 +00:00
2e3c5890b6 qed-fvol: build fix 2016-12-15 20:06:46 +00:00
bc6678732f Merge branch 'feature/hadrons' into feature/qed-fvol
# Conflicts:
#	Makefile.am
#	configure.ac
#	lib/qcd/action/gauge/Photon.h
2016-12-15 19:53:00 +00:00
b10ae00c8a Merge commit '6ad73145bc9754a5f26093eee5a34473ba0cff82' into feature/qed-fvol 2016-12-15 19:48:58 +00:00
67d72000e7 Hadrons: more legal banner fixes 2016-12-15 18:26:39 +00:00
80cef1c78f Hadrons: legal banner fix 2016-12-15 18:21:52 +00:00
91e98b1dd5 Merge branch 'feature/hadrons' into develop 2016-12-15 18:15:56 +00:00
b791c274b0 Revert "AVX: uninitialised variable fix"
This reverts commit c22c3db9ad.
2016-12-15 18:15:35 +00:00
596dd570c7 Linux linking fix 2016-12-15 12:26:53 +00:00
cad158e42f Hadrons: tests improvement 2016-12-14 19:41:51 +00:00
f63fac0c69 Hadrons: the XML runner can use a precomputed schedule 2016-12-14 19:41:30 +00:00
ab92de89ab Hadrons: utility to schedule a run 2016-12-14 19:41:04 +00:00
846272b037 Hadrons: option to save and load a schedule 2016-12-14 19:40:36 +00:00
f3e49e4b73 Hadrons: module templates update 2016-12-14 18:19:46 +00:00
decbb61ec1 Hadrons: XML driven program is again a binary installed with Grid 2016-12-14 18:19:24 +00:00
7e2482aad1 Hadrons: cpde cleaning 2016-12-14 18:04:21 +00:00
e1653a9f94 Hadrons: size fix in DWF module 2016-12-14 18:02:36 +00:00
ea40854e0b Hadrons: type names are demangled 2016-12-14 18:02:18 +00:00
34df71e755 Hadrons: function to save an application as an XML file 2016-12-14 18:01:56 +00:00
3af663e17b Hadrons: modules remember their factory registration name 2016-12-14 17:59:45 +00:00
Azusa Yamaguchi
0cd6b1858c Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-12-14 09:23:22 +00:00
Guido Cossu
0bd296dda4 Adding check of the Dag part in the benchmark 2016-12-14 03:15:09 +00:00
Guido Cossu
af0ccdd8e9 Moving output order 2016-12-14 02:02:42 +00:00
c22c3db9ad AVX: uninitialised variable fix 2016-12-13 19:05:58 +00:00
013e710c7d Hadrons: 3pt function test improvement 2016-12-13 19:04:43 +00:00
16693bd69d Hadrons: scheduler heuristic benchmark 2016-12-13 19:02:32 +00:00
de8f80cf94 Hadrons: genetic operators improvement 2016-12-13 19:02:05 +00:00
Guido Cossu
2fb92dbc6e Cleaning up previous debug lines 2016-12-13 07:53:43 +00:00
Guido Cossu
5c74b6028b Commit for debugging, lot of IO 2016-12-13 06:35:30 +00:00
Guido Cossu
e0be2b6e6c Adding a new tests for the Ls vec CG 2016-12-13 04:59:18 +00:00
Guido Cossu
ef72f322d2 consistency of tests 2016-12-13 02:24:20 +00:00
Azusa Yamaguchi
426197e446 Nc=3 2016-12-12 09:10:54 +00:00
Azusa Yamaguchi
99e2c1e666 Kernels options 2016-12-12 09:08:53 +00:00
Azusa Yamaguchi
1440565a10 Decrease verbosity 2016-12-12 09:08:04 +00:00
Azusa Yamaguchi
e9f0c0ea39 Staggered kernels options 2016-12-12 09:07:38 +00:00
Guido Cossu
7bc2065113 Adding report at the end of the DWF HMC tests 2016-12-12 04:21:34 +00:00
4a87486365 Hadrons: a bit of cleaning in the scheduler 2016-12-10 21:14:13 +01:00
Peter Boyle
fe187e9ed3 Compiles and passes under ZMobius with assembler 2016-12-10 00:47:48 +00:00
Peter Boyle
0091b50f49 Zmobius working -- not asm yet 2016-12-09 22:51:32 +00:00
Peter Boyle
fb8d4b2357 Lots of debug on performance Mobius 2016-12-08 17:28:28 +00:00
Peter Boyle
ff71a8e847 Ready for sim 2016-12-08 17:00:32 +00:00
Peter Boyle
83fa038bdf Streaming stores 2016-12-08 16:58:42 +00:00
Peter Boyle
7a61feb6d3 Allocator added with caching for Linux VM subsystem optimisation 2016-12-08 16:58:01 +00:00
Peter Boyle
69ae817d1c Updates for supporting Mobius better 2016-12-08 16:43:28 +00:00
Guido Cossu
2bd4233919 Completed testing of the HMC for Ls vectorised version (on AVX2) 2016-12-07 04:56:37 +00:00
Guido Cossu
143c70e29f Debugged the threaded version. Cleaning up 2016-12-07 04:40:25 +00:00
51322da6f8 Hadrons: genetic scheduler improvement 2016-12-07 09:00:45 +09:00
49c3eeb378 Hadrons: more verbose genetic parameters 2016-12-07 08:59:58 +09:00
c56707e003 useless debug message removed 2016-12-07 08:59:20 +09:00
Guido Cossu
b812d5e39c Added single threaded version of the derivative for the Ls vectorised DWF 2016-12-06 16:31:13 +00:00
5b3edf08a4 Hadrons: sequential gamma source 2016-12-06 12:13:19 +09:00
bd1d1cca34 Hadrons: code cleaning 2016-12-06 12:12:59 +09:00
646b11f5c2 Hadrons: exposing scheduler settings 2016-12-06 12:12:05 +09:00
a683a0f55a Hadrons: meson tests renamed spectrum 2016-12-06 12:11:44 +09:00
e6effcfd95 Hadrons: more contractions in the spectrum test 2016-12-05 17:41:58 +09:00
aa016f61b9 Hadrons: empty baryon contractions 2016-12-05 17:26:57 +09:00
d42a1b73c4 Hadrons: code cleaning 2016-12-05 17:26:36 +09:00
d292657ef7 Hadrons: more module templates 2016-12-05 17:26:17 +09:00
d1f7c6b94e Hadrons: templatisation of the fermion implementation 2016-12-05 16:47:29 +09:00
7ae734103e Hadrons: namespace macro to tackle GCC 5 bug 2016-12-05 14:29:32 +09:00
Guido Cossu
01480da0a8 Merge branch 'develop' into feature/hmc_generalise 2016-12-05 05:10:27 +00:00
7a1ac45679 Hadrons: configure.ac Linux typo 2016-12-05 14:00:10 +09:00
320268fe93 Hadrons: code cleaning 2016-12-05 13:57:34 +09:00
dd6fb140c5 Hadrons: big module reorganisation 2016-12-05 13:53:31 +09:00
0b4f680d28 Hadrons: meson run test 2016-12-05 11:44:58 +09:00
a69086ba1f Hadrons: application run minor fixes 2016-12-05 11:44:36 +09:00
7433eed274 Hadrons: module creation fix 2016-12-05 11:44:16 +09:00
ee5b1fe043 Hadrons: freeing object message fix 2016-12-05 09:08:45 +09:00
1540616b22 Hadrons: integer types cleanup 2016-12-05 08:53:48 +09:00
8190523e4c Hadrons: type fix in module creation 2016-12-02 11:04:34 +09:00
b5555d85a7 Hadrons: generelalised FImpl for actions 2016-12-02 11:04:15 +09:00
Peter Boyle
e27c6b217c Updating 2016-12-01 12:42:53 +00:00
9ad3d3453e Hadrons is now a library, the previous XML driven program is now a test 2016-12-01 21:36:29 +09:00
Peter Boyle
f7a6b8e5ed Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2016-12-01 11:39:52 +00:00
paboyle
6adf35da54 Faster Mobius 2016-12-01 11:39:04 +00:00
d8b716d2cd Hadrons: static initialisation fixed 2016-12-01 15:43:16 +09:00
Peter Boyle
cd01c1dbe9 Ls 16 more relevant 2016-11-30 22:11:10 +00:00
James Harrison
6ad73145bc Calculate Wilson loop average over multiple configurations. 2016-11-30 15:17:22 +00:00
paboyle
bd0430b34f Serialisation in malloc fixed 2016-11-29 22:27:55 +00:00
Azusa Yamaguchi
c097fd041a Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-11-29 13:44:17 +00:00
Azusa Yamaguchi
77fb25fb29 Push 5d tests 2016-11-29 13:43:56 +00:00
Azusa Yamaguchi
389e0a77bd Staggerd Fermion 5D 2016-11-29 13:13:56 +00:00
paboyle
2f92b4860b Test the full Mooee sector 2016-11-29 00:15:08 +00:00
paboyle
4704f2d009 Actions updated 2016-11-29 00:14:36 +00:00
Guido Cossu
ae9688e343 Reporting also the total mflops 2016-11-28 11:37:02 +00:00
43928846f2 first steps to make Hadrons a library 2016-11-28 16:02:15 +09:00
fabcd4179d Hadrons: propagator type coming from the fermion implementation 2016-11-28 14:02:10 +09:00
a8843c9af6 Code cleaning, the fermion implementation can be sepcified using the macro FIMPL 2016-11-27 16:47:22 +09:00
7a1a7a685e Merge branch 'feature/fft-opt' into feature/hadrons 2016-11-27 15:32:03 +09:00
Guido Cossu
1e44fd3094 Added some details on the mpi flags for Cray machines 2016-11-26 18:30:53 +00:00
Guido Cossu
d8258f0758 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2016-11-26 18:25:32 +00:00
Guido Cossu
6c0cc5676b Adding Eigen.inc to the gitignore 2016-11-26 18:25:12 +00:00
f7293f2ddb Merge pull request #69 from jch1g10/feature/qed-fvol 2016-11-26 07:04:04 +09:00
11dc0b398b Merge pull request #74 from Lanny91/develop 2016-11-26 07:01:51 +09:00
Lanny91
b18950f776 Added simd real divide test with QPX divide fixes 2016-11-25 13:21:33 +00:00
Lanny91
0acbf77bc6 Add QPX Div structure 2016-11-24 13:24:12 +00:00
3cdf945d84 Test_fftf fix 2016-11-24 09:10:03 +09:00
5833f247fa more FFt optimisations 2016-11-24 09:09:48 +09:00
Azusa Yamaguchi
95f43d27ae Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-11-22 13:49:22 +00:00
Azusa Yamaguchi
668ca57702 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-11-22 13:49:11 +00:00
a2cffb0304 AVXFMA target fixed 2016-11-21 17:47:18 +01:00
bafbac6ac4 Merge branch 'feature/gen-simd' into develop 2016-11-19 13:45:30 +01:00
595f1ce371 GEN SIMD build fix 2016-11-19 13:45:12 +01:00
6d7cde4eb4 README update 2016-11-19 13:17:35 +01:00
97cddda49e Merge branch 'feature/gen-simd' into feature/doxygen
# Conflicts:
#	Makefile.am
#	configure.ac
2016-11-19 13:11:13 +01:00
433afd36f5 Makefile rule for simple_* objects 2016-11-19 01:33:13 +01:00
b873504b90 fully generic SIMD 2016-11-19 01:32:39 +01:00
Guido Cossu
62749d05a6 Naming the scalar action 2016-11-17 12:26:20 +00:00
Guido Cossu
3834feb4b7 Adding action names 2016-11-16 16:46:49 +00:00
James Harrison
6b8ee7bae0 Merge branch 'feature/feynman-rules' into feature/qed-fvol 2016-11-15 13:08:08 +00:00
James Harrison
739c2308b5 Set imaginary part of stochastic QED field to zero using real() instead of conjugate(). 2016-11-15 13:07:52 +00:00
Guido Cossu
454302414d Small modif at the test hmc 2016-11-15 12:31:13 +00:00
042ae5b87c generic 256bits SIMD 2016-11-15 12:16:15 +00:00
James Harrison
a71b69389b QedFVol: calculate square Wilson loops up to 10x10 2016-11-14 18:23:04 +00:00
James Harrison
d49e502f53 Merge branch 'feature/feynman-rules' into feature/qed-fvol 2016-11-14 18:00:33 +00:00
James Harrison
92ec3404f8 Set imaginary part of stochastic QED field to zero after FFT into position space 2016-11-14 17:59:02 +00:00
James Harrison
f4ebea3381 QedFVol: add functions for computing spatial and timelike Wilson loops 2016-11-14 17:51:53 +00:00
James Harrison
cf167d0cd1 QedFVol: implement exponentiation of photon field 2016-11-14 17:02:29 +00:00
Guido Cossu
6f8b771a37 Adding date of the last commit 2016-11-10 18:52:00 +00:00
Guido Cossu
4e1ffdd17c Adding git info to the configure output 2016-11-10 18:44:36 +00:00
1aa695cd78 Hadrons: merge typo 2016-11-10 18:38:30 +00:00
Guido Cossu
a783282b8b Merge branch 'develop' into feature/hmc_generalise 2016-11-10 18:13:07 +00:00
Guido Cossu
19b85d8486 Some comments in the hmc files 2016-11-10 17:55:58 +00:00
paboyle
58f4950652 Merge branch 'release/v0.6.0' into develop 2016-11-09 12:44:00 +00:00
James Harrison
c30d96ea50 QedFVol: x86intrin.h namespace fix 2016-11-09 11:06:20 +00:00
13a8997789 Merge branch 'release/v0.6.0' into feature/hadrons
# Conflicts:
#	Makefile.am
2016-11-08 20:43:39 +00:00
James Harrison
7ffe17ada1 Merge branch 'feature/feynman-rules' into feature/qed-fvol 2016-11-08 14:52:43 +00:00
Azusa Yamaguchi
ee686a7d85 Compiles now 2016-11-03 16:58:23 +00:00
Azusa Yamaguchi
1c5b7a6be5 Staggered phases first cut, c1, c2, u0 2016-11-03 16:26:56 +00:00
330a9b3f4c Merge pull request #65 from jch1g10/feature/qed-fvol 2016-11-01 19:53:25 +00:00
James Harrison
28ff66a381 Merge branch 'feature/feynman-rules' into feature/qed-fvol 2016-11-01 16:07:46 +00:00
James Harrison
78c7bcee36 QedFVol: Change variables of type "double" to type "Real". 2016-11-01 16:06:05 +00:00
Azusa Yamaguchi
164d3691db Staggered 2016-11-01 14:24:22 +00:00
00a7b95631 Merge remote-tracking branch 'gh-james/feature/qed-fvol' into feature/qed-fvol 2016-10-31 18:46:23 +00:00
94d8321d01 Merge branch 'feature/feynman-rules' into feature/qed-fvol 2016-10-31 18:41:30 +00:00
James Harrison
ac24cc9f99 Merge branch 'feature/feynman-rules' into feature/qed-fvol 2016-10-29 11:05:26 +01:00
Guido Cossu
1d666771f9 Debugging the RNG, eliminate the barrier after broadcast 2016-10-26 16:08:23 +01:00
Guido Cossu
d50055cd96 Making the ILDG support optional 2016-10-26 09:48:01 +01:00
James Harrison
3ab4c8c0bb QedFVol: calculate plaquette and 2x2 Wilson loop of stochastic QED field 2016-10-25 13:32:02 +01:00
Guido Cossu
47c7159177 ILDG reader/writer works
Fill the xml header with the required information, todo.
2016-10-24 21:57:54 +01:00
Guido Cossu
f415db583a Adding ILDG format 2016-10-24 15:48:22 +01:00
Guido Cossu
f55c16f984 Adding a barrier in the RNG save 2016-10-24 11:02:14 +01:00
Guido Cossu
df67e013ca More debug output for the RNG 2016-10-22 13:34:17 +01:00
Guido Cossu
3e990c9d0a Reverting the broadcast change 2016-10-22 13:26:43 +01:00
Guido Cossu
4b740fc8fd Debugging the RNG state save 2016-10-22 13:06:00 +01:00
Guido Cossu
cccd14b09e Small cleanup 2016-10-21 17:20:54 +01:00
Guido Cossu
e6acffdfc2 Fixing the plaquette computation 2016-10-21 16:06:34 +01:00
26d124283e Merge branch 'feature/feynman-rules' into feature/qed-fvol 2016-10-21 15:23:31 +01:00
0d889b7041 QedFVol: first attempt at generating a QED field 2016-10-21 15:21:32 +01:00
ab31ad006a Merge branch 'feature/feynman-rules' into feature/qed-fvol 2016-10-21 14:42:18 +01:00
Guido Cossu
392130a537 Working on the 5d 2016-10-21 14:22:25 +01:00
Guido Cossu
deef2673b2 Separating the Lattice theories stub from the QCD.h file 2016-10-20 17:24:08 +01:00
Guido Cossu
977b0a6dd9 Merge branch 'develop' into feature/hmc_generalise 2016-10-20 17:04:41 +01:00
Guido Cossu
977d844394 Few modifications on stdout messages 2016-10-20 17:01:59 +01:00
6e4a06e180 qed-fvol: initial commit 2016-10-20 15:04:00 +01:00
Guido Cossu
590675e2ca Csum in hex format 2016-10-19 17:26:25 +01:00
Guido Cossu
8c65bdf6d3 Printing checksum for the RNG file 2016-10-19 16:56:11 +01:00
Guido Cossu
74f1ed3bc5 Adding some documentation for HMC 2016-10-19 10:51:13 +01:00
Guido Cossu
79270ef510 Added a test for EODWF Scaled Shamir with general HMC 2016-10-14 17:34:26 +01:00
Guido Cossu
e250e6b7bb Moving parameters outside of the HMCrunner 2016-10-14 17:22:32 +01:00
Guido Cossu
261342c15f Adding gh-pages 2016-10-13 11:51:25 +01:00
Guido Cossu
eda4dd622e Some more edit 2016-10-11 15:45:20 +01:00
Guido Cossu
c68a2b9637 Minor fix 2016-10-10 11:54:58 +01:00
Guido Cossu
293df6cd20 Generalising the HMCRunner and moving parameters to the user level 2016-10-10 11:49:55 +01:00
Guido Cossu
65f61bb3bf Reset QCD colours to 3 2016-10-10 09:46:17 +01:00
Guido Cossu
26b9740d53 Some fix for the GenericHMCrunner 2016-10-10 09:43:05 +01:00
cb02b7088f Merge branch 'develop' into feature/doxygen
# Conflicts:
#	configure.ac
2016-10-09 13:35:44 +01:00
Guido Cossu
6eb873dd96 Added scalar action phi^4
Check Norm2 output (Complex type assumption)
2016-10-07 17:28:46 +01:00
Guido Cossu
11b4c80b27 Added support for hmc and binary IO for a general field 2016-10-07 13:37:29 +01:00
Guido Cossu
c065e454c3 Adding Binrary IO, untested 2016-10-06 10:12:11 +01:00
Guido Cossu
d9b5fbd374 In the middle of adding a general binary writer 2016-10-04 11:24:08 +01:00
Guido Cossu
cfbc1a26b8 Now the gauge implementation has to take care of the Nexp 2016-10-03 16:20:06 +01:00
Guido Cossu
257f69f931 One more function to generalise the HMC integrator 2016-10-03 15:50:04 +01:00
Guido Cossu
e415260961 First cut on generalised HMC
Backward compatibility OK
2016-10-03 15:28:00 +01:00
a034e9901b Merge branch 'develop' into feature/hadrons 2016-09-20 13:49:33 +01:00
17c843700e missing doxygen.inc added 2016-08-05 15:38:21 +01:00
7b56f63a5c configure Doxygen output fix 2016-08-05 15:35:29 +01:00
b1cfb4d661 first try at a nicer Doxygen implementation 2016-08-05 15:29:18 +01:00
7ff7c7d90d Merge branch 'develop' into feature/hadrons 2016-08-04 16:22:10 +01:00
a2e9430abe Hadrons: fix after build system update 2016-08-03 17:14:32 +01:00
2485ef9c9c Merge branch 'feature/new-build' into feature/hadrons
# Conflicts:
#	Makefile.am
#	scripts/copyright
2016-08-03 16:49:16 +01:00
e0b7004f96 Merge branch 'master' into feature/hadrons 2016-07-01 15:54:34 +01:00
75fc295f6e Merge branch 'hadrons' into feature/hadrons 2016-06-14 17:51:15 +01:00
0b731b5d80 Hadrons: genetic scheduler parameter fix 2016-06-06 17:46:53 +01:00
8e2078be71 Hadrons: environment with fully generic object store 2016-06-06 17:45:37 +01:00
1826ed06a3 Merge branch 'master' into hadrons 2016-05-27 16:50:31 +01:00
3ff96c502b Merge branch 'master' into hadrons 2016-05-12 19:24:18 +01:00
15a0908bfc Merge branch 'master' into hadrons 2016-05-12 18:35:46 +01:00
bb2125962b Hadrons: finished implementation of 5D quarks 2016-05-12 18:34:42 +01:00
232fda5fe1 Hadrons: DWF action 2016-05-12 18:34:10 +01:00
2b31bf61ff Hadrons: message fix 2016-05-12 18:33:49 +01:00
afe5a94745 Hadrons: getModule with upcast 2016-05-12 18:33:36 +01:00
7ae667c767 Hadrons: module template update 2016-05-12 18:33:08 +01:00
07f0b69784 Merge branch 'master' into hadrons 2016-05-12 13:02:18 +01:00
5c06e89d69 Hadrons: code cleaning 2016-05-12 12:49:49 +01:00
3d75e0f0d1 Hadrons: MQuark fix 2016-05-12 12:02:15 +01:00
362f255100 Hadrons: module parameters can now be accessed from outside 2016-05-12 11:59:28 +01:00
3d78ed03ef Merge branch 'master' into hadrons 2016-05-11 15:21:46 +01:00
835003b3c5 Hadrons: removed useless gauge global parameters 2016-05-11 15:01:52 +01:00
328d213c9e Hadrons: FS case sensitivity fix 2016-05-11 14:44:14 +01:00
56a8d7a5bc Hadrons: build system fix 2016-05-11 10:27:14 +01:00
78198d1b04 Hadrons: size fix for module graph with one vertex 2016-05-10 20:13:28 +01:00
84fa2bdce6 Hadrons: modules moved in their own directory & utility script to add new modules 2016-05-10 20:12:48 +01:00
29dfe99e7c Hadrons: more scheduler optimizations 2016-05-10 19:19:38 +01:00
d604580e5a Hadrons: all objects/modules mapped to an integer address system to remove string operations from scheduling 2016-05-10 19:07:41 +01:00
7dfdc9baa0 Hadrons: lattice dynamic cast fix 2016-05-10 10:41:20 +01:00
9e986654e6 Hadrons: first version of the genetic scheduler 2016-05-09 14:49:06 +01:00
df3fbc477e Hadrons: code cleaning 2016-05-07 13:26:56 -07:00
bb580ae077 Hadrons: significant overhaul of the object registration system, previous version didn't allow dry runs 2016-05-07 13:19:38 -07:00
2c226753ab Hadrons: comments on graph theory algorithm complexity 2016-05-06 06:35:11 -07:00
ea0cea668e Hadrons: minor code cleaning 2016-05-05 16:13:14 -07:00
75cd72a421 Hadrons: memory management for fermion matrices, dynamic ownership in garbage collector 2016-05-04 19:11:03 -07:00
cbe52b0659 Hadrons: debug message removed 2016-05-04 12:20:33 -07:00
3aa6463ede Hadrons: general lattice store & a lot of code cleaning 2016-05-04 12:17:27 -07:00
312637e5fb Merge branch 'master' into hadrons
# Conflicts:
#	lib/Log.h
2016-05-04 12:16:18 -07:00
798d8f7340 Hadrons: Modules: better log messages 2016-05-03 18:17:58 -07:00
ba878724ce Hadrons: sources are now independent modules 2016-05-03 18:17:28 -07:00
b865dd9da8 Hadrons: solver renaming 2016-05-03 18:16:57 -07:00
8b313a35ac Hadrons: random and NERSC gauge configurations 2016-05-03 17:08:42 -07:00
02ec23cdad Hadrons: Fermion actions and gauge fields are modules now 2016-05-03 17:08:42 -07:00
6e83b6a203 Hadrons: namespace reorganisation, now everything is in Grid::Hadrons, the 'using Grid::operator<<' statement is used to prevent a very nasty compilation error with GCC. 2016-05-02 19:31:21 -07:00
48fcc34d72 CMeson: first implementation, still need proper output 2016-05-01 18:31:40 -07:00
d08d93c44c Merge branch 'master' into hadrons 2016-05-01 18:30:44 -07:00
0ab10cdedb Merge branch 'master' into hadrons 2016-05-01 16:08:05 -07:00
22653edf12 Merge branch 'master' into hadrons 2016-05-01 15:55:58 -07:00
12d2a95846 Merge branch 'master' into hadrons 2016-05-01 15:05:02 -07:00
978cf52f6b Merge branch 'master' into hadrons 2016-05-01 14:53:38 -07:00
63b730de80 Hadrons: for the moment, test with unit gauge 2016-05-01 14:50:57 -07:00
7905c5b8e5 Hadrons: Z2 source code fix 2016-05-01 14:49:45 -07:00
5e4b58ac40 Hadrons: Z2 source expression fix 2016-05-01 12:49:26 -07:00
468d8dc682 Merge branch 'master' into hadrons 2016-05-01 12:03:24 -07:00
beb11fd4ef Merge branch 'master' into hadrons 2016-05-01 10:32:24 -07:00
d7662b5175 Merge branch 'master' into hadrons 2016-04-30 00:24:59 -07:00
dc5f32e5f0 Merge branch 'master' into hadrons 2016-04-30 00:18:31 -07:00
1869d28429 Hadrons: first prototype with working inversions 2016-04-30 00:17:04 -07:00
405b175665 Merge branch 'master' into hadrons 2016-04-30 00:16:06 -07:00
e33b0f6ff7 cleaner output 2016-04-16 08:41:53 +01:00
9ee54e0db7 debug output removed 2016-04-16 08:41:28 +01:00
feae35d92c Hadrons: pass strings by value 2016-04-16 08:41:12 +01:00
3834d81181 Merge branch 'master' into hadrons 2016-04-14 15:15:45 +01:00
179e82b5ca Merge branch 'master' into hadrons 2016-03-08 12:55:33 +00:00
f2c59c8730 Merge branch 'master' into hadrons 2016-03-02 17:15:05 +00:00
fdd0848593 Hadrons: license text update 2016-02-25 12:07:21 +00:00
92f666905f copyright script update to 80 column text 2016-02-25 12:06:24 +00:00
5980fa8640 test implementation of DWF inverter 2016-02-25 11:56:16 +00:00
a0d8eb2c24 minor code cleaning 2016-02-23 16:33:00 +00:00
1e10b4571d fix after Grid update 2016-02-23 16:21:45 +00:00
02f8b84ac9 Merge branch 'master' into hadrons 2016-02-23 16:13:39 +00:00
cfd368596d Merge branch 'master' into hadrons 2016-02-22 15:25:02 +00:00
ae682674e0 Hadrons: first full implementation of the scheduler 2016-01-13 20:23:51 -08:00
17c43f49ac Hadrons: application class now take parameter file name as argument 2016-01-13 20:22:37 -08:00
30146e977c gitignore update 2016-01-13 20:20:43 -08:00
54eacec261 Hadrons: namespace std not used anymore in compiled sources 2015-12-23 14:30:33 +00:00
76c78f04e2 Hadrons: first complete prototype for run loop 2015-12-23 14:21:35 +00:00
379580cd89 Merge branch 'master' into hadrons 2015-12-23 14:20:22 +00:00
14a80733f9 Merge branch 'master' into hadrons 2015-12-08 13:57:53 +00:00
d4db009a58 Hadrons: starting scheduler implementation 2015-12-07 18:26:38 +00:00
20ce7e0270 Hadrons: algorithm to determine all possible topological ordering 2015-12-07 15:46:36 +00:00
bb195607ab Hadrons: fix in topological sort algorithm name 2015-12-02 19:40:11 +00:00
6f090e22c0 Hadrons: graph topological sort 2015-12-02 19:33:34 +00:00
339e983172 Merge branch 'master' into hadrons 2015-12-02 14:38:04 +00:00
4a7f3d1b7b Merge branch 'master' into hadrons
# Conflicts:
#	configure
2015-12-02 10:57:51 +00:00
c4e2202550 First graph class implementation and test 2015-11-05 14:28:14 +00:00
538b16610b First commit for measurement software 'Hadrons' 2015-10-27 17:33:18 +00:00
688 changed files with 121640 additions and 20880 deletions

30
.gitignore vendored
View File

@@ -9,6 +9,7 @@
################ ################
*~ *~
*# *#
*.sublime-*
# Precompiled Headers # # Precompiled Headers #
####################### #######################
@@ -47,7 +48,9 @@ Config.h.in
config.log config.log
config.status config.status
.deps .deps
*.inc Make.inc
eigen.inc
Eigen.inc
# http://www.gnu.org/software/autoconf # # http://www.gnu.org/software/autoconf #
######################################## ########################################
@@ -89,6 +92,8 @@ build*/*
##################### #####################
*.xcodeproj/* *.xcodeproj/*
build.sh build.sh
.vscode
*.code-workspace
# Eigen source # # Eigen source #
################ ################
@@ -102,3 +107,26 @@ lib/fftw/*
################## ##################
m4/lt* m4/lt*
m4/libtool.m4 m4/libtool.m4
# github pages #
################
gh-pages/
# Buck files #
##############
.buck*
buck-out
BUCK
make-bin-BUCK.sh
# generated sources #
#####################
lib/qcd/spin/gamma-gen/*.h
lib/qcd/spin/gamma-gen/*.cc
lib/version.h
# vs code editor files #
########################
.vscode/
.vscode/settings.json
settings.json

View File

@@ -7,64 +7,8 @@ cache:
matrix: matrix:
include: include:
- os: osx - os: osx
osx_image: xcode7.2 osx_image: xcode8.3
compiler: clang compiler: clang
- compiler: gcc
addons:
apt:
sources:
- ubuntu-toolchain-r-test
packages:
- g++-4.9
- libmpfr-dev
- libgmp-dev
- libmpc-dev
- libopenmpi-dev
- openmpi-bin
- binutils-dev
env: VERSION=-4.9
- compiler: gcc
addons:
apt:
sources:
- ubuntu-toolchain-r-test
packages:
- g++-5
- libmpfr-dev
- libgmp-dev
- libmpc-dev
- libopenmpi-dev
- openmpi-bin
- binutils-dev
env: VERSION=-5
- compiler: clang
addons:
apt:
sources:
- ubuntu-toolchain-r-test
packages:
- g++-4.8
- libmpfr-dev
- libgmp-dev
- libmpc-dev
- libopenmpi-dev
- openmpi-bin
- binutils-dev
env: CLANG_LINK=http://llvm.org/releases/3.8.0/clang+llvm-3.8.0-x86_64-linux-gnu-ubuntu-14.04.tar.xz
- compiler: clang
addons:
apt:
sources:
- ubuntu-toolchain-r-test
packages:
- g++-4.8
- libmpfr-dev
- libgmp-dev
- libmpc-dev
- libopenmpi-dev
- openmpi-bin
- binutils-dev
env: CLANG_LINK=http://llvm.org/releases/3.7.0/clang+llvm-3.7.0-x86_64-linux-gnu-ubuntu-14.04.tar.xz
before_install: before_install:
- export GRIDDIR=`pwd` - export GRIDDIR=`pwd`
@@ -73,13 +17,17 @@ before_install:
- if [[ "$TRAVIS_OS_NAME" == "linux" ]] && [[ "$CC" == "clang" ]]; then export LD_LIBRARY_PATH="${GRIDDIR}/clang/lib:${LD_LIBRARY_PATH}"; fi - if [[ "$TRAVIS_OS_NAME" == "linux" ]] && [[ "$CC" == "clang" ]]; then export LD_LIBRARY_PATH="${GRIDDIR}/clang/lib:${LD_LIBRARY_PATH}"; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew update; fi - if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew update; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install libmpc; fi - if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install libmpc; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install openmpi; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]] && [[ "$CC" == "gcc" ]]; then brew install gcc5; fi
install: install:
- export CWD=`pwd`
- echo $CWD
- export CC=$CC$VERSION - export CC=$CC$VERSION
- export CXX=$CXX$VERSION - export CXX=$CXX$VERSION
- echo $PATH - echo $PATH
- which autoconf
- autoconf --version
- which automake
- automake --version
- which $CC - which $CC
- $CC --version - $CC --version
- which $CXX - which $CXX
@@ -90,17 +38,23 @@ script:
- ./bootstrap.sh - ./bootstrap.sh
- mkdir build - mkdir build
- cd build - cd build
- ../configure --enable-precision=single --enable-simd=SSE4 --enable-comms=none - mkdir lime
- cd lime
- mkdir build
- cd build
- wget http://usqcd-software.github.io/downloads/c-lime/lime-1.3.2.tar.gz
- tar xf lime-1.3.2.tar.gz
- cd lime-1.3.2
- ./configure --prefix=$CWD/build/lime/install
- make -j4 - make -j4
- ./benchmarks/Benchmark_dwf --threads 1 - make install
- cd $CWD/build
- ../configure --enable-precision=single --enable-simd=SSE4 --enable-comms=none --with-lime=$CWD/build/lime/install
- make -j4
- ./benchmarks/Benchmark_dwf --threads 1 --debug-signals
- echo make clean - echo make clean
- ../configure --enable-precision=double --enable-simd=SSE4 --enable-comms=none - ../configure --enable-precision=double --enable-simd=SSE4 --enable-comms=none --with-lime=$CWD/build/lime/install
- make -j4 - make -j4
- ./benchmarks/Benchmark_dwf --threads 1 - ./benchmarks/Benchmark_dwf --threads 1 --debug-signals
- echo make clean - make check
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then export CXXFLAGS='-DMPI_UINT32_T=MPI_UNSIGNED -DMPI_UINT64_T=MPI_UNSIGNED_LONG'; fi
- ../configure --enable-precision=single --enable-simd=SSE4 --enable-comms=mpi-auto
- make -j4
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then mpirun.openmpi -n 2 ./benchmarks/Benchmark_dwf --threads 1 --mpi 2.1.1.1; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then mpirun -n 2 ./benchmarks/Benchmark_dwf --threads 1 --mpi 2.1.1.1; fi

View File

@@ -1,10 +1,21 @@
# additional include paths necessary to compile the C++ library # additional include paths necessary to compile the C++ library
SUBDIRS = lib benchmarks tests SUBDIRS = lib benchmarks tests extras
.PHONY: tests include $(top_srcdir)/doxygen.inc
tests: all bin_SCRIPTS=grid-config
$(MAKE) -C tests tests
BUILT_SOURCES = version.h
version.h:
echo "`git log -n 1 --format=format:"#define GITHASH \\"%H:%d\\"%n" HEAD`" > $(srcdir)/lib/version.h
.PHONY: bench check tests doxygen-run doxygen-doc $(DX_PS_GOAL) $(DX_PDF_GOAL)
tests-local: all
bench-local: all
check-local: all
AM_CXXFLAGS += -I$(top_builddir)/include AM_CXXFLAGS += -I$(top_builddir)/include
ACLOCAL_AMFLAGS = -I m4 ACLOCAL_AMFLAGS = -I m4

310
README.md
View File

@@ -1,41 +1,13 @@
# Grid # Grid [![Teamcity status](http://ci.cliath.ph.ed.ac.uk/app/rest/builds/aggregated/strob:(buildType:(affectedProject(id:Grid)),branch:name:develop)/statusIcon.svg)](http://ci.cliath.ph.ed.ac.uk/project.html?projectId=Grid&tab=projectOverview) [![Travis status](https://travis-ci.org/paboyle/Grid.svg?branch=develop)](https://travis-ci.org/paboyle/Grid)
<table>
<tr>
<td>Last stable release</td>
<td><a href="https://travis-ci.org/paboyle/Grid">
<img src="https://travis-ci.org/paboyle/Grid.svg?branch=master"></a>
</td>
</tr>
<tr>
<td>Development branch</td>
<td><a href="https://travis-ci.org/paboyle/Grid">
<img src="https://travis-ci.org/paboyle/Grid.svg?branch=develop"></a>
</td>
</tr>
</table>
**Data parallel C++ mathematical object library.** **Data parallel C++ mathematical object library.**
License: GPL v2. License: GPL v2.
Last update Nov 2016. Last update June 2017.
_Please do not send pull requests to the `master` branch which is reserved for releases._ _Please do not send pull requests to the `master` branch which is reserved for releases._
### Bug report
_To help us tracking and solving more efficiently issues with Grid, please report problems using the issue system of GitHub rather than sending emails to Grid developers._
When you file an issue, please go though the following checklist:
1. Check that the code is pointing to the `HEAD` of `develop` or any commit in `master` which is tagged with a version number.
2. Give a description of the target platform (CPU, network, compiler). Please give the full CPU part description, using for example `cat /proc/cpuinfo | grep 'model name' | uniq` (Linux) or `sysctl machdep.cpu.brand_string` (macOS) and the full output the `--version` option of your compiler.
3. Give the exact `configure` command used.
4. Attach `config.log`.
5. Attach `config.summary`.
6. Attach the output of `make V=1`.
7. Describe the issue and any previous attempt to solve it. If relevant, show how to reproduce the issue using a minimal working example.
### Description ### Description
@@ -58,13 +30,68 @@ optimally use MPI, OpenMP and SIMD parallelism under the hood. This is a signifi
for most programmers. for most programmers.
The layout transformations are parametrised by the SIMD vector length. This adapts according to the architecture. The layout transformations are parametrised by the SIMD vector length. This adapts according to the architecture.
Presently SSE4 (128 bit) AVX, AVX2, QPX (256 bit), IMCI, and AVX512 (512 bit) targets are supported (ARM NEON on the way). Presently SSE4, ARM NEON (128 bits) AVX, AVX2, QPX (256 bits), IMCI and AVX512 (512 bits) targets are supported.
These are presented as `vRealF`, `vRealD`, `vComplexF`, and `vComplexD` internal vector data types. These may be useful in themselves for other programmers. These are presented as `vRealF`, `vRealD`, `vComplexF`, and `vComplexD` internal vector data types.
The corresponding scalar types are named `RealF`, `RealD`, `ComplexF` and `ComplexD`. The corresponding scalar types are named `RealF`, `RealD`, `ComplexF` and `ComplexD`.
MPI, OpenMP, and SIMD parallelism are present in the library. MPI, OpenMP, and SIMD parallelism are present in the library.
Please see https://arxiv.org/abs/1512.03487 for more detail. Please see [this paper](https://arxiv.org/abs/1512.03487) for more detail.
### Compilers
Intel ICPC v16.0.3 and later
Clang v3.5 and later (need 3.8 and later for OpenMP)
GCC v4.9.x (recommended)
GCC v6.3 and later
### Important:
Some versions of GCC appear to have a bug under high optimisation (-O2, -O3).
The safety of these compiler versions cannot be guaranteed at this time. Follow Issue 100 for details and updates.
GCC v5.x
GCC v6.1, v6.2
### Bug report
_To help us tracking and solving more efficiently issues with Grid, please report problems using the issue system of GitHub rather than sending emails to Grid developers._
When you file an issue, please go though the following checklist:
1. Check that the code is pointing to the `HEAD` of `develop` or any commit in `master` which is tagged with a version number.
2. Give a description of the target platform (CPU, network, compiler). Please give the full CPU part description, using for example `cat /proc/cpuinfo | grep 'model name' | uniq` (Linux) or `sysctl machdep.cpu.brand_string` (macOS) and the full output the `--version` option of your compiler.
3. Give the exact `configure` command used.
4. Attach `config.log`.
5. Attach `grid.config.summary`.
6. Attach the output of `make V=1`.
7. Describe the issue and any previous attempt to solve it. If relevant, show how to reproduce the issue using a minimal working example.
### Required libraries
Grid requires:
[GMP](https://gmplib.org/),
[MPFR](http://www.mpfr.org/)
Bootstrapping grid downloads and uses for internal dense matrix (non-QCD operations) the Eigen library.
Grid optionally uses:
[HDF5](https://support.hdfgroup.org/HDF5/)
[LIME](http://usqcd-software.github.io/c-lime/) for ILDG and SciDAC file format support.
[FFTW](http://www.fftw.org) either generic version or via the Intel MKL library.
LAPACK either generic version or Intel MKL library.
### Quick start ### Quick start
First, start by cloning the repository: First, start by cloning the repository:
@@ -95,10 +122,10 @@ install Grid. Other options are detailed in the next section, you can also use `
`CXX`, `CXXFLAGS`, `LDFLAGS`, ... environment variables can be modified to `CXX`, `CXXFLAGS`, `LDFLAGS`, ... environment variables can be modified to
customise the build. customise the build.
Finally, you can build and install Grid: Finally, you can build, check, and install Grid:
``` bash ``` bash
make; make install make; make check; make install
``` ```
To minimise the build time, only the tests at the root of the `tests` directory are built by default. If you want to build tests in the sub-directory `<subdir>` you can execute: To minimise the build time, only the tests at the root of the `tests` directory are built by default. If you want to build tests in the sub-directory `<subdir>` you can execute:
@@ -116,13 +143,15 @@ If you want to build all the tests at once just use `make tests`.
- `--with-fftw=<path>`: look for FFTW in the UNIX prefix `<path>` - `--with-fftw=<path>`: look for FFTW in the UNIX prefix `<path>`
- `--enable-lapack[=<path>]`: enable LAPACK support in Lanczos eigensolver. A UNIX prefix containing the library can be specified (optional). - `--enable-lapack[=<path>]`: enable LAPACK support in Lanczos eigensolver. A UNIX prefix containing the library can be specified (optional).
- `--enable-mkl[=<path>]`: use Intel MKL for FFT (and LAPACK if enabled) routines. A UNIX prefix containing the library can be specified (optional). - `--enable-mkl[=<path>]`: use Intel MKL for FFT (and LAPACK if enabled) routines. A UNIX prefix containing the library can be specified (optional).
- `--enable-numa`: ??? - `--enable-numa`: enable NUMA first touch optimisation
- `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below. - `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below.
- `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes).
- `--enable-precision={single|double}`: set the default precision (default: `double`). - `--enable-precision={single|double}`: set the default precision (default: `double`).
- `--enable-precision=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below. - `--enable-precision=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below.
- `--enable-rng={ranlux48|mt19937}`: choose the RNG (default: `ranlux48 `). - `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `).
- `--disable-timers`: disable system dependent high-resolution timers. - `--disable-timers`: disable system dependent high-resolution timers.
- `--enable-chroma`: enable Chroma regression tests. - `--enable-chroma`: enable Chroma regression tests.
- `--enable-doxygen-doc`: enable the Doxygen documentation generation (build with `make doxygen-doc`)
### Possible communication interfaces ### Possible communication interfaces
@@ -133,10 +162,9 @@ The following options can be use with the `--enable-comms=` option to target dif
| `none` | no communications | | `none` | no communications |
| `mpi[-auto]` | MPI communications | | `mpi[-auto]` | MPI communications |
| `mpi3[-auto]` | MPI communications using MPI 3 shared memory | | `mpi3[-auto]` | MPI communications using MPI 3 shared memory |
| `mpi3l[-auto]` | MPI communications using MPI 3 shared memory and leader model |
| `shmem ` | Cray SHMEM communications | | `shmem ` | Cray SHMEM communications |
For the MPI interfaces the optional `-auto` suffix instructs the `configure` scripts to determine all the necessary compilation and linking flags. This is done by extracting the informations from the MPI wrapper specified in the environment variable `MPICXX` (if not specified `configure` will scan though a list of default names). For the MPI interfaces the optional `-auto` suffix instructs the `configure` scripts to determine all the necessary compilation and linking flags. This is done by extracting the informations from the MPI wrapper specified in the environment variable `MPICXX` (if not specified `configure` will scan though a list of default names). The `-auto` suffix is not supported by the Cray environment wrapper scripts. Use the standard versions instead.
### Possible SIMD types ### Possible SIMD types
@@ -151,20 +179,22 @@ The following options can be use with the `--enable-simd=` option to target diff
| `AVXFMA4` | AVX (256 bit) + FMA4 | | `AVXFMA4` | AVX (256 bit) + FMA4 |
| `AVX2` | AVX 2 (256 bit) | | `AVX2` | AVX 2 (256 bit) |
| `AVX512` | AVX 512 bit | | `AVX512` | AVX 512 bit |
| `QPX` | QPX (256 bit) | | `NEONv8` | [ARM NEON](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch07s03.html) (128 bit) |
| `QPX` | IBM QPX (256 bit) |
Alternatively, some CPU codenames can be directly used: Alternatively, some CPU codenames can be directly used:
| `<code>` | Description | | `<code>` | Description |
| ----------- | -------------------------------------- | | ----------- | -------------------------------------- |
| `KNC` | [Intel Xeon Phi codename Knights Corner](http://ark.intel.com/products/codename/57721/Knights-Corner) |
| `KNL` | [Intel Xeon Phi codename Knights Landing](http://ark.intel.com/products/codename/48999/Knights-Landing) | | `KNL` | [Intel Xeon Phi codename Knights Landing](http://ark.intel.com/products/codename/48999/Knights-Landing) |
| `SKL` | [Intel Skylake with AVX512 extensions](https://ark.intel.com/products/codename/37572/Skylake#@server) |
| `BGQ` | Blue Gene/Q | | `BGQ` | Blue Gene/Q |
#### Notes: #### Notes:
- We currently support AVX512 only for the Intel compiler. Support for GCC and clang will appear in future versions of Grid when the AVX512 support within GCC and clang will be more advanced. - We currently support AVX512 for the Intel compiler and GCC (KNL and SKL target). Support for clang will appear in future versions of Grid when the AVX512 support in the compiler will be more advanced.
- For BG/Q only [bgclang](http://trac.alcf.anl.gov/projects/llvm-bgq) is supported. We do not presently plan to support more compilers for this platform. - For BG/Q only [bgclang](http://trac.alcf.anl.gov/projects/llvm-bgq) is supported. We do not presently plan to support more compilers for this platform.
- BG/Q performances are currently rather poor. This is being investigated for future versions. - BG/Q performances are currently rather poor. This is being investigated for future versions.
- The vector size for the `GEN` target can be specified with the `configure` script option `--enable-gen-simd-width`.
### Build setup for Intel Knights Landing platform ### Build setup for Intel Knights Landing platform
@@ -173,21 +203,205 @@ The following configuration is recommended for the Intel Knights Landing platfor
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-precision=double\
--enable-simd=KNL \ --enable-simd=KNL \
--enable-comms=mpi-auto \ --enable-comms=mpi-auto \
--with-gmp=<path> \
--with-mpfr=<path> \
--enable-mkl \ --enable-mkl \
CXX=icpc MPICXX=mpiicpc CXX=icpc MPICXX=mpiicpc
``` ```
The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library.
where `<path>` is the UNIX prefix where GMP and MPFR are installed. If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-precision=double\
--enable-simd=KNL \ --enable-simd=KNL \
--enable-comms=mpi \ --enable-comms=mpi \
--with-gmp=<path> \
--with-mpfr=<path> \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
``` ```
If gmp and mpfr are NOT in standard places (/usr/) these flags may be needed:
``` bash
--with-gmp=<path> \
--with-mpfr=<path> \
```
where `<path>` is the UNIX prefix where GMP and MPFR are installed.
Knight's Landing with Intel Omnipath adapters with two adapters per node
presently performs better with use of more than one rank per node, using shared memory
for interior communication. This is the mpi3 communications implementation.
We recommend four ranks per node for best performance, but optimum is local volume dependent.
``` bash
../configure --enable-precision=double\
--enable-simd=KNL \
--enable-comms=mpi3-auto \
--enable-mkl \
CC=icpc MPICXX=mpiicpc
```
### Build setup for Intel Haswell Xeon platform
The following configuration is recommended for the Intel Haswell platform:
``` bash
../configure --enable-precision=double\
--enable-simd=AVX2 \
--enable-comms=mpi3-auto \
--enable-mkl \
CXX=icpc MPICXX=mpiicpc
```
The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library.
If gmp and mpfr are NOT in standard places (/usr/) these flags may be needed:
``` bash
--with-gmp=<path> \
--with-mpfr=<path> \
```
where `<path>` is the UNIX prefix where GMP and MPFR are installed.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
``` bash
../configure --enable-precision=double\
--enable-simd=AVX2 \
--enable-comms=mpi3 \
--enable-mkl \
CXX=CC CC=cc
```
Since Dual socket nodes are commonplace, we recommend MPI-3 as the default with the use of
one rank per socket. If using the Intel MPI library, threads should be pinned to NUMA domains using
```
export I_MPI_PIN=1
```
This is the default.
### Build setup for Intel Skylake Xeon platform
The following configuration is recommended for the Intel Skylake platform:
``` bash
../configure --enable-precision=double\
--enable-simd=AVX512 \
--enable-comms=mpi3 \
--enable-mkl \
CXX=mpiicpc
```
The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library.
If gmp and mpfr are NOT in standard places (/usr/) these flags may be needed:
``` bash
--with-gmp=<path> \
--with-mpfr=<path> \
```
where `<path>` is the UNIX prefix where GMP and MPFR are installed.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
``` bash
../configure --enable-precision=double\
--enable-simd=AVX512 \
--enable-comms=mpi3 \
--enable-mkl \
CXX=CC CC=cc
```
Since Dual socket nodes are commonplace, we recommend MPI-3 as the default with the use of
one rank per socket. If using the Intel MPI library, threads should be pinned to NUMA domains using
```
export I_MPI_PIN=1
```
This is the default.
#### Expected Skylake Gold 6148 dual socket (single prec, single node 20+20 cores) performance using NUMA MPI mapping):
mpirun -n 2 benchmarks/Benchmark_dwf --grid 16.16.16.16 --mpi 2.1.1.1 --cacheblocking 2.2.2.2 --dslash-asm --shm 1024 --threads 18
TBA
### Build setup for AMD EPYC / RYZEN
The AMD EPYC is a multichip module comprising 32 cores spread over four distinct chips each with 8 cores.
So, even with a single socket node there is a quad-chip module. Dual socket nodes with 64 cores total
are common. Each chip within the module exposes a separate NUMA domain.
There are four NUMA domains per socket and we recommend one MPI rank per NUMA domain.
MPI-3 is recommended with the use of four ranks per socket,
and 8 threads per rank.
The following configuration is recommended for the AMD EPYC platform.
``` bash
../configure --enable-precision=double\
--enable-simd=AVX2 \
--enable-comms=mpi3 \
CXX=mpicxx
```
If gmp and mpfr are NOT in standard places (/usr/) these flags may be needed:
``` bash
--with-gmp=<path> \
--with-mpfr=<path> \
```
where `<path>` is the UNIX prefix where GMP and MPFR are installed.
Using MPICH and g++ v4.9.2, best performance can be obtained using explicit GOMP_CPU_AFFINITY flags for each MPI rank.
This can be done by invoking MPI on a wrapper script omp_bind.sh to handle this.
It is recommended to run 8 MPI ranks on a single dual socket AMD EPYC, with 8 threads per rank using MPI3 and
shared memory to communicate within this node:
mpirun -np 8 ./omp_bind.sh ./Benchmark_dwf --mpi 2.2.2.1 --dslash-unroll --threads 8 --grid 16.16.16.16 --cacheblocking 4.4.4.4
Where omp_bind.sh does the following:
```
#!/bin/bash
numanode=` expr $PMI_RANK % 8 `
basecore=`expr $numanode \* 16`
core0=`expr $basecore + 0 `
core1=`expr $basecore + 2 `
core2=`expr $basecore + 4 `
core3=`expr $basecore + 6 `
core4=`expr $basecore + 8 `
core5=`expr $basecore + 10 `
core6=`expr $basecore + 12 `
core7=`expr $basecore + 14 `
export GOMP_CPU_AFFINITY="$core0 $core1 $core2 $core3 $core4 $core5 $core6 $core7"
echo GOMP_CUP_AFFINITY $GOMP_CPU_AFFINITY
$@
```
Performance:
#### Expected AMD EPYC 7601 dual socket (single prec, single node 32+32 cores) performance using NUMA MPI mapping):
mpirun -np 8 ./omp_bind.sh ./Benchmark_dwf --threads 8 --mpi 2.2.2.1 --dslash-unroll --grid 16.16.16.16 --cacheblocking 4.4.4.4
TBA
### Build setup for BlueGene/Q
To be written...
### Build setup for ARM Neon
To be written...
### Build setup for laptops, other compilers, non-cluster builds
Many versions of g++ and clang++ work with Grid, and involve merely replacing CXX (and MPICXX),
and omit the enable-mkl flag.
Single node builds are enabled with
```
--enable-comms=none
```
FFTW support that is not in the default search path may then enabled with
```
--with-fftw=<installpath>
```
BLAS will not be compiled in by default, and Lanczos will default to Eigen diagonalisation.

86
TODO
View File

@@ -1,6 +1,51 @@
TODO: TODO:
--------------- ---------------
Code item work list
a) namespaces & indentation
GRID_BEGIN_NAMESPACE();
GRID_END_NAMESPACE();
-- delete QCD namespace
b) GPU branch
- start branch
- Increase Macro use in core library support; prepare for change
- Audit volume of "device" code
- Virtual function audit
- Start port once Nvidia box is up
- Cut down volume of code for first port? How?
Physics item work list:
1)- BG/Q port and check ; Andrew says ok.
2)- Consistent linear solver flop count/rate -- PARTIAL, time but no flop/s yet
3)- Physical propagator interface
4)- Multigrid Wilson and DWF, compare to other Multigrid implementations
5)- HDCR resume
----------------------------
Recent DONE
-- RNG I/O in ILDG/SciDAC (minor)
-- Precision conversion and sort out localConvert <-- partial/easy
-- Conserved currents (Andrew)
-- Split grid
-- Christoph's local basis expansion Lanczos
-- MultiRHS with spread out extra dim -- Go through filesystem with SciDAC I/O ; <-- DONE ; bmark cori
-- Lanczos Remove DenseVector, DenseMatrix; Use Eigen instead. <-- DONE
-- GaugeFix into central location <-- DONE
-- Scidac and Ildg metadata handling <-- DONE
-- Binary I/O MPI2 IO <-- DONE
-- Binary I/O speed up & x-strips <-- DONE
-- Cut down the exterior overhead <-- DONE
-- Interior legs from SHM comms <-- DONE
-- Half-precision comms <-- DONE
-- Merge high precision reduction into develop <-- DONE
-- BlockCG, BCGrQ <-- DONE
-- multiRHS DWF; benchmark on Cori/BNL for comms elimination <-- DONE
-- slice* linalg routines for multiRHS, BlockCG
-----
* Forces; the UdSdU term in gauge force term is half of what I think it should * Forces; the UdSdU term in gauge force term is half of what I think it should
be. This is a consequence of taking ONLY the first term in: be. This is a consequence of taking ONLY the first term in:
@@ -21,16 +66,8 @@ TODO:
This means we must double the force in the Test_xxx_force routines, and is the origin of the factor of two. This means we must double the force in the Test_xxx_force routines, and is the origin of the factor of two.
This 2x is applied by hand in the fermion routines and in the Test_rect_force routine. This 2x is applied by hand in the fermion routines and in the Test_rect_force routine.
Policies:
* Link smearing/boundary conds; Policy class based implementation ; framework more in place
* Support different boundary conditions (finite temp, chem. potential ... ) * Support different boundary conditions (finite temp, chem. potential ... )
* Support different fermion representations?
- contained entirely within the integrator presently
- Sign of force term. - Sign of force term.
- Reversibility test. - Reversibility test.
@@ -41,11 +78,6 @@ Policies:
- Audit oIndex usage for cb behaviour - Audit oIndex usage for cb behaviour
- Rectangle gauge actions.
Iwasaki,
Symanzik,
... etc...
- Prepare multigrid for HMC. - Alternate setup schemes. - Prepare multigrid for HMC. - Alternate setup schemes.
- Support for ILDG --- ugly, not done - Support for ILDG --- ugly, not done
@@ -55,9 +87,11 @@ Policies:
- FFTnD ? - FFTnD ?
- Gparity; hand opt use template specialisation elegance to enable the optimised paths ? - Gparity; hand opt use template specialisation elegance to enable the optimised paths ?
- Gparity force term; Gparity (R)HMC. - Gparity force term; Gparity (R)HMC.
- Random number state save restore
- Mobius implementation clean up to rmove #if 0 stale code sequences - Mobius implementation clean up to rmove #if 0 stale code sequences
- CG -- profile carefully, kernel fusion, whole CG performance measurements. - CG -- profile carefully, kernel fusion, whole CG performance measurements.
================================================================ ================================================================
@@ -90,6 +124,7 @@ Insert/Extract
Not sure of status of this -- reverify. Things are working nicely now though. Not sure of status of this -- reverify. Things are working nicely now though.
* Make the Tensor types and Complex etc... play more nicely. * Make the Tensor types and Complex etc... play more nicely.
- TensorRemove is a hack, come up with a long term rationalised approach to Complex vs. Scalar<Scalar<Scalar<Complex > > > - TensorRemove is a hack, come up with a long term rationalised approach to Complex vs. Scalar<Scalar<Scalar<Complex > > >
QDP forces use of "toDouble" to get back to non tensor scalar. This role is presently taken TensorRemove, but I QDP forces use of "toDouble" to get back to non tensor scalar. This role is presently taken TensorRemove, but I
want to introduce a syntax that does not require this. want to introduce a syntax that does not require this.
@@ -112,6 +147,8 @@ Not sure of status of this -- reverify. Things are working nicely now though.
RECENT RECENT
--------------- ---------------
- Support different fermion representations? -- DONE
- contained entirely within the integrator presently
- Clean up HMC -- DONE - Clean up HMC -- DONE
- LorentzScalar<GaugeField> gets Gauge link type (cleaner). -- DONE - LorentzScalar<GaugeField> gets Gauge link type (cleaner). -- DONE
- Simplified the integrators a bit. -- DONE - Simplified the integrators a bit. -- DONE
@@ -123,6 +160,26 @@ RECENT
- Parallel io improvements -- DONE - Parallel io improvements -- DONE
- Plaquette and link trace checks into nersc reader from the Grid_nersc_io.cc test. -- DONE - Plaquette and link trace checks into nersc reader from the Grid_nersc_io.cc test. -- DONE
DONE:
- MultiArray -- MultiRHS done
- ConjugateGradientMultiShift -- DONE
- MCR -- DONE
- Remez -- Mike or Boost? -- DONE
- Proto (ET) -- DONE
- uBlas -- DONE ; Eigen
- Potentially Useful Boost libraries -- DONE ; Eigen
- Aligned allocator; memory pool -- DONE
- Multiprecision -- DONE
- Serialization -- DONE
- Regex -- Not needed
- Tokenize -- Why?
- Random number state save restore -- DONE
- Rectangle gauge actions. -- DONE
Iwasaki,
Symanzik,
... etc...
Done: Cayley, Partial , ContFrac force terms. Done: Cayley, Partial , ContFrac force terms.
DONE DONE
@@ -207,6 +264,7 @@ Done
FUNCTIONALITY: it pleases me to keep track of things I have done (keeps me arguably sane) FUNCTIONALITY: it pleases me to keep track of things I have done (keeps me arguably sane)
====================================================================================================== ======================================================================================================
* Link smearing/boundary conds; Policy class based implementation ; framework more in place -- DONE
* Command line args for geometry, simd, etc. layout. Is it necessary to have -- DONE * Command line args for geometry, simd, etc. layout. Is it necessary to have -- DONE
user pass these? Is this a QCD specific? user pass these? Is this a QCD specific?

View File

@@ -1,6 +1,5 @@
Version : 0.6.0 Version : 0.8.0
- AVX512, AVX2, AVX, SSE good - Clang 3.5 and above, ICPC v16 and above, GCC 6.3 and above recommended
- Clang 3.5 and above, ICPC v16 and above, GCC 4.9 and above - MPI and MPI3 comms optimisations for KNL and OPA finished
- MPI and MPI3 - Half precision comms
- HiRep, Smearing, Generic gauge group

108
benchmarks/Benchmark_IO.cc Normal file
View File

@@ -0,0 +1,108 @@
#include <Grid/Grid.h>
#ifdef HAVE_LIME
using namespace std;
using namespace Grid;
using namespace Grid::QCD;
#define MSG cout << GridLogMessage
#define SEP \
"============================================================================="
#ifndef BENCH_IO_LMAX
#define BENCH_IO_LMAX 40
#endif
typedef function<void(const string, LatticeFermion &)> WriterFn;
typedef function<void(LatticeFermion &, const string)> ReaderFn;
string filestem(const int l)
{
return "iobench_l" + to_string(l);
}
void limeWrite(const string filestem, LatticeFermion &vec)
{
emptyUserRecord record;
ScidacWriter binWriter(vec._grid->IsBoss());
binWriter.open(filestem + ".bin");
binWriter.writeScidacFieldRecord(vec, record);
binWriter.close();
}
void limeRead(LatticeFermion &vec, const string filestem)
{
emptyUserRecord record;
ScidacReader binReader;
binReader.open(filestem + ".bin");
binReader.readScidacFieldRecord(vec, record);
binReader.close();
}
void writeBenchmark(const int l, const WriterFn &write)
{
auto mpi = GridDefaultMpi();
auto simd = GridDefaultSimd(Nd, vComplex::Nsimd());
vector<int> latt = {l*mpi[0], l*mpi[1], l*mpi[2], l*mpi[3]};
unique_ptr<GridCartesian> gPt(SpaceTimeGrid::makeFourDimGrid(latt, simd, mpi));
GridCartesian *g = gPt.get();
GridParallelRNG rng(g);
LatticeFermion vec(g);
emptyUserRecord record;
ScidacWriter binWriter(g->IsBoss());
cout << "-- Local volume " << l << "^4" << endl;
random(rng, vec);
write(filestem(l), vec);
}
void readBenchmark(const int l, const ReaderFn &read)
{
auto mpi = GridDefaultMpi();
auto simd = GridDefaultSimd(Nd, vComplex::Nsimd());
vector<int> latt = {l*mpi[0], l*mpi[1], l*mpi[2], l*mpi[3]};
unique_ptr<GridCartesian> gPt(SpaceTimeGrid::makeFourDimGrid(latt, simd, mpi));
GridCartesian *g = gPt.get();
LatticeFermion vec(g);
emptyUserRecord record;
ScidacReader binReader;
cout << "-- Local volume " << l << "^4" << endl;
read(vec, filestem(l));
}
int main (int argc, char ** argv)
{
Grid_init(&argc,&argv);
auto simd = GridDefaultSimd(Nd,vComplex::Nsimd());
auto mpi = GridDefaultMpi();
int64_t threads = GridThread::GetThreads();
MSG << "Grid is setup to use " << threads << " threads" << endl;
MSG << SEP << endl;
MSG << "Benchmark Lime write" << endl;
MSG << SEP << endl;
for (int l = 4; l <= BENCH_IO_LMAX; l += 2)
{
writeBenchmark(l, limeWrite);
}
MSG << "Benchmark Lime read" << endl;
MSG << SEP << endl;
for (int l = 4; l <= BENCH_IO_LMAX; l += 2)
{
readBenchmark(l, limeRead);
}
Grid_finalize();
return EXIT_SUCCESS;
}
#else
int main (int argc, char ** argv)
{
return EXIT_SUCCESS;
}
#endif

807
benchmarks/Benchmark_ITT.cc Normal file
View File

@@ -0,0 +1,807 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: ./benchmarks/Benchmark_memory_bandwidth.cc
Copyright (C) 2015
Author: Peter Boyle <paboyle@ph.ed.ac.uk>
Author: paboyle <paboyle@ph.ed.ac.uk>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Grid.h>
using namespace std;
using namespace Grid;
using namespace Grid::QCD;
typedef WilsonFermion5D<DomainWallVec5dImplR> WilsonFermion5DR;
typedef WilsonFermion5D<DomainWallVec5dImplF> WilsonFermion5DF;
typedef WilsonFermion5D<DomainWallVec5dImplD> WilsonFermion5DD;
std::vector<int> L_list;
std::vector<int> Ls_list;
std::vector<double> mflop_list;
double mflop_ref;
double mflop_ref_err;
int NN_global;
struct time_statistics{
double mean;
double err;
double min;
double max;
void statistics(std::vector<double> v){
double sum = std::accumulate(v.begin(), v.end(), 0.0);
mean = sum / v.size();
std::vector<double> diff(v.size());
std::transform(v.begin(), v.end(), diff.begin(), [=](double x) { return x - mean; });
double sq_sum = std::inner_product(diff.begin(), diff.end(), diff.begin(), 0.0);
err = std::sqrt(sq_sum / (v.size()*(v.size() - 1)));
auto result = std::minmax_element(v.begin(), v.end());
min = *result.first;
max = *result.second;
}
};
void comms_header(){
std::cout <<GridLogMessage << " L "<<"\t"<<" Ls "<<"\t"
<<std::setw(11)<<"bytes"<<"MB/s uni (err/min/max)"<<"\t\t"<<"MB/s bidi (err/min/max)"<<std::endl;
};
Gamma::Algebra Gmu [] = {
Gamma::Algebra::GammaX,
Gamma::Algebra::GammaY,
Gamma::Algebra::GammaZ,
Gamma::Algebra::GammaT
};
struct controls {
int Opt;
int CommsOverlap;
Grid::CartesianCommunicator::CommunicatorPolicy_t CommsAsynch;
// int HugePages;
};
class Benchmark {
public:
static void Decomposition (void ) {
int threads = GridThread::GetThreads();
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Grid is setup to use "<<threads<<" threads"<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage<<"Grid Default Decomposition patterns\n";
std::cout<<GridLogMessage<<"\tOpenMP threads : "<<GridThread::GetThreads()<<std::endl;
std::cout<<GridLogMessage<<"\tMPI tasks : "<<GridCmdVectorIntToString(GridDefaultMpi())<<std::endl;
std::cout<<GridLogMessage<<"\tvReal : "<<sizeof(vReal )*8 <<"bits ; " <<GridCmdVectorIntToString(GridDefaultSimd(4,vReal::Nsimd()))<<std::endl;
std::cout<<GridLogMessage<<"\tvRealF : "<<sizeof(vRealF)*8 <<"bits ; " <<GridCmdVectorIntToString(GridDefaultSimd(4,vRealF::Nsimd()))<<std::endl;
std::cout<<GridLogMessage<<"\tvRealD : "<<sizeof(vRealD)*8 <<"bits ; " <<GridCmdVectorIntToString(GridDefaultSimd(4,vRealD::Nsimd()))<<std::endl;
std::cout<<GridLogMessage<<"\tvComplex : "<<sizeof(vComplex )*8 <<"bits ; " <<GridCmdVectorIntToString(GridDefaultSimd(4,vComplex::Nsimd()))<<std::endl;
std::cout<<GridLogMessage<<"\tvComplexF : "<<sizeof(vComplexF)*8 <<"bits ; " <<GridCmdVectorIntToString(GridDefaultSimd(4,vComplexF::Nsimd()))<<std::endl;
std::cout<<GridLogMessage<<"\tvComplexD : "<<sizeof(vComplexD)*8 <<"bits ; " <<GridCmdVectorIntToString(GridDefaultSimd(4,vComplexD::Nsimd()))<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
}
static void Comms(void)
{
int Nloop=200;
int nmu=0;
int maxlat=32;
std::vector<int> simd_layout = GridDefaultSimd(Nd,vComplexD::Nsimd());
std::vector<int> mpi_layout = GridDefaultMpi();
for(int mu=0;mu<Nd;mu++) if (mpi_layout[mu]>1) nmu++;
std::vector<double> t_time(Nloop);
time_statistics timestat;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Benchmarking threaded STENCIL halo exchange in "<<nmu<<" dimensions"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
comms_header();
for(int lat=4;lat<=maxlat;lat+=4){
for(int Ls=8;Ls<=8;Ls*=2){
std::vector<int> latt_size ({lat*mpi_layout[0],
lat*mpi_layout[1],
lat*mpi_layout[2],
lat*mpi_layout[3]});
GridCartesian Grid(latt_size,simd_layout,mpi_layout);
RealD Nrank = Grid._Nprocessors;
RealD Nnode = Grid.NodeCount();
RealD ppn = Nrank/Nnode;
std::vector<HalfSpinColourVectorD *> xbuf(8);
std::vector<HalfSpinColourVectorD *> rbuf(8);
Grid.ShmBufferFreeAll();
for(int d=0;d<8;d++){
xbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
rbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
bzero((void *)xbuf[d],lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
bzero((void *)rbuf[d],lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
}
int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD);
int ncomm;
double dbytes;
std::vector<double> times(Nloop);
for(int i=0;i<Nloop;i++){
double start=usecond();
dbytes=0;
ncomm=0;
#ifdef GRID_OMP
#pragma omp parallel for num_threads(Grid::CartesianCommunicator::nCommThreads)
#endif
for(int dir=0;dir<8;dir++){
double tbytes;
int mu =dir % 4;
if (mpi_layout[mu]>1 ) {
int xmit_to_rank;
int recv_from_rank;
if ( dir == mu ) {
int comm_proc=1;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
} else {
int comm_proc = mpi_layout[mu]-1;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
}
#ifdef GRID_OMP
int tid = omp_get_thread_num();
#else
int tid = dir;
#endif
tbytes= Grid.StencilSendToRecvFrom((void *)&xbuf[dir][0], xmit_to_rank,
(void *)&rbuf[dir][0], recv_from_rank,
bytes,tid);
#ifdef GRID_OMP
#pragma omp atomic
#endif
ncomm++;
#ifdef GRID_OMP
#pragma omp atomic
#endif
dbytes+=tbytes;
}
}
Grid.Barrier();
double stop=usecond();
t_time[i] = stop-start; // microseconds
}
timestat.statistics(t_time);
// for(int i=0;i<t_time.size();i++){
// std::cout << i<<" "<<t_time[i]<<std::endl;
// }
dbytes=dbytes*ppn;
double xbytes = dbytes*0.5;
double rbytes = dbytes*0.5;
double bidibytes = dbytes;
std::cout<<GridLogMessage << std::setw(4) << lat<<"\t"<<Ls<<"\t"
<<std::setw(11) << bytes<< std::fixed << std::setprecision(1) << std::setw(7)
<<std::right<< xbytes/timestat.mean<<" "<< xbytes*timestat.err/(timestat.mean*timestat.mean)<< " "
<<xbytes/timestat.max <<" "<< xbytes/timestat.min
<< "\t\t"<<std::setw(7)<< bidibytes/timestat.mean<< " " << bidibytes*timestat.err/(timestat.mean*timestat.mean) << " "
<< bidibytes/timestat.max << " " << bidibytes/timestat.min << std::endl;
}
}
return;
}
static void Memory(void)
{
const int Nvec=8;
typedef Lattice< iVector< vReal,Nvec> > LatticeVec;
typedef iVector<vReal,Nvec> Vec;
std::vector<int> simd_layout = GridDefaultSimd(Nd,vReal::Nsimd());
std::vector<int> mpi_layout = GridDefaultMpi();
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Benchmarking a*x + y bandwidth"<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s"<<"\t\t"<<"Gflop/s"<<"\t\t seconds"<< "\t\tGB/s / node"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
uint64_t NP;
uint64_t NN;
uint64_t lmax=48;
#define NLOOP (100*lmax*lmax*lmax*lmax/lat/lat/lat/lat)
GridSerialRNG sRNG; sRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
for(int lat=8;lat<=lmax;lat+=4){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int64_t vol= latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
GridCartesian Grid(latt_size,simd_layout,mpi_layout);
NP= Grid.RankCount();
NN =Grid.NodeCount();
Vec rn ; random(sRNG,rn);
LatticeVec z(&Grid); z=rn;
LatticeVec x(&Grid); x=rn;
LatticeVec y(&Grid); y=rn;
double a=2.0;
uint64_t Nloop=NLOOP;
double start=usecond();
for(int i=0;i<Nloop;i++){
z=a*x-y;
x._odata[0]=z._odata[0]; // force serial dependency to prevent optimise away
y._odata[4]=z._odata[4];
}
double stop=usecond();
double time = (stop-start)/Nloop*1000;
double flops=vol*Nvec*2;// mul,add
double bytes=3.0*vol*Nvec*sizeof(Real);
std::cout<<GridLogMessage<<std::setprecision(3)
<< lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t"<<flops/time<<"\t\t"<<(stop-start)/1000./1000.
<< "\t\t"<< bytes/time/NN <<std::endl;
}
};
static double DWF5(int Ls,int L)
{
RealD mass=0.1;
RealD M5 =1.8;
double mflops;
double mflops_best = 0;
double mflops_worst= 0;
std::vector<double> mflops_all;
///////////////////////////////////////////////////////
// Set/Get the layout & grid size
///////////////////////////////////////////////////////
int threads = GridThread::GetThreads();
std::vector<int> mpi = GridDefaultMpi(); assert(mpi.size()==4);
std::vector<int> local({L,L,L,L});
GridCartesian * TmpGrid = SpaceTimeGrid::makeFourDimGrid(std::vector<int>({64,64,64,64}),
GridDefaultSimd(Nd,vComplex::Nsimd()),GridDefaultMpi());
uint64_t NP = TmpGrid->RankCount();
uint64_t NN = TmpGrid->NodeCount();
NN_global=NN;
uint64_t SHM=NP/NN;
std::vector<int> internal;
if ( SHM == 1 ) internal = std::vector<int>({1,1,1,1});
else if ( SHM == 2 ) internal = std::vector<int>({2,1,1,1});
else if ( SHM == 4 ) internal = std::vector<int>({2,2,1,1});
else if ( SHM == 8 ) internal = std::vector<int>({2,2,2,1});
else assert(0);
std::vector<int> nodes({mpi[0]/internal[0],mpi[1]/internal[1],mpi[2]/internal[2],mpi[3]/internal[3]});
std::vector<int> latt4({local[0]*nodes[0],local[1]*nodes[1],local[2]*nodes[2],local[3]*nodes[3]});
///////// Welcome message ////////////
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << "Benchmark DWF Ls vec on "<<L<<"^4 local volume "<<std::endl;
std::cout<<GridLogMessage << "* Global volume : "<<GridCmdVectorIntToString(latt4)<<std::endl;
std::cout<<GridLogMessage << "* Ls : "<<Ls<<std::endl;
std::cout<<GridLogMessage << "* MPI ranks : "<<GridCmdVectorIntToString(mpi)<<std::endl;
std::cout<<GridLogMessage << "* Intranode : "<<GridCmdVectorIntToString(internal)<<std::endl;
std::cout<<GridLogMessage << "* nodes : "<<GridCmdVectorIntToString(nodes)<<std::endl;
std::cout<<GridLogMessage << "* Using "<<threads<<" threads"<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
///////// Lattice Init ////////////
GridCartesian * UGrid = SpaceTimeGrid::makeFourDimGrid(latt4, GridDefaultSimd(Nd,vComplex::Nsimd()),GridDefaultMpi());
GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid);
GridCartesian * sUGrid = SpaceTimeGrid::makeFourDimDWFGrid(latt4,GridDefaultMpi());
GridRedBlackCartesian * sUrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(sUGrid);
GridCartesian * sFGrid = SpaceTimeGrid::makeFiveDimDWFGrid(Ls,UGrid);
GridRedBlackCartesian * sFrbGrid = SpaceTimeGrid::makeFiveDimDWFRedBlackGrid(Ls,UGrid);
///////// RNG Init ////////////
std::vector<int> seeds4({1,2,3,4});
std::vector<int> seeds5({5,6,7,8});
GridParallelRNG RNG4(UGrid); RNG4.SeedFixedIntegers(seeds4);
GridParallelRNG RNG5(sFGrid); RNG5.SeedFixedIntegers(seeds5);
std::cout << GridLogMessage << "Initialised RNGs" << std::endl;
///////// Source preparation ////////////
LatticeFermion src (sFGrid); random(RNG5,src);
LatticeFermion tmp (sFGrid);
RealD N2 = 1.0/::sqrt(norm2(src));
src = src*N2;
LatticeGaugeField Umu(UGrid); SU3::HotConfiguration(RNG4,Umu);
WilsonFermion5DR sDw(Umu,*sFGrid,*sFrbGrid,*sUGrid,*sUrbGrid,M5);
LatticeFermion src_e (sFrbGrid);
LatticeFermion src_o (sFrbGrid);
LatticeFermion r_e (sFrbGrid);
LatticeFermion r_o (sFrbGrid);
LatticeFermion r_eo (sFGrid);
LatticeFermion err (sFGrid);
{
pickCheckerboard(Even,src_e,src);
pickCheckerboard(Odd,src_o,src);
#if defined(AVX512)
const int num_cases = 6;
std::string fmt("A/S ; A/O ; U/S ; U/O ; G/S ; G/O ");
#else
const int num_cases = 4;
std::string fmt("U/S ; U/O ; G/S ; G/O ");
#endif
controls Cases [] = {
#ifdef AVX512
{ QCD::WilsonKernelsStatic::OptInlineAsm , QCD::WilsonKernelsStatic::CommsThenCompute ,CartesianCommunicator::CommunicatorPolicySequential },
{ QCD::WilsonKernelsStatic::OptInlineAsm , QCD::WilsonKernelsStatic::CommsAndCompute ,CartesianCommunicator::CommunicatorPolicySequential },
#endif
{ QCD::WilsonKernelsStatic::OptHandUnroll, QCD::WilsonKernelsStatic::CommsThenCompute ,CartesianCommunicator::CommunicatorPolicySequential },
{ QCD::WilsonKernelsStatic::OptHandUnroll, QCD::WilsonKernelsStatic::CommsAndCompute ,CartesianCommunicator::CommunicatorPolicySequential },
{ QCD::WilsonKernelsStatic::OptGeneric , QCD::WilsonKernelsStatic::CommsThenCompute ,CartesianCommunicator::CommunicatorPolicySequential },
{ QCD::WilsonKernelsStatic::OptGeneric , QCD::WilsonKernelsStatic::CommsAndCompute ,CartesianCommunicator::CommunicatorPolicySequential }
};
for(int c=0;c<num_cases;c++) {
QCD::WilsonKernelsStatic::Comms = Cases[c].CommsOverlap;
QCD::WilsonKernelsStatic::Opt = Cases[c].Opt;
CartesianCommunicator::SetCommunicatorPolicy(Cases[c].CommsAsynch);
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsAndCompute ) std::cout << GridLogMessage<< "* Using Overlapped Comms/Compute" <<std::endl;
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsThenCompute) std::cout << GridLogMessage<< "* Using sequential comms compute" <<std::endl;
if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl;
if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
int nwarm = 100;
uint64_t ncall = 1000;
double t0=usecond();
sFGrid->Barrier();
for(int i=0;i<nwarm;i++){
sDw.DhopEO(src_o,r_e,DaggerNo);
}
sFGrid->Barrier();
double t1=usecond();
sDw.ZeroCounters();
time_statistics timestat;
std::vector<double> t_time(ncall);
for(uint64_t i=0;i<ncall;i++){
t0=usecond();
sDw.DhopEO(src_o,r_e,DaggerNo);
t1=usecond();
t_time[i] = t1-t0;
}
sFGrid->Barrier();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=(1344.0*volume)/2;
double mf_hi, mf_lo, mf_err;
timestat.statistics(t_time);
mf_hi = flops/timestat.min;
mf_lo = flops/timestat.max;
mf_err= flops/timestat.min * timestat.err/timestat.mean;
mflops = flops/timestat.mean;
mflops_all.push_back(mflops);
if ( mflops_best == 0 ) mflops_best = mflops;
if ( mflops_worst== 0 ) mflops_worst= mflops;
if ( mflops>mflops_best ) mflops_best = mflops;
if ( mflops<mflops_worst) mflops_worst= mflops;
std::cout<<GridLogMessage << std::fixed << std::setprecision(1)<<"sDeo mflop/s = "<< mflops << " ("<<mf_err<<") " << mf_lo<<"-"<<mf_hi <<std::endl;
std::cout<<GridLogMessage << std::fixed << std::setprecision(1)<<"sDeo mflop/s per rank "<< mflops/NP<<std::endl;
std::cout<<GridLogMessage << std::fixed << std::setprecision(1)<<"sDeo mflop/s per node "<< mflops/NN<<std::endl;
sDw.Report();
}
double robust = mflops_worst/mflops_best;;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << L<<"^4 x "<<Ls<< " sDeo Best mflop/s = "<< mflops_best << " ; " << mflops_best/NN<<" per node " <<std::endl;
std::cout<<GridLogMessage << L<<"^4 x "<<Ls<< " sDeo Worst mflop/s = "<< mflops_worst<< " ; " << mflops_worst/NN<<" per node " <<std::endl;
std::cout<<GridLogMessage <<std::setprecision(3)<< L<<"^4 x "<<Ls<< " Performance Robustness = "<< robust <<std::endl;
std::cout<<GridLogMessage <<fmt << std::endl;
std::cout<<GridLogMessage;
for(int i=0;i<mflops_all.size();i++){
std::cout<<mflops_all[i]/NN<<" ; " ;
}
std::cout<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
}
return mflops_best;
}
static double DWF(int Ls,int L, double & robust)
{
RealD mass=0.1;
RealD M5 =1.8;
double mflops;
double mflops_best = 0;
double mflops_worst= 0;
std::vector<double> mflops_all;
///////////////////////////////////////////////////////
// Set/Get the layout & grid size
///////////////////////////////////////////////////////
int threads = GridThread::GetThreads();
std::vector<int> mpi = GridDefaultMpi(); assert(mpi.size()==4);
std::vector<int> local({L,L,L,L});
GridCartesian * TmpGrid = SpaceTimeGrid::makeFourDimGrid(std::vector<int>({64,64,64,64}),
GridDefaultSimd(Nd,vComplex::Nsimd()),GridDefaultMpi());
uint64_t NP = TmpGrid->RankCount();
uint64_t NN = TmpGrid->NodeCount();
NN_global=NN;
uint64_t SHM=NP/NN;
std::vector<int> internal;
if ( SHM == 1 ) internal = std::vector<int>({1,1,1,1});
else if ( SHM == 2 ) internal = std::vector<int>({2,1,1,1});
else if ( SHM == 4 ) internal = std::vector<int>({2,2,1,1});
else if ( SHM == 8 ) internal = std::vector<int>({2,2,2,1});
else assert(0);
std::vector<int> nodes({mpi[0]/internal[0],mpi[1]/internal[1],mpi[2]/internal[2],mpi[3]/internal[3]});
std::vector<int> latt4({local[0]*nodes[0],local[1]*nodes[1],local[2]*nodes[2],local[3]*nodes[3]});
///////// Welcome message ////////////
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << "Benchmark DWF on "<<L<<"^4 local volume "<<std::endl;
std::cout<<GridLogMessage << "* Global volume : "<<GridCmdVectorIntToString(latt4)<<std::endl;
std::cout<<GridLogMessage << "* Ls : "<<Ls<<std::endl;
std::cout<<GridLogMessage << "* MPI ranks : "<<GridCmdVectorIntToString(mpi)<<std::endl;
std::cout<<GridLogMessage << "* Intranode : "<<GridCmdVectorIntToString(internal)<<std::endl;
std::cout<<GridLogMessage << "* nodes : "<<GridCmdVectorIntToString(nodes)<<std::endl;
std::cout<<GridLogMessage << "* Using "<<threads<<" threads"<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
///////// Lattice Init ////////////
GridCartesian * UGrid = SpaceTimeGrid::makeFourDimGrid(latt4, GridDefaultSimd(Nd,vComplex::Nsimd()),GridDefaultMpi());
GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid);
GridCartesian * FGrid = SpaceTimeGrid::makeFiveDimGrid(Ls,UGrid);
GridRedBlackCartesian * FrbGrid = SpaceTimeGrid::makeFiveDimRedBlackGrid(Ls,UGrid);
///////// RNG Init ////////////
std::vector<int> seeds4({1,2,3,4});
std::vector<int> seeds5({5,6,7,8});
GridParallelRNG RNG4(UGrid); RNG4.SeedFixedIntegers(seeds4);
GridParallelRNG RNG5(FGrid); RNG5.SeedFixedIntegers(seeds5);
std::cout << GridLogMessage << "Initialised RNGs" << std::endl;
///////// Source preparation ////////////
LatticeFermion src (FGrid); random(RNG5,src);
LatticeFermion ref (FGrid);
LatticeFermion tmp (FGrid);
RealD N2 = 1.0/::sqrt(norm2(src));
src = src*N2;
LatticeGaugeField Umu(UGrid); SU3::HotConfiguration(RNG4,Umu);
DomainWallFermionR Dw(Umu,*FGrid,*FrbGrid,*UGrid,*UrbGrid,mass,M5);
////////////////////////////////////
// Naive wilson implementation
////////////////////////////////////
{
LatticeGaugeField Umu5d(FGrid);
std::vector<LatticeColourMatrix> U(4,FGrid);
for(int ss=0;ss<Umu._grid->oSites();ss++){
for(int s=0;s<Ls;s++){
Umu5d._odata[Ls*ss+s] = Umu._odata[ss];
}
}
ref = zero;
for(int mu=0;mu<Nd;mu++){
U[mu] = PeekIndex<LorentzIndex>(Umu5d,mu);
}
for(int mu=0;mu<Nd;mu++){
tmp = U[mu]*Cshift(src,mu+1,1);
ref=ref + tmp - Gamma(Gmu[mu])*tmp;
tmp =adj(U[mu])*src;
tmp =Cshift(tmp,mu+1,-1);
ref=ref + tmp + Gamma(Gmu[mu])*tmp;
}
ref = -0.5*ref;
}
LatticeFermion src_e (FrbGrid);
LatticeFermion src_o (FrbGrid);
LatticeFermion r_e (FrbGrid);
LatticeFermion r_o (FrbGrid);
LatticeFermion r_eo (FGrid);
LatticeFermion err (FGrid);
{
pickCheckerboard(Even,src_e,src);
pickCheckerboard(Odd,src_o,src);
#if defined(AVX512)
const int num_cases = 6;
std::string fmt("A/S ; A/O ; U/S ; U/O ; G/S ; G/O ");
#else
const int num_cases = 4;
std::string fmt("U/S ; U/O ; G/S ; G/O ");
#endif
controls Cases [] = {
#ifdef AVX512
{ QCD::WilsonKernelsStatic::OptInlineAsm , QCD::WilsonKernelsStatic::CommsThenCompute ,CartesianCommunicator::CommunicatorPolicySequential },
{ QCD::WilsonKernelsStatic::OptInlineAsm , QCD::WilsonKernelsStatic::CommsAndCompute ,CartesianCommunicator::CommunicatorPolicySequential },
#endif
{ QCD::WilsonKernelsStatic::OptHandUnroll, QCD::WilsonKernelsStatic::CommsThenCompute ,CartesianCommunicator::CommunicatorPolicySequential },
{ QCD::WilsonKernelsStatic::OptHandUnroll, QCD::WilsonKernelsStatic::CommsAndCompute ,CartesianCommunicator::CommunicatorPolicySequential },
{ QCD::WilsonKernelsStatic::OptGeneric , QCD::WilsonKernelsStatic::CommsThenCompute ,CartesianCommunicator::CommunicatorPolicySequential },
{ QCD::WilsonKernelsStatic::OptGeneric , QCD::WilsonKernelsStatic::CommsAndCompute ,CartesianCommunicator::CommunicatorPolicySequential }
};
for(int c=0;c<num_cases;c++) {
QCD::WilsonKernelsStatic::Comms = Cases[c].CommsOverlap;
QCD::WilsonKernelsStatic::Opt = Cases[c].Opt;
CartesianCommunicator::SetCommunicatorPolicy(Cases[c].CommsAsynch);
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsAndCompute ) std::cout << GridLogMessage<< "* Using Overlapped Comms/Compute" <<std::endl;
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsThenCompute) std::cout << GridLogMessage<< "* Using sequential comms compute" <<std::endl;
if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl;
if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
int nwarm = 200;
double t0=usecond();
FGrid->Barrier();
for(int i=0;i<nwarm;i++){
Dw.DhopEO(src_o,r_e,DaggerNo);
}
FGrid->Barrier();
double t1=usecond();
// uint64_t ncall = (uint64_t) 2.5*1000.0*1000.0*nwarm/(t1-t0);
// if (ncall < 500) ncall = 500;
uint64_t ncall = 1000;
FGrid->Broadcast(0,&ncall,sizeof(ncall));
// std::cout << GridLogMessage << " Estimate " << ncall << " calls per second"<<std::endl;
Dw.ZeroCounters();
time_statistics timestat;
std::vector<double> t_time(ncall);
for(uint64_t i=0;i<ncall;i++){
t0=usecond();
Dw.DhopEO(src_o,r_e,DaggerNo);
t1=usecond();
t_time[i] = t1-t0;
}
FGrid->Barrier();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=(1344.0*volume)/2;
double mf_hi, mf_lo, mf_err;
timestat.statistics(t_time);
mf_hi = flops/timestat.min;
mf_lo = flops/timestat.max;
mf_err= flops/timestat.min * timestat.err/timestat.mean;
mflops = flops/timestat.mean;
mflops_all.push_back(mflops);
if ( mflops_best == 0 ) mflops_best = mflops;
if ( mflops_worst== 0 ) mflops_worst= mflops;
if ( mflops>mflops_best ) mflops_best = mflops;
if ( mflops<mflops_worst) mflops_worst= mflops;
std::cout<<GridLogMessage << std::fixed << std::setprecision(1)<<"Deo mflop/s = "<< mflops << " ("<<mf_err<<") " << mf_lo<<"-"<<mf_hi <<std::endl;
std::cout<<GridLogMessage << std::fixed << std::setprecision(1)<<"Deo mflop/s per rank "<< mflops/NP<<std::endl;
std::cout<<GridLogMessage << std::fixed << std::setprecision(1)<<"Deo mflop/s per node "<< mflops/NN<<std::endl;
Dw.Report();
Dw.DhopEO(src_o,r_e,DaggerNo);
Dw.DhopOE(src_e,r_o,DaggerNo);
setCheckerboard(r_eo,r_o);
setCheckerboard(r_eo,r_e);
err = r_eo-ref;
std::cout<<GridLogMessage << "norm diff "<< norm2(err)<<std::endl;
assert((norm2(err)<1.0e-4));
}
robust = mflops_worst/mflops_best;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << L<<"^4 x "<<Ls<< " Deo Best mflop/s = "<< mflops_best << " ; " << mflops_best/NN<<" per node " <<std::endl;
std::cout<<GridLogMessage << L<<"^4 x "<<Ls<< " Deo Worst mflop/s = "<< mflops_worst<< " ; " << mflops_worst/NN<<" per node " <<std::endl;
std::cout<<GridLogMessage << std::fixed<<std::setprecision(3)<< L<<"^4 x "<<Ls<< " Performance Robustness = "<< robust <<std::endl;
std::cout<<GridLogMessage <<fmt << std::endl;
std::cout<<GridLogMessage ;
for(int i=0;i<mflops_all.size();i++){
std::cout<<mflops_all[i]/NN<<" ; " ;
}
std::cout<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
}
return mflops_best;
}
};
int main (int argc, char ** argv)
{
Grid_init(&argc,&argv);
CartesianCommunicator::SetCommunicatorPolicy(CartesianCommunicator::CommunicatorPolicySequential);
#ifdef KNL
LebesgueOrder::Block = std::vector<int>({8,2,2,2});
#else
LebesgueOrder::Block = std::vector<int>({2,2,2,2});
#endif
Benchmark::Decomposition();
int do_memory=1;
int do_comms =1;
int do_su3 =0;
int do_wilson=1;
int do_dwf =1;
if ( do_su3 ) {
// empty for now
}
#if 1
int sel=2;
std::vector<int> L_list({8,12,16,24});
#else
int sel=1;
std::vector<int> L_list({8,12});
#endif
int selm1=sel-1;
std::vector<double> robust_list;
std::vector<double> wilson;
std::vector<double> dwf4;
std::vector<double> dwf5;
if ( do_wilson ) {
int Ls=1;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " Wilson dslash 4D vectorised" <<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
for(int l=0;l<L_list.size();l++){
double robust;
wilson.push_back(Benchmark::DWF(1,L_list[l],robust));
}
}
int Ls=16;
if ( do_dwf ) {
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " Domain wall dslash 4D vectorised" <<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
for(int l=0;l<L_list.size();l++){
double robust;
double result = Benchmark::DWF(Ls,L_list[l],robust) ;
dwf4.push_back(result);
robust_list.push_back(robust);
}
}
if ( do_dwf ) {
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " Domain wall dslash 4D vectorised" <<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
for(int l=0;l<L_list.size();l++){
dwf5.push_back(Benchmark::DWF5(Ls,L_list[l]));
}
}
if ( do_dwf ) {
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " Summary table Ls="<<Ls <<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << "L \t\t Wilson \t DWF4 \t DWF5 " <<std::endl;
for(int l=0;l<L_list.size();l++){
std::cout<<GridLogMessage << L_list[l] <<" \t\t "<< wilson[l]<<" \t "<<dwf4[l]<<" \t "<<dwf5[l] <<std::endl;
}
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
}
int NN=NN_global;
if ( do_memory ) {
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " Memory benchmark " <<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
Benchmark::Memory();
}
if ( do_comms && (NN>1) ) {
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " Communications benchmark " <<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
Benchmark::Comms();
}
if ( do_dwf ) {
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " Per Node Summary table Ls="<<Ls <<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L \t\t Wilson\t\t DWF4 \t\t DWF5 " <<std::endl;
for(int l=0;l<L_list.size();l++){
std::cout<<GridLogMessage << L_list[l] <<" \t\t "<< wilson[l]/NN<<" \t "<<dwf4[l]/NN<<" \t "<<dwf5[l] /NN<<std::endl;
}
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
std::cout<<GridLogMessage << " Comparison point result: " << 0.5*(dwf4[sel]+dwf4[selm1])/NN << " Mflop/s per node"<<std::endl;
std::cout<<GridLogMessage << " Comparison point is 0.5*("<<dwf4[sel]/NN<<"+"<<dwf4[selm1]/NN << ") "<<std::endl;
std::cout<<std::setprecision(3);
std::cout<<GridLogMessage << " Comparison point robustness: " << robust_list[sel] <<std::endl;
std::cout<<GridLogMessage << "=================================================================================="<<std::endl;
}
Grid_finalize();
}

View File

@@ -31,6 +31,32 @@ using namespace std;
using namespace Grid; using namespace Grid;
using namespace Grid::QCD; using namespace Grid::QCD;
struct time_statistics{
double mean;
double err;
double min;
double max;
void statistics(std::vector<double> v){
double sum = std::accumulate(v.begin(), v.end(), 0.0);
mean = sum / v.size();
std::vector<double> diff(v.size());
std::transform(v.begin(), v.end(), diff.begin(), [=](double x) { return x - mean; });
double sq_sum = std::inner_product(diff.begin(), diff.end(), diff.begin(), 0.0);
err = std::sqrt(sq_sum / (v.size()*(v.size() - 1)));
auto result = std::minmax_element(v.begin(), v.end());
min = *result.first;
max = *result.second;
}
};
void header(){
std::cout <<GridLogMessage << " L "<<"\t"<<" Ls "<<"\t"
<<std::setw(11)<<"bytes"<<"MB/s uni (err/min/max)"<<"\t\t"<<"MB/s bidi (err/min/max)"<<std::endl;
};
int main (int argc, char ** argv) int main (int argc, char ** argv)
{ {
Grid_init(&argc,&argv); Grid_init(&argc,&argv);
@@ -40,17 +66,21 @@ int main (int argc, char ** argv)
int threads = GridThread::GetThreads(); int threads = GridThread::GetThreads();
std::cout<<GridLogMessage << "Grid is setup to use "<<threads<<" threads"<<std::endl; std::cout<<GridLogMessage << "Grid is setup to use "<<threads<<" threads"<<std::endl;
int Nloop=10; int Nloop=100;
int nmu=0; int nmu=0;
int maxlat=32;
for(int mu=0;mu<Nd;mu++) if (mpi_layout[mu]>1) nmu++; for(int mu=0;mu<Nd;mu++) if (mpi_layout[mu]>1) nmu++;
std::cout << GridLogMessage << "Number of iterations to average: "<< Nloop << std::endl;
std::vector<double> t_time(Nloop);
time_statistics timestat;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Benchmarking concurrent halo exchange in "<<nmu<<" dimensions"<<std::endl; std::cout<<GridLogMessage << "= Benchmarking concurrent halo exchange in "<<nmu<<" dimensions"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L "<<"\t\t"<<" Ls "<<"\t\t"<<"bytes"<<"\t\t"<<"MB/s uni"<<"\t\t"<<"MB/s bidi"<<std::endl; header();
int maxlat=16; for(int lat=4;lat<=maxlat;lat+=4){
for(int lat=4;lat<=maxlat;lat+=2){ for(int Ls=8;Ls<=8;Ls*=2){
for(int Ls=1;Ls<=16;Ls*=2){
std::vector<int> latt_size ({lat*mpi_layout[0], std::vector<int> latt_size ({lat*mpi_layout[0],
lat*mpi_layout[1], lat*mpi_layout[1],
@@ -58,17 +88,25 @@ int main (int argc, char ** argv)
lat*mpi_layout[3]}); lat*mpi_layout[3]});
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
RealD Nrank = Grid._Nprocessors;
RealD Nnode = Grid.NodeCount();
RealD ppn = Nrank/Nnode;
std::vector<std::vector<HalfSpinColourVectorD> > xbuf(8,std::vector<HalfSpinColourVectorD>(lat*lat*lat*Ls)); std::vector<Vector<HalfSpinColourVectorD> > xbuf(8);
std::vector<std::vector<HalfSpinColourVectorD> > rbuf(8,std::vector<HalfSpinColourVectorD>(lat*lat*lat*Ls)); std::vector<Vector<HalfSpinColourVectorD> > rbuf(8);
int ncomm; int ncomm;
int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD); int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD);
for(int mu=0;mu<8;mu++){
xbuf[mu].resize(lat*lat*lat*Ls);
rbuf[mu].resize(lat*lat*lat*Ls);
// std::cout << " buffers " << std::hex << (uint64_t)&xbuf[mu][0] <<" " << (uint64_t)&rbuf[mu][0] <<std::endl;
}
double start=usecond();
for(int i=0;i<Nloop;i++){ for(int i=0;i<Nloop;i++){
double start=usecond();
std::vector<CartesianCommunicator::CommsRequest_t> requests; std::vector<CommsRequest_t> requests;
ncomm=0; ncomm=0;
for(int mu=0;mu<4;mu++){ for(int mu=0;mu<4;mu++){
@@ -79,7 +117,6 @@ int main (int argc, char ** argv)
int comm_proc=1; int comm_proc=1;
int xmit_to_rank; int xmit_to_rank;
int recv_from_rank; int recv_from_rank;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank); Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
Grid.SendToRecvFromBegin(requests, Grid.SendToRecvFromBegin(requests,
(void *)&xbuf[mu][0], (void *)&xbuf[mu][0],
@@ -102,18 +139,24 @@ int main (int argc, char ** argv)
} }
Grid.SendToRecvFromComplete(requests); Grid.SendToRecvFromComplete(requests);
Grid.Barrier(); Grid.Barrier();
double stop=usecond();
t_time[i] = stop-start; // microseconds
} }
double stop=usecond();
double dbytes = bytes; timestat.statistics(t_time);
double xbytes = Nloop*dbytes*2.0*ncomm;
double dbytes = bytes*ppn;
double xbytes = dbytes*2.0*ncomm;
double rbytes = xbytes; double rbytes = xbytes;
double bidibytes = xbytes+rbytes; double bidibytes = xbytes+rbytes;
double time = stop-start; // microseconds std::cout<<GridLogMessage << std::setw(4) << lat<<"\t"<<Ls<<"\t"
<<std::setw(11) << bytes<< std::fixed << std::setprecision(1) << std::setw(7)
<<std::right<< xbytes/timestat.mean<<" "<< xbytes*timestat.err/(timestat.mean*timestat.mean)<< " "
<<xbytes/timestat.max <<" "<< xbytes/timestat.min
<< "\t\t"<<std::setw(7)<< bidibytes/timestat.mean<< " " << bidibytes*timestat.err/(timestat.mean*timestat.mean) << " "
<< bidibytes/timestat.max << " " << bidibytes/timestat.min << std::endl;
std::cout<<GridLogMessage << lat<<"\t\t"<<Ls<<"\t\t"<<bytes<<"\t\t"<<xbytes/time<<"\t\t"<<bidibytes/time<<std::endl;
} }
} }
@@ -121,25 +164,36 @@ int main (int argc, char ** argv)
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Benchmarking sequential halo exchange in "<<nmu<<" dimensions"<<std::endl; std::cout<<GridLogMessage << "= Benchmarking sequential halo exchange in "<<nmu<<" dimensions"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L "<<"\t\t"<<" Ls "<<"\t\t"<<"bytes"<<"\t\t"<<"MB/s uni"<<"\t\t"<<"MB/s bidi"<<std::endl; header();
for(int lat=4;lat<=maxlat;lat+=4){
for(int Ls=8;Ls<=8;Ls*=2){
for(int lat=4;lat<=maxlat;lat+=2){ std::vector<int> latt_size ({lat*mpi_layout[0],
for(int Ls=1;Ls<=16;Ls*=2){ lat*mpi_layout[1],
lat*mpi_layout[2],
lat*mpi_layout[3]});
std::vector<int> latt_size ({lat,lat,lat,lat});
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
RealD Nrank = Grid._Nprocessors;
RealD Nnode = Grid.NodeCount();
RealD ppn = Nrank/Nnode;
std::vector<std::vector<HalfSpinColourVectorD> > xbuf(8,std::vector<HalfSpinColourVectorD>(lat*lat*lat*Ls)); std::vector<Vector<HalfSpinColourVectorD> > xbuf(8);
std::vector<std::vector<HalfSpinColourVectorD> > rbuf(8,std::vector<HalfSpinColourVectorD>(lat*lat*lat*Ls)); std::vector<Vector<HalfSpinColourVectorD> > rbuf(8);
for(int mu=0;mu<8;mu++){
xbuf[mu].resize(lat*lat*lat*Ls);
rbuf[mu].resize(lat*lat*lat*Ls);
// std::cout << " buffers " << std::hex << (uint64_t)&xbuf[mu][0] <<" " << (uint64_t)&rbuf[mu][0] <<std::endl;
}
int ncomm; int ncomm;
int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD); int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD);
double start=usecond();
for(int i=0;i<Nloop;i++){ for(int i=0;i<Nloop;i++){
double start=usecond();
ncomm=0; ncomm=0;
for(int mu=0;mu<4;mu++){ for(int mu=0;mu<4;mu++){
@@ -152,7 +206,7 @@ int main (int argc, char ** argv)
int recv_from_rank; int recv_from_rank;
{ {
std::vector<CartesianCommunicator::CommsRequest_t> requests; std::vector<CommsRequest_t> requests;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank); Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
Grid.SendToRecvFromBegin(requests, Grid.SendToRecvFromBegin(requests,
(void *)&xbuf[mu][0], (void *)&xbuf[mu][0],
@@ -165,7 +219,7 @@ int main (int argc, char ** argv)
comm_proc = mpi_layout[mu]-1; comm_proc = mpi_layout[mu]-1;
{ {
std::vector<CartesianCommunicator::CommsRequest_t> requests; std::vector<CommsRequest_t> requests;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank); Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
Grid.SendToRecvFromBegin(requests, Grid.SendToRecvFromBegin(requests,
(void *)&xbuf[mu+4][0], (void *)&xbuf[mu+4][0],
@@ -178,30 +232,37 @@ int main (int argc, char ** argv)
} }
} }
Grid.Barrier(); Grid.Barrier();
double stop=usecond();
t_time[i] = stop-start; // microseconds
} }
double stop=usecond(); timestat.statistics(t_time);
double dbytes = bytes; double dbytes = bytes*ppn;
double xbytes = Nloop*dbytes*2.0*ncomm; double xbytes = dbytes*2.0*ncomm;
double rbytes = xbytes; double rbytes = xbytes;
double bidibytes = xbytes+rbytes; double bidibytes = xbytes+rbytes;
double time = stop-start; std::cout<<GridLogMessage << std::setw(4) << lat<<"\t"<<Ls<<"\t"
<<std::setw(11) << bytes<< std::fixed << std::setprecision(1) << std::setw(7)
<<std::right<< xbytes/timestat.mean<<" "<< xbytes*timestat.err/(timestat.mean*timestat.mean)<< " "
<<xbytes/timestat.max <<" "<< xbytes/timestat.min
<< "\t\t"<<std::setw(7)<< bidibytes/timestat.mean<< " " << bidibytes*timestat.err/(timestat.mean*timestat.mean) << " "
<< bidibytes/timestat.max << " " << bidibytes/timestat.min << std::endl;
std::cout<<GridLogMessage << lat<<"\t\t"<<Ls<<"\t\t"<<bytes<<"\t\t"<<xbytes/time<<"\t\t"<<bidibytes/time<<std::endl;
} }
} }
Nloop=100;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Benchmarking concurrent STENCIL halo exchange in "<<nmu<<" dimensions"<<std::endl; std::cout<<GridLogMessage << "= Benchmarking concurrent STENCIL halo exchange in "<<nmu<<" dimensions"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L "<<"\t\t"<<" Ls "<<"\t\t"<<"bytes"<<"\t\t"<<"MB/s uni"<<"\t\t"<<"MB/s bidi"<<std::endl; header();
for(int lat=4;lat<=maxlat;lat+=2){ for(int lat=4;lat<=maxlat;lat+=4){
for(int Ls=1;Ls<=16;Ls*=2){ for(int Ls=8;Ls<=8;Ls*=2){
std::vector<int> latt_size ({lat*mpi_layout[0], std::vector<int> latt_size ({lat*mpi_layout[0],
lat*mpi_layout[1], lat*mpi_layout[1],
@@ -209,6 +270,9 @@ int main (int argc, char ** argv)
lat*mpi_layout[3]}); lat*mpi_layout[3]});
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
RealD Nrank = Grid._Nprocessors;
RealD Nnode = Grid.NodeCount();
RealD ppn = Nrank/Nnode;
std::vector<HalfSpinColourVectorD *> xbuf(8); std::vector<HalfSpinColourVectorD *> xbuf(8);
std::vector<HalfSpinColourVectorD *> rbuf(8); std::vector<HalfSpinColourVectorD *> rbuf(8);
@@ -216,73 +280,86 @@ int main (int argc, char ** argv)
for(int d=0;d<8;d++){ for(int d=0;d<8;d++){
xbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD)); xbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
rbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD)); rbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
bzero((void *)xbuf[d],lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
bzero((void *)rbuf[d],lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
} }
int ncomm; int ncomm;
int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD); int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD);
double start=usecond(); double dbytes;
for(int i=0;i<Nloop;i++){ for(int i=0;i<Nloop;i++){
double start=usecond();
std::vector<CartesianCommunicator::CommsRequest_t> requests; dbytes=0;
ncomm=0; ncomm=0;
std::vector<CommsRequest_t> requests;
for(int mu=0;mu<4;mu++){ for(int mu=0;mu<4;mu++){
if (mpi_layout[mu]>1 ) { if (mpi_layout[mu]>1 ) {
ncomm++; ncomm++;
int comm_proc=1; int comm_proc=1;
int xmit_to_rank; int xmit_to_rank;
int recv_from_rank; int recv_from_rank;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank); Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
Grid.StencilSendToRecvFromBegin(requests, dbytes+=
(void *)&xbuf[mu][0], Grid.StencilSendToRecvFromBegin(requests,
xmit_to_rank, (void *)&xbuf[mu][0],
(void *)&rbuf[mu][0], xmit_to_rank,
recv_from_rank, (void *)&rbuf[mu][0],
bytes); recv_from_rank,
bytes,mu);
comm_proc = mpi_layout[mu]-1; comm_proc = mpi_layout[mu]-1;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank); Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
Grid.StencilSendToRecvFromBegin(requests, dbytes+=
(void *)&xbuf[mu+4][0], Grid.StencilSendToRecvFromBegin(requests,
xmit_to_rank, (void *)&xbuf[mu+4][0],
(void *)&rbuf[mu+4][0], xmit_to_rank,
recv_from_rank, (void *)&rbuf[mu+4][0],
bytes); recv_from_rank,
bytes,mu+4);
} }
} }
Grid.StencilSendToRecvFromComplete(requests); Grid.StencilSendToRecvFromComplete(requests,0);
Grid.Barrier(); Grid.Barrier();
double stop=usecond();
t_time[i] = stop-start; // microseconds
} }
double stop=usecond();
double dbytes = bytes; timestat.statistics(t_time);
double xbytes = Nloop*dbytes*2.0*ncomm;
double rbytes = xbytes; dbytes=dbytes*ppn;
double bidibytes = xbytes+rbytes; double xbytes = dbytes*0.5;
double rbytes = dbytes*0.5;
double bidibytes = dbytes;
std::cout<<GridLogMessage << std::setw(4) << lat<<"\t"<<Ls<<"\t"
<<std::setw(11) << bytes<< std::fixed << std::setprecision(1) << std::setw(7)
<<std::right<< xbytes/timestat.mean<<" "<< xbytes*timestat.err/(timestat.mean*timestat.mean)<< " "
<<xbytes/timestat.max <<" "<< xbytes/timestat.min
<< "\t\t"<<std::setw(7)<< bidibytes/timestat.mean<< " " << bidibytes*timestat.err/(timestat.mean*timestat.mean) << " "
<< bidibytes/timestat.max << " " << bidibytes/timestat.min << std::endl;
double time = stop-start; // microseconds
std::cout<<GridLogMessage << lat<<"\t\t"<<Ls<<"\t\t"<<bytes<<"\t\t"<<xbytes/time<<"\t\t"<<bidibytes/time<<std::endl;
} }
} }
Nloop=100;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Benchmarking sequential STENCIL halo exchange in "<<nmu<<" dimensions"<<std::endl; std::cout<<GridLogMessage << "= Benchmarking sequential STENCIL halo exchange in "<<nmu<<" dimensions"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L "<<"\t\t"<<" Ls "<<"\t\t"<<"bytes"<<"\t\t"<<"MB/s uni"<<"\t\t"<<"MB/s bidi"<<std::endl; header();
for(int lat=4;lat<=maxlat;lat+=2){ for(int lat=4;lat<=maxlat;lat+=4){
for(int Ls=1;Ls<=16;Ls*=2){ for(int Ls=8;Ls<=8;Ls*=2){
std::vector<int> latt_size ({lat*mpi_layout[0], std::vector<int> latt_size ({lat*mpi_layout[0],
lat*mpi_layout[1], lat*mpi_layout[1],
@@ -290,6 +367,9 @@ int main (int argc, char ** argv)
lat*mpi_layout[3]}); lat*mpi_layout[3]});
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
RealD Nrank = Grid._Nprocessors;
RealD Nnode = Grid.NodeCount();
RealD ppn = Nrank/Nnode;
std::vector<HalfSpinColourVectorD *> xbuf(8); std::vector<HalfSpinColourVectorD *> xbuf(8);
std::vector<HalfSpinColourVectorD *> rbuf(8); std::vector<HalfSpinColourVectorD *> rbuf(8);
@@ -297,16 +377,18 @@ int main (int argc, char ** argv)
for(int d=0;d<8;d++){ for(int d=0;d<8;d++){
xbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD)); xbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
rbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD)); rbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
bzero((void *)xbuf[d],lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
bzero((void *)rbuf[d],lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
} }
int ncomm; int ncomm;
int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD); int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD);
double dbytes;
double start=usecond();
for(int i=0;i<Nloop;i++){ for(int i=0;i<Nloop;i++){
double start=usecond();
std::vector<CartesianCommunicator::CommsRequest_t> requests; std::vector<CommsRequest_t> requests;
dbytes=0;
ncomm=0; ncomm=0;
for(int mu=0;mu<4;mu++){ for(int mu=0;mu<4;mu++){
@@ -318,44 +400,147 @@ int main (int argc, char ** argv)
int recv_from_rank; int recv_from_rank;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank); Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
Grid.StencilSendToRecvFromBegin(requests, dbytes+=
(void *)&xbuf[mu][0], Grid.StencilSendToRecvFromBegin(requests,
xmit_to_rank, (void *)&xbuf[mu][0],
(void *)&rbuf[mu][0], xmit_to_rank,
recv_from_rank, (void *)&rbuf[mu][0],
bytes); recv_from_rank,
// Grid.StencilSendToRecvFromComplete(requests); bytes,mu);
// requests.resize(0); Grid.StencilSendToRecvFromComplete(requests,mu);
requests.resize(0);
comm_proc = mpi_layout[mu]-1; comm_proc = mpi_layout[mu]-1;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank); Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
Grid.StencilSendToRecvFromBegin(requests, dbytes+=
(void *)&xbuf[mu+4][0], Grid.StencilSendToRecvFromBegin(requests,
xmit_to_rank, (void *)&xbuf[mu+4][0],
(void *)&rbuf[mu+4][0], xmit_to_rank,
recv_from_rank, (void *)&rbuf[mu+4][0],
bytes); recv_from_rank,
Grid.StencilSendToRecvFromComplete(requests); bytes,mu+4);
Grid.StencilSendToRecvFromComplete(requests,mu+4);
requests.resize(0); requests.resize(0);
} }
} }
Grid.Barrier(); Grid.Barrier();
double stop=usecond();
t_time[i] = stop-start; // microseconds
} }
double stop=usecond();
double dbytes = bytes; timestat.statistics(t_time);
double xbytes = Nloop*dbytes*2.0*ncomm;
double rbytes = xbytes;
double bidibytes = xbytes+rbytes;
double time = stop-start; // microseconds dbytes=dbytes*ppn;
double xbytes = dbytes*0.5;
double rbytes = dbytes*0.5;
double bidibytes = dbytes;
std::cout<<GridLogMessage << std::setw(4) << lat<<"\t"<<Ls<<"\t"
<<std::setw(11) << bytes<< std::fixed << std::setprecision(1) << std::setw(7)
<<std::right<< xbytes/timestat.mean<<" "<< xbytes*timestat.err/(timestat.mean*timestat.mean)<< " "
<<xbytes/timestat.max <<" "<< xbytes/timestat.min
<< "\t\t"<<std::setw(7)<< bidibytes/timestat.mean<< " " << bidibytes*timestat.err/(timestat.mean*timestat.mean) << " "
<< bidibytes/timestat.max << " " << bidibytes/timestat.min << std::endl;
std::cout<<GridLogMessage << lat<<"\t\t"<<Ls<<"\t\t"<<bytes<<"\t\t"<<xbytes/time<<"\t\t"<<bidibytes/time<<std::endl;
} }
} }
#ifdef GRID_OMP
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Benchmarking threaded STENCIL halo exchange in "<<nmu<<" dimensions"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
header();
for(int lat=4;lat<=maxlat;lat+=4){
for(int Ls=8;Ls<=8;Ls*=2){
std::vector<int> latt_size ({lat*mpi_layout[0],
lat*mpi_layout[1],
lat*mpi_layout[2],
lat*mpi_layout[3]});
GridCartesian Grid(latt_size,simd_layout,mpi_layout);
RealD Nrank = Grid._Nprocessors;
RealD Nnode = Grid.NodeCount();
RealD ppn = Nrank/Nnode;
std::vector<HalfSpinColourVectorD *> xbuf(8);
std::vector<HalfSpinColourVectorD *> rbuf(8);
Grid.ShmBufferFreeAll();
for(int d=0;d<8;d++){
xbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
rbuf[d] = (HalfSpinColourVectorD *)Grid.ShmBufferMalloc(lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
bzero((void *)xbuf[d],lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
bzero((void *)rbuf[d],lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD));
}
int ncomm;
int bytes=lat*lat*lat*Ls*sizeof(HalfSpinColourVectorD);
double dbytes;
for(int i=0;i<Nloop;i++){
double start=usecond();
std::vector<CommsRequest_t> requests;
dbytes=0;
ncomm=0;
#pragma omp parallel for num_threads(Grid::CartesianCommunicator::nCommThreads)
for(int dir=0;dir<8;dir++){
double tbytes;
int mu =dir % 4;
if (mpi_layout[mu]>1 ) {
ncomm++;
int xmit_to_rank;
int recv_from_rank;
if ( dir == mu ) {
int comm_proc=1;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
} else {
int comm_proc = mpi_layout[mu]-1;
Grid.ShiftedRanks(mu,comm_proc,xmit_to_rank,recv_from_rank);
}
int tid = omp_get_thread_num();
tbytes= Grid.StencilSendToRecvFrom((void *)&xbuf[dir][0], xmit_to_rank,
(void *)&rbuf[dir][0], recv_from_rank, bytes,tid);
#pragma omp atomic
dbytes+=tbytes;
}
}
Grid.Barrier();
double stop=usecond();
t_time[i] = stop-start; // microseconds
}
timestat.statistics(t_time);
dbytes=dbytes*ppn;
double xbytes = dbytes*0.5;
double rbytes = dbytes*0.5;
double bidibytes = dbytes;
std::cout<<GridLogMessage << std::setw(4) << lat<<"\t"<<Ls<<"\t"
<<std::setw(11) << bytes<< std::fixed << std::setprecision(1) << std::setw(7)
<<std::right<< xbytes/timestat.mean<<" "<< xbytes*timestat.err/(timestat.mean*timestat.mean)<< " "
<<xbytes/timestat.max <<" "<< xbytes/timestat.min
<< "\t\t"<<std::setw(7)<< bidibytes/timestat.mean<< " " << bidibytes*timestat.err/(timestat.mean*timestat.mean) << " "
<< bidibytes/timestat.max << " " << bidibytes/timestat.min << std::endl;
}
}
#endif
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= All done; Bye Bye"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
Grid_finalize(); Grid_finalize();
} }

View File

@@ -1,28 +1,22 @@
/************************************************************************************* /*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid Grid physics library, www.github.com/paboyle/Grid
Source file: ./benchmarks/Benchmark_dwf.cc Source file: ./benchmarks/Benchmark_dwf.cc
Copyright (C) 2015 Copyright (C) 2015
Author: Peter Boyle <paboyle@ph.ed.ac.uk> Author: Peter Boyle <paboyle@ph.ed.ac.uk>
Author: paboyle <paboyle@ph.ed.ac.uk> Author: paboyle <paboyle@ph.ed.ac.uk>
This program is free software; you can redistribute it and/or modify This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or the Free Software Foundation; either version 2 of the License, or
(at your option) any later version. (at your option) any later version.
This program is distributed in the hope that it will be useful, This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details. GNU General Public License for more details.
You should have received a copy of the GNU General Public License along You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc., with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/ *************************************************************************************/
/* END LEGAL */ /* END LEGAL */
@@ -37,27 +31,36 @@ struct scal {
d internal; d internal;
}; };
Gamma::GammaMatrix Gmu [] = { Gamma::Algebra Gmu [] = {
Gamma::GammaX, Gamma::Algebra::GammaX,
Gamma::GammaY, Gamma::Algebra::GammaY,
Gamma::GammaZ, Gamma::Algebra::GammaZ,
Gamma::GammaT Gamma::Algebra::GammaT
}; };
typedef WilsonFermion5D<DomainWallVec5dImplR> WilsonFermion5DR; typedef WilsonFermion5D<DomainWallVec5dImplR> WilsonFermion5DR;
typedef WilsonFermion5D<DomainWallVec5dImplF> WilsonFermion5DF; typedef WilsonFermion5D<DomainWallVec5dImplF> WilsonFermion5DF;
typedef WilsonFermion5D<DomainWallVec5dImplD> WilsonFermion5DD; typedef WilsonFermion5D<DomainWallVec5dImplD> WilsonFermion5DD;
int main (int argc, char ** argv) int main (int argc, char ** argv)
{ {
Grid_init(&argc,&argv); Grid_init(&argc,&argv);
int threads = GridThread::GetThreads(); int threads = GridThread::GetThreads();
std::cout<<GridLogMessage << "Grid is setup to use "<<threads<<" threads"<<std::endl;
std::vector<int> latt4 = GridDefaultLatt(); std::vector<int> latt4 = GridDefaultLatt();
const int Ls=8; int Ls=16;
for(int i=0;i<argc;i++)
if(std::string(argv[i]) == "-Ls"){
std::stringstream ss(argv[i+1]); ss >> Ls;
}
GridLogLayout();
long unsigned int single_site_flops = 8*QCD::Nc*(7+16*QCD::Nc);
GridCartesian * UGrid = SpaceTimeGrid::makeFourDimGrid(GridDefaultLatt(), GridDefaultSimd(Nd,vComplex::Nsimd()),GridDefaultMpi()); GridCartesian * UGrid = SpaceTimeGrid::makeFourDimGrid(GridDefaultLatt(), GridDefaultSimd(Nd,vComplex::Nsimd()),GridDefaultMpi());
GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid); GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid);
GridCartesian * FGrid = SpaceTimeGrid::makeFiveDimGrid(Ls,UGrid); GridCartesian * FGrid = SpaceTimeGrid::makeFiveDimGrid(Ls,UGrid);
@@ -72,34 +75,65 @@ int main (int argc, char ** argv)
std::vector<int> seeds4({1,2,3,4}); std::vector<int> seeds4({1,2,3,4});
std::vector<int> seeds5({5,6,7,8}); std::vector<int> seeds5({5,6,7,8});
std::cout << GridLogMessage << "Initialising 4d RNG" << std::endl;
GridParallelRNG RNG4(UGrid); RNG4.SeedFixedIntegers(seeds4); GridParallelRNG RNG4(UGrid); RNG4.SeedFixedIntegers(seeds4);
std::cout << GridLogMessage << "Initialising 5d RNG" << std::endl;
GridParallelRNG RNG5(FGrid); RNG5.SeedFixedIntegers(seeds5); GridParallelRNG RNG5(FGrid); RNG5.SeedFixedIntegers(seeds5);
std::cout << GridLogMessage << "Initialised RNGs" << std::endl;
LatticeFermion src (FGrid); random(RNG5,src); LatticeFermion src (FGrid); random(RNG5,src);
#if 0
src = zero;
{
std::vector<int> origin({0,0,0,latt4[2]-1,0});
SpinColourVectorF tmp;
tmp=zero;
tmp()(0)(0)=Complex(-2.0,0.0);
std::cout << " source site 0 " << tmp<<std::endl;
pokeSite(tmp,src,origin);
}
#else
RealD N2 = 1.0/::sqrt(norm2(src));
src = src*N2;
#endif
LatticeFermion result(FGrid); result=zero; LatticeFermion result(FGrid); result=zero;
LatticeFermion ref(FGrid); ref=zero; LatticeFermion ref(FGrid); ref=zero;
LatticeFermion tmp(FGrid); LatticeFermion tmp(FGrid);
LatticeFermion err(FGrid); LatticeFermion err(FGrid);
std::cout << GridLogMessage << "Drawing gauge field" << std::endl;
LatticeGaugeField Umu(UGrid); LatticeGaugeField Umu(UGrid);
random(RNG4,Umu); SU3::HotConfiguration(RNG4,Umu);
std::cout << GridLogMessage << "Random gauge initialised " << std::endl;
LatticeGaugeField Umu5d(FGrid); #if 0
Umu=1.0;
for(int mu=0;mu<Nd;mu++){
LatticeColourMatrix ttmp(UGrid);
ttmp = PeekIndex<LorentzIndex>(Umu,mu);
// if (mu !=2 ) ttmp = 0;
// ttmp = ttmp* pow(10.0,mu);
PokeIndex<LorentzIndex>(Umu,ttmp,mu);
}
std::cout << GridLogMessage << "Forced to diagonal " << std::endl;
#endif
////////////////////////////////////
// Naive wilson implementation
////////////////////////////////////
// replicate across fifth dimension // replicate across fifth dimension
LatticeGaugeField Umu5d(FGrid);
std::vector<LatticeColourMatrix> U(4,FGrid);
for(int ss=0;ss<Umu._grid->oSites();ss++){ for(int ss=0;ss<Umu._grid->oSites();ss++){
for(int s=0;s<Ls;s++){ for(int s=0;s<Ls;s++){
Umu5d._odata[Ls*ss+s] = Umu._odata[ss]; Umu5d._odata[Ls*ss+s] = Umu._odata[ss];
} }
} }
////////////////////////////////////
// Naive wilson implementation
////////////////////////////////////
std::vector<LatticeColourMatrix> U(4,FGrid);
for(int mu=0;mu<Nd;mu++){ for(int mu=0;mu<Nd;mu++){
U[mu] = PeekIndex<LorentzIndex>(Umu5d,mu); U[mu] = PeekIndex<LorentzIndex>(Umu5d,mu);
} }
std::cout << GridLogMessage << "Setting up Cshift based reference " << std::endl;
if (1) if (1)
{ {
@@ -120,8 +154,7 @@ int main (int argc, char ** argv)
RealD M5 =1.8; RealD M5 =1.8;
RealD NP = UGrid->_Nprocessors; RealD NP = UGrid->_Nprocessors;
RealD NN = UGrid->NodeCount();
DomainWallFermionR Dw(Umu,*FGrid,*FrbGrid,*UGrid,*UrbGrid,mass,M5);
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl; std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Kernel options --dslash-generic, --dslash-unroll, --dslash-asm" <<std::endl; std::cout << GridLogMessage<< "* Kernel options --dslash-generic, --dslash-unroll, --dslash-asm" <<std::endl;
@@ -131,15 +164,22 @@ int main (int argc, char ** argv)
std::cout << GridLogMessage<< "* Vectorising space-time by "<<vComplex::Nsimd()<<std::endl; std::cout << GridLogMessage<< "* Vectorising space-time by "<<vComplex::Nsimd()<<std::endl;
if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl; if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl;
if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl; if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl;
#ifdef GRID_OMP
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsAndCompute ) std::cout << GridLogMessage<< "* Using Overlapped Comms/Compute" <<std::endl;
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsThenCompute) std::cout << GridLogMessage<< "* Using sequential comms compute" <<std::endl;
#endif
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl;
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl; std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
int ncall =100; DomainWallFermionR Dw(Umu,*FGrid,*FrbGrid,*UGrid,*UrbGrid,mass,M5);
int ncall =500;
if (1) { if (1) {
FGrid->Barrier(); FGrid->Barrier();
Dw.ZeroCounters(); Dw.ZeroCounters();
Dw.Dhop(src,result,0);
std::cout<<GridLogMessage<<"Called warmup"<<std::endl;
double t0=usecond(); double t0=usecond();
for(int i=0;i<ncall;i++){ for(int i=0;i<ncall;i++){
__SSC_START; __SSC_START;
@@ -150,19 +190,58 @@ int main (int argc, char ** argv)
FGrid->Barrier(); FGrid->Barrier();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu]; double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=1344*volume*ncall; double flops=single_site_flops*volume*ncall;
std::cout<<GridLogMessage << "Called Dw "<<ncall<<" times in "<<t1-t0<<" us"<<std::endl; std::cout<<GridLogMessage << "Called Dw "<<ncall<<" times in "<<t1-t0<<" us"<<std::endl;
std::cout<<GridLogMessage << "norm result "<< norm2(result)<<std::endl; // std::cout<<GridLogMessage << "norm result "<< norm2(result)<<std::endl;
std::cout<<GridLogMessage << "norm ref "<< norm2(ref)<<std::endl; // std::cout<<GridLogMessage << "norm ref "<< norm2(ref)<<std::endl;
std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl; std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl;
std::cout<<GridLogMessage << "mflop/s per rank = "<< flops/(t1-t0)/NP<<std::endl; std::cout<<GridLogMessage << "mflop/s per rank = "<< flops/(t1-t0)/NP<<std::endl;
std::cout<<GridLogMessage << "mflop/s per node = "<< flops/(t1-t0)/NN<<std::endl;
err = ref-result; err = ref-result;
std::cout<<GridLogMessage << "norm diff "<< norm2(err)<<std::endl; std::cout<<GridLogMessage << "norm diff "<< norm2(err)<<std::endl;
/*
if(( norm2(err)>1.0e-4) ) {
std::cout << "RESULT\n " << result<<std::endl;
std::cout << "REF \n " << ref <<std::endl;
std::cout << "ERR \n " << err <<std::endl;
FGrid->Barrier();
exit(-1);
}
*/
assert (norm2(err)< 1.0e-4 ); assert (norm2(err)< 1.0e-4 );
Dw.Report(); Dw.Report();
} }
DomainWallFermionRL DwH(Umu,*FGrid,*FrbGrid,*UGrid,*UrbGrid,mass,M5);
if (1) {
FGrid->Barrier();
DwH.ZeroCounters();
DwH.Dhop(src,result,0);
double t0=usecond();
for(int i=0;i<ncall;i++){
__SSC_START;
DwH.Dhop(src,result,0);
__SSC_STOP;
}
double t1=usecond();
FGrid->Barrier();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=single_site_flops*volume*ncall;
std::cout<<GridLogMessage << "Called half prec comms Dw "<<ncall<<" times in "<<t1-t0<<" us"<<std::endl;
std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl;
std::cout<<GridLogMessage << "mflop/s per rank = "<< flops/(t1-t0)/NP<<std::endl;
std::cout<<GridLogMessage << "mflop/s per node = "<< flops/(t1-t0)/NN<<std::endl;
err = ref-result;
std::cout<<GridLogMessage << "norm diff "<< norm2(err)<<std::endl;
assert (norm2(err)< 1.0e-3 );
DwH.Report();
}
if (1) if (1)
{ {
@@ -171,6 +250,10 @@ int main (int argc, char ** argv)
std::cout << GridLogMessage<< "* Vectorising fifth dimension by "<<vComplex::Nsimd()<<std::endl; std::cout << GridLogMessage<< "* Vectorising fifth dimension by "<<vComplex::Nsimd()<<std::endl;
if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl; if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl;
if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl; if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl;
#ifdef GRID_OMP
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsAndCompute ) std::cout << GridLogMessage<< "* Using Overlapped Comms/Compute" <<std::endl;
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsThenCompute) std::cout << GridLogMessage<< "* Using sequential comms compute" <<std::endl;
#endif
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl;
@@ -183,20 +266,12 @@ int main (int argc, char ** argv)
WilsonFermion5DR sDw(Umu,*sFGrid,*sFrbGrid,*sUGrid,*sUrbGrid,M5); WilsonFermion5DR sDw(Umu,*sFGrid,*sFrbGrid,*sUGrid,*sUrbGrid,M5);
for(int x=0;x<latt4[0];x++){ localConvert(src,ssrc);
for(int y=0;y<latt4[1];y++){
for(int z=0;z<latt4[2];z++){
for(int t=0;t<latt4[3];t++){
for(int s=0;s<Ls;s++){
std::vector<int> site({s,x,y,z,t});
SpinColourVector tmp;
peekSite(tmp,src,site);
pokeSite(tmp,ssrc,site);
}}}}}
std::cout<<GridLogMessage<< "src norms "<< norm2(src)<<" " <<norm2(ssrc)<<std::endl; std::cout<<GridLogMessage<< "src norms "<< norm2(src)<<" " <<norm2(ssrc)<<std::endl;
FGrid->Barrier(); FGrid->Barrier();
double t0=usecond(); sDw.Dhop(ssrc,sresult,0);
sDw.ZeroCounters(); sDw.ZeroCounters();
double t0=usecond();
for(int i=0;i<ncall;i++){ for(int i=0;i<ncall;i++){
__SSC_START; __SSC_START;
sDw.Dhop(ssrc,sresult,0); sDw.Dhop(ssrc,sresult,0);
@@ -205,51 +280,58 @@ int main (int argc, char ** argv)
double t1=usecond(); double t1=usecond();
FGrid->Barrier(); FGrid->Barrier();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu]; double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=1344*volume*ncall; double flops=single_site_flops*volume*ncall;
std::cout<<GridLogMessage << "Called Dw s_inner "<<ncall<<" times in "<<t1-t0<<" us"<<std::endl; std::cout<<GridLogMessage << "Called Dw s_inner "<<ncall<<" times in "<<t1-t0<<" us"<<std::endl;
std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl; std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl;
std::cout<<GridLogMessage << "mflop/s per rank = "<< flops/(t1-t0)/NP<<std::endl; std::cout<<GridLogMessage << "mflop/s per rank = "<< flops/(t1-t0)/NP<<std::endl;
std::cout<<GridLogMessage << "mflop/s per node = "<< flops/(t1-t0)/NN<<std::endl;
// std::cout<<GridLogMessage<< "res norms "<< norm2(result)<<" " <<norm2(sresult)<<std::endl;
sDw.Report(); sDw.Report();
if(0){
for(int i=0;i< PerformanceCounter::NumTypes(); i++ ){
sDw.Dhop(ssrc,sresult,0);
PerformanceCounter Counter(i);
Counter.Start();
sDw.Dhop(ssrc,sresult,0);
Counter.Stop();
Counter.Report();
}
}
std::cout<<GridLogMessage<< "res norms "<< norm2(result)<<" " <<norm2(sresult)<<std::endl;
RealD sum=0; RealD sum=0;
for(int x=0;x<latt4[0];x++){
for(int y=0;y<latt4[1];y++){ err=zero;
for(int z=0;z<latt4[2];z++){ localConvert(sresult,err);
for(int t=0;t<latt4[3];t++){ err = err - ref;
for(int s=0;s<Ls;s++){ sum = norm2(err);
std::vector<int> site({s,x,y,z,t}); std::cout<<GridLogMessage<<" difference between normal ref and simd is "<<sum<<std::endl;
SpinColourVector normal, simd; if(sum > 1.0e-4 ){
peekSite(normal,result,site); std::cout<< "sD REF\n " <<ref << std::endl;
peekSite(simd,sresult,site); std::cout<< "sD ERR \n " <<err <<std::endl;
sum=sum+norm2(normal-simd); }
if (norm2(normal-simd) > 1.0e-6 ) { // assert(sum < 1.0e-4);
std::cout << "site "<<x<<","<<y<<","<<z<<","<<t<<","<<s<<" "<<norm2(normal-simd)<<std::endl;
std::cout << "site "<<x<<","<<y<<","<<z<<","<<t<<","<<s<<" normal "<<normal<<std::endl; err=zero;
std::cout << "site "<<x<<","<<y<<","<<z<<","<<t<<","<<s<<" simd "<<simd<<std::endl; localConvert(sresult,err);
} err = err - result;
}}}}} sum = norm2(err);
std::cout<<GridLogMessage<<" difference between normal and simd is "<<sum<<std::endl; std::cout<<GridLogMessage<<" difference between normal result and simd is "<<sum<<std::endl;
assert (sum< 1.0e-4 ); if(sum > 1.0e-4 ){
std::cout<< "sD REF\n " <<result << std::endl;
std::cout<< "sD ERR \n " << err <<std::endl;
}
assert(sum < 1.0e-4);
if (1) { if(1){
std::cout << GridLogMessage<< "*********************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Benchmarking WilsonFermion5D<DomainWallVec5dImplR>::DhopEO "<<std::endl;
std::cout << GridLogMessage<< "* Vectorising fifth dimension by "<<vComplex::Nsimd()<<std::endl;
if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl;
if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl;
#ifdef GRID_OMP
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsAndCompute ) std::cout << GridLogMessage<< "* Using Overlapped Comms/Compute" <<std::endl;
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsThenCompute) std::cout << GridLogMessage<< "* Using sequential comms compute" <<std::endl;
#endif
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric )
std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll)
std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm )
std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl;
std::cout << GridLogMessage<< "*********************************************************" <<std::endl;
LatticeFermion sr_eo(sFGrid); LatticeFermion sr_eo(sFGrid);
LatticeFermion ssrc_e (sFrbGrid); LatticeFermion ssrc_e (sFrbGrid);
LatticeFermion ssrc_o (sFrbGrid); LatticeFermion ssrc_o (sFrbGrid);
LatticeFermion sr_e (sFrbGrid); LatticeFermion sr_e (sFrbGrid);
@@ -257,39 +339,30 @@ int main (int argc, char ** argv)
pickCheckerboard(Even,ssrc_e,ssrc); pickCheckerboard(Even,ssrc_e,ssrc);
pickCheckerboard(Odd,ssrc_o,ssrc); pickCheckerboard(Odd,ssrc_o,ssrc);
// setCheckerboard(sr_eo,ssrc_o);
setCheckerboard(sr_eo,ssrc_o); // setCheckerboard(sr_eo,ssrc_e);
setCheckerboard(sr_eo,ssrc_e);
sr_e = zero; sr_e = zero;
sr_o = zero; sr_o = zero;
std::cout << GridLogMessage<< "*********************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Benchmarking WilsonFermion5D<DomainWallVec5dImplR>::DhopEO "<<std::endl;
std::cout << GridLogMessage<< "* Vectorising fifth dimension by "<<vComplex::Nsimd()<<std::endl;
if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl;
if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl;
std::cout << GridLogMessage<< "*********************************************************" <<std::endl;
FGrid->Barrier(); FGrid->Barrier();
sDw.DhopEO(ssrc_o, sr_e, DaggerNo);
sDw.ZeroCounters(); sDw.ZeroCounters();
sDw.stat.init("DhopEO"); // sDw.stat.init("DhopEO");
double t0=usecond(); double t0=usecond();
for (int i = 0; i < ncall; i++) { for (int i = 0; i < ncall; i++) {
sDw.DhopEO(ssrc_o, sr_e, DaggerNo); sDw.DhopEO(ssrc_o, sr_e, DaggerNo);
} }
double t1=usecond(); double t1=usecond();
FGrid->Barrier(); FGrid->Barrier();
sDw.stat.print(); // sDw.stat.print();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu]; double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=(1344.0*volume*ncall)/2; double flops=(single_site_flops*volume*ncall)/2.0;
std::cout<<GridLogMessage << "sDeo mflop/s = "<< flops/(t1-t0)<<std::endl; std::cout<<GridLogMessage << "sDeo mflop/s = "<< flops/(t1-t0)<<std::endl;
std::cout<<GridLogMessage << "sDeo mflop/s per rank "<< flops/(t1-t0)/NP<<std::endl; std::cout<<GridLogMessage << "sDeo mflop/s per rank "<< flops/(t1-t0)/NP<<std::endl;
std::cout<<GridLogMessage << "sDeo mflop/s per node "<< flops/(t1-t0)/NN<<std::endl;
sDw.Report(); sDw.Report();
sDw.DhopEO(ssrc_o,sr_e,DaggerNo); sDw.DhopEO(ssrc_o,sr_e,DaggerNo);
@@ -298,51 +371,75 @@ int main (int argc, char ** argv)
pickCheckerboard(Even,ssrc_e,sresult); pickCheckerboard(Even,ssrc_e,sresult);
pickCheckerboard(Odd ,ssrc_o,sresult); pickCheckerboard(Odd ,ssrc_o,sresult);
ssrc_e = ssrc_e - sr_e; ssrc_e = ssrc_e - sr_e;
RealD error = norm2(ssrc_e); RealD error = norm2(ssrc_e);
std::cout<<GridLogMessage << "sE norm diff "<< norm2(ssrc_e)<< " vec nrm"<<norm2(sr_e) <<std::endl; std::cout<<GridLogMessage << "sE norm diff "<< norm2(ssrc_e)<< " vec nrm"<<norm2(sr_e) <<std::endl;
ssrc_o = ssrc_o - sr_o;
ssrc_o = ssrc_o - sr_o;
error+= norm2(ssrc_o); error+= norm2(ssrc_o);
std::cout<<GridLogMessage << "sO norm diff "<< norm2(ssrc_o)<< " vec nrm"<<norm2(sr_o) <<std::endl; std::cout<<GridLogMessage << "sO norm diff "<< norm2(ssrc_o)<< " vec nrm"<<norm2(sr_o) <<std::endl;
if(error>1.0e-4) {
if(( error>1.0e-4) ) {
setCheckerboard(ssrc,ssrc_o); setCheckerboard(ssrc,ssrc_o);
setCheckerboard(ssrc,ssrc_e); setCheckerboard(ssrc,ssrc_e);
std::cout<< ssrc << std::endl; std::cout<< "DIFF\n " <<ssrc << std::endl;
setCheckerboard(ssrc,sr_o);
setCheckerboard(ssrc,sr_e);
std::cout<< "CBRESULT\n " <<ssrc << std::endl;
std::cout<< "RESULT\n " <<sresult<< std::endl;
} }
assert(error<1.0e-4);
} }
if(0){
std::cout << "Single cache warm call to sDw.Dhop " <<std::endl;
for(int i=0;i< PerformanceCounter::NumTypes(); i++ ){
sDw.Dhop(ssrc,sresult,0);
PerformanceCounter Counter(i);
Counter.Start();
sDw.Dhop(ssrc,sresult,0);
Counter.Stop();
Counter.Report();
}
}
} }
if (1) if (1)
{ // Naive wilson dag implementation { // Naive wilson dag implementation
ref = zero; ref = zero;
for(int mu=0;mu<Nd;mu++){ for(int mu=0;mu<Nd;mu++){
// ref = src - Gamma(Gamma::GammaX)* src ; // 1+gamma_x // ref = src - Gamma(Gamma::Algebra::GammaX)* src ; // 1+gamma_x
tmp = U[mu]*Cshift(src,mu+1,1); tmp = U[mu]*Cshift(src,mu+1,1);
for(int i=0;i<ref._odata.size();i++){ for(int i=0;i<ref._odata.size();i++){
ref._odata[i]+= tmp._odata[i] + Gamma(Gmu[mu])*tmp._odata[i]; ; ref._odata[i]+= tmp._odata[i] + Gamma(Gmu[mu])*tmp._odata[i]; ;
} }
tmp =adj(U[mu])*src; tmp =adj(U[mu])*src;
tmp =Cshift(tmp,mu+1,-1); tmp =Cshift(tmp,mu+1,-1);
for(int i=0;i<ref._odata.size();i++){ for(int i=0;i<ref._odata.size();i++){
ref._odata[i]+= tmp._odata[i] - Gamma(Gmu[mu])*tmp._odata[i]; ; ref._odata[i]+= tmp._odata[i] - Gamma(Gmu[mu])*tmp._odata[i]; ;
} }
} }
ref = -0.5*ref; ref = -0.5*ref;
} }
// dump=1;
Dw.Dhop(src,result,1); Dw.Dhop(src,result,1);
std::cout << GridLogMessage << "Compare to naive wilson implementation Dag to verify correctness" << std::endl; std::cout << GridLogMessage << "Compare to naive wilson implementation Dag to verify correctness" << std::endl;
std::cout<<GridLogMessage << "Called DwDag"<<std::endl; std::cout<<GridLogMessage << "Called DwDag"<<std::endl;
std::cout<<GridLogMessage << "norm result "<< norm2(result)<<std::endl; std::cout<<GridLogMessage << "norm dag result "<< norm2(result)<<std::endl;
std::cout<<GridLogMessage << "norm ref "<< norm2(ref)<<std::endl; std::cout<<GridLogMessage << "norm dag ref "<< norm2(ref)<<std::endl;
err = ref-result; err = ref-result;
std::cout<<GridLogMessage << "norm diff "<< norm2(err)<<std::endl; std::cout<<GridLogMessage << "norm dag diff "<< norm2(err)<<std::endl;
assert(norm2(err)<1.0e-4); if((norm2(err)>1.0e-4)){
std::cout<< "DAG RESULT\n " <<ref << std::endl;
std::cout<< "DAG sRESULT\n " <<result << std::endl;
std::cout<< "DAG ERR \n " << err <<std::endl;
}
LatticeFermion src_e (FrbGrid); LatticeFermion src_e (FrbGrid);
LatticeFermion src_o (FrbGrid); LatticeFermion src_o (FrbGrid);
LatticeFermion r_e (FrbGrid); LatticeFermion r_e (FrbGrid);
@@ -350,18 +447,24 @@ int main (int argc, char ** argv)
LatticeFermion r_eo (FGrid); LatticeFermion r_eo (FGrid);
std::cout<<GridLogMessage << "Calling Deo and Doe and assert Deo+Doe == Dunprec"<<std::endl; std::cout<<GridLogMessage << "Calling Deo and Doe and //assert Deo+Doe == Dunprec"<<std::endl;
pickCheckerboard(Even,src_e,src); pickCheckerboard(Even,src_e,src);
pickCheckerboard(Odd,src_o,src); pickCheckerboard(Odd,src_o,src);
std::cout<<GridLogMessage << "src_e"<<norm2(src_e)<<std::endl; std::cout<<GridLogMessage << "src_e"<<norm2(src_e)<<std::endl;
std::cout<<GridLogMessage << "src_o"<<norm2(src_o)<<std::endl; std::cout<<GridLogMessage << "src_o"<<norm2(src_o)<<std::endl;
// S-direction is INNERMOST and takes no part in the parity.
std::cout << GridLogMessage<< "*********************************************************" <<std::endl; std::cout << GridLogMessage<< "*********************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Benchmarking DomainWallFermionR::DhopEO "<<std::endl; std::cout << GridLogMessage<< "* Benchmarking DomainWallFermionR::DhopEO "<<std::endl;
std::cout << GridLogMessage<< "* Vectorising space-time by "<<vComplex::Nsimd()<<std::endl; std::cout << GridLogMessage<< "* Vectorising space-time by "<<vComplex::Nsimd()<<std::endl;
if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl; if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl;
if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl; if ( sizeof(Real)==8 ) std::cout << GridLogMessage<< "* DOUBLE precision "<<std::endl;
#ifdef GRID_OMP
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsAndCompute ) std::cout << GridLogMessage<< "* Using Overlapped Comms/Compute" <<std::endl;
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsThenCompute) std::cout << GridLogMessage<< "* Using sequential comms compute" <<std::endl;
#endif
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl;
@@ -369,6 +472,7 @@ int main (int argc, char ** argv)
{ {
Dw.ZeroCounters(); Dw.ZeroCounters();
FGrid->Barrier(); FGrid->Barrier();
Dw.DhopEO(src_o,r_e,DaggerNo);
double t0=usecond(); double t0=usecond();
for(int i=0;i<ncall;i++){ for(int i=0;i<ncall;i++){
Dw.DhopEO(src_o,r_e,DaggerNo); Dw.DhopEO(src_o,r_e,DaggerNo);
@@ -377,10 +481,11 @@ int main (int argc, char ** argv)
FGrid->Barrier(); FGrid->Barrier();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu]; double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=(1344.0*volume*ncall)/2; double flops=(single_site_flops*volume*ncall)/2.0;
std::cout<<GridLogMessage << "Deo mflop/s = "<< flops/(t1-t0)<<std::endl; std::cout<<GridLogMessage << "Deo mflop/s = "<< flops/(t1-t0)<<std::endl;
std::cout<<GridLogMessage << "Deo mflop/s per rank "<< flops/(t1-t0)/NP<<std::endl; std::cout<<GridLogMessage << "Deo mflop/s per rank "<< flops/(t1-t0)/NP<<std::endl;
std::cout<<GridLogMessage << "Deo mflop/s per node "<< flops/(t1-t0)/NN<<std::endl;
Dw.Report(); Dw.Report();
} }
Dw.DhopEO(src_o,r_e,DaggerNo); Dw.DhopEO(src_o,r_e,DaggerNo);
@@ -396,14 +501,20 @@ int main (int argc, char ** argv)
err = r_eo-result; err = r_eo-result;
std::cout<<GridLogMessage << "norm diff "<< norm2(err)<<std::endl; std::cout<<GridLogMessage << "norm diff "<< norm2(err)<<std::endl;
assert(norm2(err)<1.0e-4); if((norm2(err)>1.0e-4)){
std::cout<< "Deo RESULT\n " <<r_eo << std::endl;
std::cout<< "Deo REF\n " <<result << std::endl;
std::cout<< "Deo ERR \n " << err <<std::endl;
}
pickCheckerboard(Even,src_e,err); pickCheckerboard(Even,src_e,err);
pickCheckerboard(Odd,src_o,err); pickCheckerboard(Odd,src_o,err);
std::cout<<GridLogMessage << "norm diff even "<< norm2(src_e)<<std::endl; std::cout<<GridLogMessage << "norm diff even "<< norm2(src_e)<<std::endl;
std::cout<<GridLogMessage << "norm diff odd "<< norm2(src_o)<<std::endl; std::cout<<GridLogMessage << "norm diff odd "<< norm2(src_o)<<std::endl;
assert(norm2(src_e)<1.0e-4); assert(norm2(src_e)<1.0e-4);
assert(norm2(src_o)<1.0e-4); assert(norm2(src_o)<1.0e-4);
Grid_finalize(); Grid_finalize();
exit(0);
} }

View File

@@ -37,11 +37,11 @@ struct scal {
d internal; d internal;
}; };
Gamma::GammaMatrix Gmu [] = { Gamma::Algebra Gmu [] = {
Gamma::GammaX, Gamma::Algebra::GammaX,
Gamma::GammaY, Gamma::Algebra::GammaY,
Gamma::GammaZ, Gamma::Algebra::GammaZ,
Gamma::GammaT Gamma::Algebra::GammaT
}; };
void benchDw(std::vector<int> & L, int Ls, int threads, int report =0 ); void benchDw(std::vector<int> & L, int Ls, int threads, int report =0 );
@@ -51,6 +51,7 @@ int main (int argc, char ** argv)
{ {
Grid_init(&argc,&argv); Grid_init(&argc,&argv);
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl; std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Kernel options --dslash-generic, --dslash-unroll, --dslash-asm" <<std::endl; std::cout << GridLogMessage<< "* Kernel options --dslash-generic, --dslash-unroll, --dslash-asm" <<std::endl;
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl; std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
@@ -107,6 +108,7 @@ void benchDw(std::vector<int> & latt4, int Ls, int threads,int report )
GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid); GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid);
GridCartesian * FGrid = SpaceTimeGrid::makeFiveDimGrid(Ls,UGrid); GridCartesian * FGrid = SpaceTimeGrid::makeFiveDimGrid(Ls,UGrid);
GridRedBlackCartesian * FrbGrid = SpaceTimeGrid::makeFiveDimRedBlackGrid(Ls,UGrid); GridRedBlackCartesian * FrbGrid = SpaceTimeGrid::makeFiveDimRedBlackGrid(Ls,UGrid);
long unsigned int single_site_flops = 8*QCD::Nc*(7+16*QCD::Nc);
std::vector<int> seeds4({1,2,3,4}); std::vector<int> seeds4({1,2,3,4});
std::vector<int> seeds5({5,6,7,8}); std::vector<int> seeds5({5,6,7,8});
@@ -196,7 +198,7 @@ void benchDw(std::vector<int> & latt4, int Ls, int threads,int report )
if ( ! report ) { if ( ! report ) {
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu]; double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=1344*volume*ncall; double flops=single_site_flops*volume*ncall;
std::cout <<"\t"<<NP<< "\t"<<flops/(t1-t0)<< "\t"; std::cout <<"\t"<<NP<< "\t"<<flops/(t1-t0)<< "\t";
} }
@@ -228,7 +230,7 @@ void benchDw(std::vector<int> & latt4, int Ls, int threads,int report )
if(!report){ if(!report){
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu]; double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=(1344.0*volume*ncall)/2; double flops=(single_site_flops*volume*ncall)/2.0;
std::cout<< flops/(t1-t0); std::cout<< flops/(t1-t0);
} }
} }
@@ -237,6 +239,7 @@ void benchDw(std::vector<int> & latt4, int Ls, int threads,int report )
#define CHECK_SDW #define CHECK_SDW
void benchsDw(std::vector<int> & latt4, int Ls, int threads, int report ) void benchsDw(std::vector<int> & latt4, int Ls, int threads, int report )
{ {
long unsigned int single_site_flops = 8*QCD::Nc*(7+16*QCD::Nc);
GridCartesian * UGrid = SpaceTimeGrid::makeFourDimGrid(latt4, GridDefaultSimd(Nd,vComplex::Nsimd()),GridDefaultMpi()); GridCartesian * UGrid = SpaceTimeGrid::makeFourDimGrid(latt4, GridDefaultSimd(Nd,vComplex::Nsimd()),GridDefaultMpi());
GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid); GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid);
@@ -321,7 +324,7 @@ void benchsDw(std::vector<int> & latt4, int Ls, int threads, int report )
Counter.Report(); Counter.Report();
} else { } else {
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu]; double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=1344*volume*ncall; double flops=single_site_flops*volume*ncall;
std::cout<<"\t"<< flops/(t1-t0); std::cout<<"\t"<< flops/(t1-t0);
} }
@@ -358,7 +361,7 @@ void benchsDw(std::vector<int> & latt4, int Ls, int threads, int report )
CounterSdw.Report(); CounterSdw.Report();
} else { } else {
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu]; double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=(1344.0*volume*ncall)/2; double flops=(single_site_flops*volume*ncall)/2.0;
std::cout<<"\t"<< flops/(t1-t0); std::cout<<"\t"<< flops/(t1-t0);
} }
} }

View File

@@ -0,0 +1,190 @@
#include <Grid/Grid.h>
#include <sstream>
using namespace std;
using namespace Grid;
using namespace Grid::QCD;
template<class d>
struct scal {
d internal;
};
Gamma::Algebra Gmu [] = {
Gamma::Algebra::GammaX,
Gamma::Algebra::GammaY,
Gamma::Algebra::GammaZ,
Gamma::Algebra::GammaT
};
typedef typename GparityDomainWallFermionF::FermionField GparityLatticeFermionF;
typedef typename GparityDomainWallFermionD::FermionField GparityLatticeFermionD;
int main (int argc, char ** argv)
{
Grid_init(&argc,&argv);
int Ls=16;
for(int i=0;i<argc;i++)
if(std::string(argv[i]) == "-Ls"){
std::stringstream ss(argv[i+1]); ss >> Ls;
}
int threads = GridThread::GetThreads();
std::cout<<GridLogMessage << "Grid is setup to use "<<threads<<" threads"<<std::endl;
std::cout<<GridLogMessage << "Ls = " << Ls << std::endl;
std::vector<int> latt4 = GridDefaultLatt();
GridCartesian * UGrid = SpaceTimeGrid::makeFourDimGrid(GridDefaultLatt(), GridDefaultSimd(Nd,vComplexF::Nsimd()),GridDefaultMpi());
GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid);
GridCartesian * FGrid = SpaceTimeGrid::makeFiveDimGrid(Ls,UGrid);
GridRedBlackCartesian * FrbGrid = SpaceTimeGrid::makeFiveDimRedBlackGrid(Ls,UGrid);
std::vector<int> seeds4({1,2,3,4});
std::vector<int> seeds5({5,6,7,8});
std::cout << GridLogMessage << "Initialising 4d RNG" << std::endl;
GridParallelRNG RNG4(UGrid); RNG4.SeedFixedIntegers(seeds4);
std::cout << GridLogMessage << "Initialising 5d RNG" << std::endl;
GridParallelRNG RNG5(FGrid); RNG5.SeedFixedIntegers(seeds5);
std::cout << GridLogMessage << "Initialised RNGs" << std::endl;
GparityLatticeFermionF src (FGrid); random(RNG5,src);
RealD N2 = 1.0/::sqrt(norm2(src));
src = src*N2;
GparityLatticeFermionF result(FGrid); result=zero;
GparityLatticeFermionF ref(FGrid); ref=zero;
GparityLatticeFermionF tmp(FGrid);
GparityLatticeFermionF err(FGrid);
std::cout << GridLogMessage << "Drawing gauge field" << std::endl;
LatticeGaugeFieldF Umu(UGrid);
SU3::HotConfiguration(RNG4,Umu);
std::cout << GridLogMessage << "Random gauge initialised " << std::endl;
RealD mass=0.1;
RealD M5 =1.8;
RealD NP = UGrid->_Nprocessors;
RealD NN = UGrid->NodeCount();
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Kernel options --dslash-generic, --dslash-unroll, --dslash-asm" <<std::endl;
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Benchmarking DomainWallFermion::Dhop "<<std::endl;
std::cout << GridLogMessage<< "* Vectorising space-time by "<<vComplexF::Nsimd()<<std::endl;
#ifdef GRID_OMP
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsAndCompute ) std::cout << GridLogMessage<< "* Using Overlapped Comms/Compute" <<std::endl;
if ( WilsonKernelsStatic::Comms == WilsonKernelsStatic::CommsThenCompute) std::cout << GridLogMessage<< "* Using sequential comms compute" <<std::endl;
#endif
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl;
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
std::cout << GridLogMessage<< "* SINGLE/SINGLE"<<std::endl;
GparityDomainWallFermionF Dw(Umu,*FGrid,*FrbGrid,*UGrid,*UrbGrid,mass,M5);
int ncall =1000;
if (1) {
FGrid->Barrier();
Dw.ZeroCounters();
Dw.Dhop(src,result,0);
std::cout<<GridLogMessage<<"Called warmup"<<std::endl;
double t0=usecond();
for(int i=0;i<ncall;i++){
__SSC_START;
Dw.Dhop(src,result,0);
__SSC_STOP;
}
double t1=usecond();
FGrid->Barrier();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=2*1320*volume*ncall;
std::cout<<GridLogMessage << "Called Dw "<<ncall<<" times in "<<t1-t0<<" us"<<std::endl;
// std::cout<<GridLogMessage << "norm result "<< norm2(result)<<std::endl;
// std::cout<<GridLogMessage << "norm ref "<< norm2(ref)<<std::endl;
std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl;
std::cout<<GridLogMessage << "mflop/s per rank = "<< flops/(t1-t0)/NP<<std::endl;
std::cout<<GridLogMessage << "mflop/s per node = "<< flops/(t1-t0)/NN<<std::endl;
Dw.Report();
}
std::cout << GridLogMessage<< "* SINGLE/HALF"<<std::endl;
GparityDomainWallFermionFH DwH(Umu,*FGrid,*FrbGrid,*UGrid,*UrbGrid,mass,M5);
if (1) {
FGrid->Barrier();
DwH.ZeroCounters();
DwH.Dhop(src,result,0);
double t0=usecond();
for(int i=0;i<ncall;i++){
__SSC_START;
DwH.Dhop(src,result,0);
__SSC_STOP;
}
double t1=usecond();
FGrid->Barrier();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=2*1320*volume*ncall;
std::cout<<GridLogMessage << "Called half prec comms Dw "<<ncall<<" times in "<<t1-t0<<" us"<<std::endl;
std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl;
std::cout<<GridLogMessage << "mflop/s per rank = "<< flops/(t1-t0)/NP<<std::endl;
std::cout<<GridLogMessage << "mflop/s per node = "<< flops/(t1-t0)/NN<<std::endl;
DwH.Report();
}
GridCartesian * UGrid_d = SpaceTimeGrid::makeFourDimGrid(GridDefaultLatt(), GridDefaultSimd(Nd,vComplexD::Nsimd()),GridDefaultMpi());
GridRedBlackCartesian * UrbGrid_d = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid_d);
GridCartesian * FGrid_d = SpaceTimeGrid::makeFiveDimGrid(Ls,UGrid_d);
GridRedBlackCartesian * FrbGrid_d = SpaceTimeGrid::makeFiveDimRedBlackGrid(Ls,UGrid_d);
std::cout << GridLogMessage<< "* DOUBLE/DOUBLE"<<std::endl;
GparityLatticeFermionD src_d(FGrid_d);
precisionChange(src_d,src);
LatticeGaugeFieldD Umu_d(UGrid_d);
precisionChange(Umu_d,Umu);
GparityLatticeFermionD result_d(FGrid_d);
GparityDomainWallFermionD DwD(Umu_d,*FGrid_d,*FrbGrid_d,*UGrid_d,*UrbGrid_d,mass,M5);
if (1) {
FGrid_d->Barrier();
DwD.ZeroCounters();
DwD.Dhop(src_d,result_d,0);
std::cout<<GridLogMessage<<"Called warmup"<<std::endl;
double t0=usecond();
for(int i=0;i<ncall;i++){
__SSC_START;
DwD.Dhop(src_d,result_d,0);
__SSC_STOP;
}
double t1=usecond();
FGrid_d->Barrier();
double volume=Ls; for(int mu=0;mu<Nd;mu++) volume=volume*latt4[mu];
double flops=2*1320*volume*ncall;
std::cout<<GridLogMessage << "Called Dw "<<ncall<<" times in "<<t1-t0<<" us"<<std::endl;
// std::cout<<GridLogMessage << "norm result "<< norm2(result)<<std::endl;
// std::cout<<GridLogMessage << "norm ref "<< norm2(ref)<<std::endl;
std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl;
std::cout<<GridLogMessage << "mflop/s per rank = "<< flops/(t1-t0)/NP<<std::endl;
std::cout<<GridLogMessage << "mflop/s per node = "<< flops/(t1-t0)/NN<<std::endl;
DwD.Report();
}
Grid_finalize();
}

View File

@@ -66,7 +66,8 @@ int main (int argc, char ** argv)
Vec tsum; tsum = zero; Vec tsum; tsum = zero;
GridParallelRNG pRNG(&Grid); pRNG.SeedRandomDevice(); GridParallelRNG pRNG(&Grid);
pRNG.SeedFixedIntegers(std::vector<int>({56,17,89,101}));
std::vector<double> stop(threads); std::vector<double> stop(threads);
Vector<Vec> sum(threads); Vector<Vec> sum(threads);
@@ -77,8 +78,7 @@ int main (int argc, char ** argv)
} }
double start=usecond(); double start=usecond();
PARALLEL_FOR_LOOP parallel_for(int t=0;t<threads;t++){
for(int t=0;t<threads;t++){
sum[t] = x[t]._odata[0]; sum[t] = x[t]._odata[0];
for(int i=0;i<Nloop;i++){ for(int i=0;i<Nloop;i++){

View File

@@ -55,21 +55,21 @@ int main (int argc, char ** argv)
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s"<<"\t\t"<<"Gflop/s"<<"\t\t seconds"<<std::endl; std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s"<<"\t\t"<<"Gflop/s"<<"\t\t seconds"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl; std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
uint64_t lmax=44; uint64_t lmax=64;
#define NLOOP (1*lmax*lmax*lmax*lmax/vol) #define NLOOP (10*lmax*lmax*lmax*lmax/vol)
for(int lat=4;lat<=lmax;lat+=4){ for(int lat=8;lat<=lmax;lat+=8){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]}); std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3]; int64_t vol= latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
uint64_t Nloop=NLOOP; uint64_t Nloop=NLOOP;
// GridParallelRNG pRNG(&Grid); pRNG.SeedRandomDevice(); // GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeVec z(&Grid); //random(pRNG,z); LatticeVec z(&Grid);// random(pRNG,z);
LatticeVec x(&Grid); //random(pRNG,x); LatticeVec x(&Grid);// random(pRNG,x);
LatticeVec y(&Grid); //random(pRNG,y); LatticeVec y(&Grid);// random(pRNG,y);
double a=2.0; double a=2.0;
@@ -83,7 +83,7 @@ int main (int argc, char ** argv)
double time = (stop-start)/Nloop*1000; double time = (stop-start)/Nloop*1000;
double flops=vol*Nvec*2;// mul,add double flops=vol*Nvec*2;// mul,add
double bytes=3*vol*Nvec*sizeof(Real); double bytes=3.0*vol*Nvec*sizeof(Real);
std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t"<<flops/time<<"\t\t"<<(stop-start)/1000./1000.<<std::endl; std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t"<<flops/time<<"\t\t"<<(stop-start)/1000./1000.<<std::endl;
} }
@@ -94,17 +94,17 @@ int main (int argc, char ** argv)
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s"<<"\t\t"<<"Gflop/s"<<"\t\t seconds"<<std::endl; std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s"<<"\t\t"<<"Gflop/s"<<"\t\t seconds"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl; std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
for(int lat=4;lat<=lmax;lat+=4){ for(int lat=8;lat<=lmax;lat+=8){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]}); std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3]; int64_t vol= latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
// GridParallelRNG pRNG(&Grid); pRNG.SeedRandomDevice(); // GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeVec z(&Grid); //random(pRNG,z); LatticeVec z(&Grid);// random(pRNG,z);
LatticeVec x(&Grid); //random(pRNG,x); LatticeVec x(&Grid);// random(pRNG,x);
LatticeVec y(&Grid); //random(pRNG,y); LatticeVec y(&Grid);// random(pRNG,y);
double a=2.0; double a=2.0;
uint64_t Nloop=NLOOP; uint64_t Nloop=NLOOP;
@@ -119,7 +119,7 @@ int main (int argc, char ** argv)
double time = (stop-start)/Nloop*1000; double time = (stop-start)/Nloop*1000;
double flops=vol*Nvec*2;// mul,add double flops=vol*Nvec*2;// mul,add
double bytes=3*vol*Nvec*sizeof(Real); double bytes=3.0*vol*Nvec*sizeof(Real);
std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t"<<flops/time<<"\t\t"<<(stop-start)/1000./1000.<<std::endl; std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t"<<flops/time<<"\t\t"<<(stop-start)/1000./1000.<<std::endl;
} }
@@ -129,20 +129,20 @@ int main (int argc, char ** argv)
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s"<<"\t\t"<<"Gflop/s"<<"\t\t seconds"<<std::endl; std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s"<<"\t\t"<<"Gflop/s"<<"\t\t seconds"<<std::endl;
for(int lat=4;lat<=lmax;lat+=4){ for(int lat=8;lat<=lmax;lat+=8){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]}); std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3]; int64_t vol= latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
uint64_t Nloop=NLOOP; uint64_t Nloop=NLOOP;
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
// GridParallelRNG pRNG(&Grid); pRNG.SeedRandomDevice(); // GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeVec z(&Grid); //random(pRNG,z); LatticeVec z(&Grid);// random(pRNG,z);
LatticeVec x(&Grid); //random(pRNG,x); LatticeVec x(&Grid);// random(pRNG,x);
LatticeVec y(&Grid); //random(pRNG,y); LatticeVec y(&Grid);// random(pRNG,y);
RealD a=2.0; RealD a=2.0;
@@ -154,7 +154,7 @@ int main (int argc, char ** argv)
double stop=usecond(); double stop=usecond();
double time = (stop-start)/Nloop*1000; double time = (stop-start)/Nloop*1000;
double bytes=2*vol*Nvec*sizeof(Real); double bytes=2.0*vol*Nvec*sizeof(Real);
double flops=vol*Nvec*1;// mul double flops=vol*Nvec*1;// mul
std::cout<<GridLogMessage <<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t"<<flops/time<<"\t\t"<<(stop-start)/1000./1000.<<std::endl; std::cout<<GridLogMessage <<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t"<<flops/time<<"\t\t"<<(stop-start)/1000./1000.<<std::endl;
@@ -166,17 +166,17 @@ int main (int argc, char ** argv)
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s"<<"\t\t"<<"Gflop/s"<<"\t\t seconds"<<std::endl; std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s"<<"\t\t"<<"Gflop/s"<<"\t\t seconds"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl; std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
for(int lat=4;lat<=lmax;lat+=4){ for(int lat=8;lat<=lmax;lat+=8){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]}); std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3]; int64_t vol= latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
uint64_t Nloop=NLOOP; uint64_t Nloop=NLOOP;
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
// GridParallelRNG pRNG(&Grid); pRNG.SeedRandomDevice(); // GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeVec z(&Grid); //random(pRNG,z); LatticeVec z(&Grid);// random(pRNG,z);
LatticeVec x(&Grid); //random(pRNG,x); LatticeVec x(&Grid);// random(pRNG,x);
LatticeVec y(&Grid); //random(pRNG,y); LatticeVec y(&Grid);// random(pRNG,y);
RealD a=2.0; RealD a=2.0;
Real nn; Real nn;
double start=usecond(); double start=usecond();
@@ -187,7 +187,7 @@ int main (int argc, char ** argv)
double stop=usecond(); double stop=usecond();
double time = (stop-start)/Nloop*1000; double time = (stop-start)/Nloop*1000;
double bytes=vol*Nvec*sizeof(Real); double bytes=1.0*vol*Nvec*sizeof(Real);
double flops=vol*Nvec*2;// mul,add double flops=vol*Nvec*2;// mul,add
std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t"<<flops/time<< "\t\t"<<(stop-start)/1000./1000.<< "\t\t " <<std::endl; std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t"<<flops/time<< "\t\t"<<(stop-start)/1000./1000.<< "\t\t " <<std::endl;

View File

@@ -0,0 +1,803 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: ./benchmarks/Benchmark_wilson.cc
Copyright (C) 2018
Author: Peter Boyle <paboyle@ph.ed.ac.uk>
Author: paboyle <paboyle@ph.ed.ac.uk>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Grid.h>
using namespace std;
using namespace Grid;
using namespace Grid::QCD;
#include "Grid/util/Profiling.h"
template<class vobj>
void sliceInnerProductMesonField(std::vector< std::vector<ComplexD> > &mat,
const std::vector<Lattice<vobj> > &lhs,
const std::vector<Lattice<vobj> > &rhs,
int orthogdim)
{
typedef typename vobj::scalar_object sobj;
typedef typename vobj::scalar_type scalar_type;
typedef typename vobj::vector_type vector_type;
int Lblock = lhs.size();
int Rblock = rhs.size();
GridBase *grid = lhs[0]._grid;
const int Nd = grid->_ndimension;
const int Nsimd = grid->Nsimd();
int Nt = grid->GlobalDimensions()[orthogdim];
assert(mat.size()==Lblock*Rblock);
for(int t=0;t<mat.size();t++){
assert(mat[t].size()==Nt);
}
int fd=grid->_fdimensions[orthogdim];
int ld=grid->_ldimensions[orthogdim];
int rd=grid->_rdimensions[orthogdim];
// will locally sum vectors first
// sum across these down to scalars
// splitting the SIMD
std::vector<vector_type,alignedAllocator<vector_type> > lvSum(rd*Lblock*Rblock);
parallel_for (int r = 0; r < rd * Lblock * Rblock; r++){
lvSum[r] = zero;
}
std::vector<scalar_type > lsSum(ld*Lblock*Rblock,scalar_type(0.0));
int e1= grid->_slice_nblock[orthogdim];
int e2= grid->_slice_block [orthogdim];
int stride=grid->_slice_stride[orthogdim];
std::cout << GridLogMessage << " Entering first parallel loop "<<std::endl;
// Parallelise over t-direction doesn't expose as much parallelism as needed for KNL
parallel_for(int r=0;r<rd;r++){
int so=r*grid->_ostride[orthogdim]; // base offset for start of plane
for(int n=0;n<e1;n++){
for(int b=0;b<e2;b++){
int ss= so+n*stride+b;
for(int i=0;i<Lblock;i++){
auto left = conjugate(lhs[i]._odata[ss]);
for(int j=0;j<Rblock;j++){
int idx = i+Lblock*j+Lblock*Rblock*r;
auto right = rhs[j]._odata[ss];
vector_type vv = left()(0)(0) * right()(0)(0)
+ left()(0)(1) * right()(0)(1)
+ left()(0)(2) * right()(0)(2)
+ left()(1)(0) * right()(1)(0)
+ left()(1)(1) * right()(1)(1)
+ left()(1)(2) * right()(1)(2)
+ left()(2)(0) * right()(2)(0)
+ left()(2)(1) * right()(2)(1)
+ left()(2)(2) * right()(2)(2)
+ left()(3)(0) * right()(3)(0)
+ left()(3)(1) * right()(3)(1)
+ left()(3)(2) * right()(3)(2);
lvSum[idx]=lvSum[idx]+vv;
}
}
}
}
}
std::cout << GridLogMessage << " Entering second parallel loop "<<std::endl;
// Sum across simd lanes in the plane, breaking out orthog dir.
parallel_for(int rt=0;rt<rd;rt++){
std::vector<int> icoor(Nd);
for(int i=0;i<Lblock;i++){
for(int j=0;j<Rblock;j++){
iScalar<vector_type> temp;
std::vector<iScalar<scalar_type> > extracted(Nsimd);
temp._internal = lvSum[i+Lblock*j+Lblock*Rblock*rt];
extract(temp,extracted);
for(int idx=0;idx<Nsimd;idx++){
grid->iCoorFromIindex(icoor,idx);
int ldx =rt+icoor[orthogdim]*rd;
int ij_dx = i+Lblock*j+Lblock*Rblock*ldx;
lsSum[ij_dx]=lsSum[ij_dx]+extracted[idx]._internal;
}
}}
}
std::cout << GridLogMessage << " Entering non parallel loop "<<std::endl;
for(int t=0;t<fd;t++)
{
int pt = t / ld; // processor plane
int lt = t % ld;
for(int i=0;i<Lblock;i++){
for(int j=0;j<Rblock;j++){
if (pt == grid->_processor_coor[orthogdim]){
int ij_dx = i + Lblock * j + Lblock * Rblock * lt;
mat[i+j*Lblock][t] = lsSum[ij_dx];
}
else{
mat[i+j*Lblock][t] = scalar_type(0.0);
}
}}
}
std::cout << GridLogMessage << " Done "<<std::endl;
// defer sum over nodes.
return;
}
template<class vobj>
void sliceInnerProductMesonFieldGamma(std::vector< std::vector<ComplexD> > &mat,
const std::vector<Lattice<vobj> > &lhs,
const std::vector<Lattice<vobj> > &rhs,
int orthogdim,
std::vector<Gamma::Algebra> gammas)
{
typedef typename vobj::scalar_object sobj;
typedef typename vobj::scalar_type scalar_type;
typedef typename vobj::vector_type vector_type;
int Lblock = lhs.size();
int Rblock = rhs.size();
GridBase *grid = lhs[0]._grid;
const int Nd = grid->_ndimension;
const int Nsimd = grid->Nsimd();
int Nt = grid->GlobalDimensions()[orthogdim];
int Ngamma = gammas.size();
assert(mat.size()==Lblock*Rblock*Ngamma);
for(int t=0;t<mat.size();t++){
assert(mat[t].size()==Nt);
}
int fd=grid->_fdimensions[orthogdim];
int ld=grid->_ldimensions[orthogdim];
int rd=grid->_rdimensions[orthogdim];
// will locally sum vectors first
// sum across these down to scalars
// splitting the SIMD
int MFrvol = rd*Lblock*Rblock*Ngamma;
int MFlvol = ld*Lblock*Rblock*Ngamma;
std::vector<vector_type,alignedAllocator<vector_type> > lvSum(MFrvol);
parallel_for (int r = 0; r < MFrvol; r++){
lvSum[r] = zero;
}
std::vector<scalar_type > lsSum(MFlvol);
parallel_for (int r = 0; r < MFlvol; r++){
lsSum[r]=scalar_type(0.0);
}
int e1= grid->_slice_nblock[orthogdim];
int e2= grid->_slice_block [orthogdim];
int stride=grid->_slice_stride[orthogdim];
std::cout << GridLogMessage << " Entering first parallel loop "<<std::endl;
// Parallelise over t-direction doesn't expose as much parallelism as needed for KNL
parallel_for(int r=0;r<rd;r++){
int so=r*grid->_ostride[orthogdim]; // base offset for start of plane
for(int n=0;n<e1;n++){
for(int b=0;b<e2;b++){
int ss= so+n*stride+b;
for(int i=0;i<Lblock;i++){
auto left = conjugate(lhs[i]._odata[ss]);
for(int j=0;j<Rblock;j++){
for(int mu=0;mu<Ngamma;mu++){
auto right = Gamma(gammas[mu])*rhs[j]._odata[ss];
vector_type vv = left()(0)(0) * right()(0)(0)
+ left()(0)(1) * right()(0)(1)
+ left()(0)(2) * right()(0)(2)
+ left()(1)(0) * right()(1)(0)
+ left()(1)(1) * right()(1)(1)
+ left()(1)(2) * right()(1)(2)
+ left()(2)(0) * right()(2)(0)
+ left()(2)(1) * right()(2)(1)
+ left()(2)(2) * right()(2)(2)
+ left()(3)(0) * right()(3)(0)
+ left()(3)(1) * right()(3)(1)
+ left()(3)(2) * right()(3)(2);
int idx = mu+i*Ngamma+Lblock*Ngamma*j+Ngamma*Lblock*Rblock*r;
lvSum[idx]=lvSum[idx]+vv;
}
}
}
}
}
}
std::cout << GridLogMessage << " Entering second parallel loop "<<std::endl;
// Sum across simd lanes in the plane, breaking out orthog dir.
parallel_for(int rt=0;rt<rd;rt++){
iScalar<vector_type> temp;
std::vector<int> icoor(Nd);
std::vector<iScalar<scalar_type> > extracted(Nsimd);
for(int i=0;i<Lblock;i++){
for(int j=0;j<Rblock;j++){
for(int mu=0;mu<Ngamma;mu++){
int ij_rdx = mu+i*Ngamma+Ngamma*Lblock*j+Ngamma*Lblock*Rblock*rt;
temp._internal = lvSum[ij_rdx];
extract(temp,extracted);
for(int idx=0;idx<Nsimd;idx++){
grid->iCoorFromIindex(icoor,idx);
int ldx =rt+icoor[orthogdim]*rd;
int ij_ldx = mu+i*Ngamma+Ngamma*Lblock*j+Ngamma*Lblock*Rblock*ldx;
lsSum[ij_ldx]=lsSum[ij_ldx]+extracted[idx]._internal;
}
}}}
}
std::cout << GridLogMessage << " Entering non parallel loop "<<std::endl;
for(int t=0;t<fd;t++)
{
int pt = t / ld; // processor plane
int lt = t % ld;
for(int i=0;i<Lblock;i++){
for(int j=0;j<Rblock;j++){
for(int mu=0;mu<Ngamma;mu++){
if (pt == grid->_processor_coor[orthogdim]){
int ij_dx = mu+i*Ngamma+Ngamma*Lblock*j+Ngamma*Lblock*Rblock* lt;
mat[mu+i*Ngamma+j*Lblock*Ngamma][t] = lsSum[ij_dx];
}
else{
mat[mu+i*Ngamma+j*Lblock*Ngamma][t] = scalar_type(0.0);
}
}}}
}
std::cout << GridLogMessage << " Done "<<std::endl;
// defer sum over nodes.
return;
}
template<class vobj>
void sliceInnerProductMesonFieldGamma1(std::vector< std::vector<ComplexD> > &mat,
const std::vector<Lattice<vobj> > &lhs,
const std::vector<Lattice<vobj> > &rhs,
int orthogdim,
std::vector<Gamma::Algebra> gammas)
{
typedef typename vobj::scalar_object sobj;
typedef typename vobj::scalar_type scalar_type;
typedef typename vobj::vector_type vector_type;
typedef iSpinMatrix<vector_type> SpinMatrix_v;
typedef iSpinMatrix<scalar_type> SpinMatrix_s;
int Lblock = lhs.size();
int Rblock = rhs.size();
GridBase *grid = lhs[0]._grid;
const int Nd = grid->_ndimension;
const int Nsimd = grid->Nsimd();
int Nt = grid->GlobalDimensions()[orthogdim];
int Ngamma = gammas.size();
assert(mat.size()==Lblock*Rblock*Ngamma);
for(int t=0;t<mat.size();t++){
assert(mat[t].size()==Nt);
}
int fd=grid->_fdimensions[orthogdim];
int ld=grid->_ldimensions[orthogdim];
int rd=grid->_rdimensions[orthogdim];
// will locally sum vectors first
// sum across these down to scalars
// splitting the SIMD
int MFrvol = rd*Lblock*Rblock;
int MFlvol = ld*Lblock*Rblock;
Vector<SpinMatrix_v > lvSum(MFrvol);
parallel_for (int r = 0; r < MFrvol; r++){
lvSum[r] = zero;
}
Vector<SpinMatrix_s > lsSum(MFlvol);
parallel_for (int r = 0; r < MFlvol; r++){
lsSum[r]=scalar_type(0.0);
}
int e1= grid->_slice_nblock[orthogdim];
int e2= grid->_slice_block [orthogdim];
int stride=grid->_slice_stride[orthogdim];
std::cout << GridLogMessage << " Entering first parallel loop "<<std::endl;
// Parallelise over t-direction doesn't expose as much parallelism as needed for KNL
parallel_for(int r=0;r<rd;r++){
int so=r*grid->_ostride[orthogdim]; // base offset for start of plane
for(int n=0;n<e1;n++){
for(int b=0;b<e2;b++){
int ss= so+n*stride+b;
for(int i=0;i<Lblock;i++){
auto left = conjugate(lhs[i]._odata[ss]);
for(int j=0;j<Rblock;j++){
SpinMatrix_v vv;
auto right = rhs[j]._odata[ss];
for(int s1=0;s1<Ns;s1++){
for(int s2=0;s2<Ns;s2++){
vv()(s2,s1)() = left()(s1)(0) * right()(s2)(0)
+ left()(s1)(1) * right()(s2)(1)
+ left()(s1)(2) * right()(s2)(2);
}}
int idx = i+Lblock*j+Lblock*Rblock*r;
lvSum[idx]=lvSum[idx]+vv;
}
}
}
}
}
std::cout << GridLogMessage << " Entering second parallel loop "<<std::endl;
// Sum across simd lanes in the plane, breaking out orthog dir.
parallel_for(int rt=0;rt<rd;rt++){
std::vector<int> icoor(Nd);
std::vector<SpinMatrix_s> extracted(Nsimd);
for(int i=0;i<Lblock;i++){
for(int j=0;j<Rblock;j++){
int ij_rdx = i+Lblock*j+Lblock*Rblock*rt;
extract(lvSum[ij_rdx],extracted);
for(int idx=0;idx<Nsimd;idx++){
grid->iCoorFromIindex(icoor,idx);
int ldx = rt+icoor[orthogdim]*rd;
int ij_ldx = i+Lblock*j+Lblock*Rblock*ldx;
lsSum[ij_ldx]=lsSum[ij_ldx]+extracted[idx];
}
}}
}
std::cout << GridLogMessage << " Entering third parallel loop "<<std::endl;
parallel_for(int t=0;t<fd;t++)
{
int pt = t / ld; // processor plane
int lt = t % ld;
for(int i=0;i<Lblock;i++){
for(int j=0;j<Rblock;j++){
if (pt == grid->_processor_coor[orthogdim]){
int ij_dx = i + Lblock * j + Lblock * Rblock * lt;
for(int mu=0;mu<Ngamma;mu++){
mat[mu+i*Ngamma+j*Lblock*Ngamma][t] = trace(lsSum[ij_dx]*Gamma(gammas[mu]));
}
}
else{
for(int mu=0;mu<Ngamma;mu++){
mat[mu+i*Ngamma+j*Lblock*Ngamma][t] = scalar_type(0.0);
}
}
}}
}
std::cout << GridLogMessage << " Done "<<std::endl;
// defer sum over nodes.
return;
}
template<class vobj>
void sliceInnerProductMesonFieldGammaMom(std::vector< std::vector<ComplexD> > &mat,
const std::vector<Lattice<vobj> > &lhs,
const std::vector<Lattice<vobj> > &rhs,
int orthogdim,
std::vector<Gamma::Algebra> gammas,
const std::vector<LatticeComplex > &mom)
{
typedef typename vobj::scalar_object sobj;
typedef typename vobj::scalar_type scalar_type;
typedef typename vobj::vector_type vector_type;
typedef iSpinMatrix<vector_type> SpinMatrix_v;
typedef iSpinMatrix<scalar_type> SpinMatrix_s;
int Lblock = lhs.size();
int Rblock = rhs.size();
GridBase *grid = lhs[0]._grid;
const int Nd = grid->_ndimension;
const int Nsimd = grid->Nsimd();
int Nt = grid->GlobalDimensions()[orthogdim];
int Ngamma = gammas.size();
int Nmom = mom.size();
assert(mat.size()==Lblock*Rblock*Ngamma*Nmom);
for(int t=0;t<mat.size();t++){
assert(mat[t].size()==Nt);
}
int fd=grid->_fdimensions[orthogdim];
int ld=grid->_ldimensions[orthogdim];
int rd=grid->_rdimensions[orthogdim];
// will locally sum vectors first
// sum across these down to scalars
// splitting the SIMD
int MFrvol = rd*Lblock*Rblock*Nmom;
int MFlvol = ld*Lblock*Rblock*Nmom;
Vector<SpinMatrix_v > lvSum(MFrvol);
parallel_for (int r = 0; r < MFrvol; r++){
lvSum[r] = zero;
}
Vector<SpinMatrix_s > lsSum(MFlvol);
parallel_for (int r = 0; r < MFlvol; r++){
lsSum[r]=scalar_type(0.0);
}
int e1= grid->_slice_nblock[orthogdim];
int e2= grid->_slice_block [orthogdim];
int stride=grid->_slice_stride[orthogdim];
std::cout << GridLogMessage << " Entering first parallel loop "<<std::endl;
// Parallelise over t-direction doesn't expose as much parallelism as needed for KNL
parallel_for(int r=0;r<rd;r++){
int so=r*grid->_ostride[orthogdim]; // base offset for start of plane
for(int n=0;n<e1;n++){
for(int b=0;b<e2;b++){
int ss= so+n*stride+b;
for(int i=0;i<Lblock;i++){
auto left = conjugate(lhs[i]._odata[ss]);
for(int j=0;j<Rblock;j++){
SpinMatrix_v vv;
auto right = rhs[j]._odata[ss];
for(int s1=0;s1<Ns;s1++){
for(int s2=0;s2<Ns;s2++){
vv()(s1,s2)() = left()(s1)(0) * right()(s2)(0)
+ left()(s1)(1) * right()(s2)(1)
+ left()(s1)(2) * right()(s2)(2);
}}
// After getting the sitewise product do the mom phase loop
int base = Nmom*i+Nmom*Lblock*j+Nmom*Lblock*Rblock*r;
// Trigger unroll
for ( int m=0;m<Nmom;m++){
int idx = m+base;
auto phase = mom[m]._odata[ss];
mac(&lvSum[idx],&vv,&phase);
}
}
}
}
}
}
std::cout << GridLogMessage << " Entering second parallel loop "<<std::endl;
// Sum across simd lanes in the plane, breaking out orthog dir.
parallel_for(int rt=0;rt<rd;rt++){
std::vector<int> icoor(Nd);
std::vector<SpinMatrix_s> extracted(Nsimd);
for(int i=0;i<Lblock;i++){
for(int j=0;j<Rblock;j++){
for(int m=0;m<Nmom;m++){
int ij_rdx = m+Nmom*i+Nmom*Lblock*j+Nmom*Lblock*Rblock*rt;
extract(lvSum[ij_rdx],extracted);
for(int idx=0;idx<Nsimd;idx++){
grid->iCoorFromIindex(icoor,idx);
int ldx = rt+icoor[orthogdim]*rd;
int ij_ldx = m+Nmom*i+Nmom*Lblock*j+Nmom*Lblock*Rblock*ldx;
lsSum[ij_ldx]=lsSum[ij_ldx]+extracted[idx];
}
}}}
}
std::cout << GridLogMessage << " Entering third parallel loop "<<std::endl;
parallel_for(int t=0;t<fd;t++)
{
int pt = t / ld; // processor plane
int lt = t % ld;
for(int i=0;i<Lblock;i++){
for(int j=0;j<Rblock;j++){
if (pt == grid->_processor_coor[orthogdim]){
for(int m=0;m<Nmom;m++){
int ij_dx = m+Nmom*i + Nmom*Lblock * j + Nmom*Lblock * Rblock * lt;
for(int mu=0;mu<Ngamma;mu++){
mat[ mu
+m*Ngamma
+i*Nmom*Ngamma
+j*Nmom*Ngamma*Lblock][t] = trace(lsSum[ij_dx]*Gamma(gammas[mu]));
}
}
}
else{
for(int mu=0;mu<Ngamma;mu++){
for(int m=0;m<Nmom;m++){
mat[mu+m*Ngamma+i*Nmom*Ngamma+j*Nmom*Lblock*Ngamma][t] = scalar_type(0.0);
}}
}
}}
}
std::cout << GridLogMessage << " Done "<<std::endl;
// defer sum over nodes.
return;
}
/*
template void sliceInnerProductMesonField<SpinColourVector>(std::vector< std::vector<ComplexD> > &mat,
const std::vector<Lattice<SpinColourVector> > &lhs,
const std::vector<Lattice<SpinColourVector> > &rhs,
int orthogdim) ;
*/
std::vector<Gamma::Algebra> Gmu4 ( {
Gamma::Algebra::GammaX,
Gamma::Algebra::GammaY,
Gamma::Algebra::GammaZ,
Gamma::Algebra::GammaT });
std::vector<Gamma::Algebra> Gmu16 ( {
Gamma::Algebra::Gamma5,
Gamma::Algebra::GammaT,
Gamma::Algebra::GammaTGamma5,
Gamma::Algebra::GammaX,
Gamma::Algebra::GammaXGamma5,
Gamma::Algebra::GammaY,
Gamma::Algebra::GammaYGamma5,
Gamma::Algebra::GammaZ,
Gamma::Algebra::GammaZGamma5,
Gamma::Algebra::Identity,
Gamma::Algebra::SigmaXT,
Gamma::Algebra::SigmaXY,
Gamma::Algebra::SigmaXZ,
Gamma::Algebra::SigmaYT,
Gamma::Algebra::SigmaYZ,
Gamma::Algebra::SigmaZT
});
int main (int argc, char ** argv)
{
Grid_init(&argc,&argv);
std::vector<int> latt_size = GridDefaultLatt();
std::vector<int> simd_layout = GridDefaultSimd(Nd,vComplex::Nsimd());
std::vector<int> mpi_layout = GridDefaultMpi();
GridCartesian Grid(latt_size,simd_layout,mpi_layout);
const int Nmom=7;
int nt = latt_size[Tp];
uint64_t vol = 1;
for(int d=0;d<Nd;d++){
vol = vol*latt_size[d];
}
std::vector<int> seeds({1,2,3,4});
GridParallelRNG pRNG(&Grid);
pRNG.SeedFixedIntegers(seeds);
int Nm = atoi(argv[1]); // number of all modes (high + low)
std::vector<LatticeFermion> v(Nm,&Grid);
std::vector<LatticeFermion> w(Nm,&Grid);
std::vector<LatticeFermion> gammaV(Nm,&Grid);
std::vector<LatticeComplex> phases(Nmom,&Grid);
for(int i=0;i<Nm;i++) {
random(pRNG,v[i]);
random(pRNG,w[i]);
}
for(int i=0;i<Nmom;i++) {
phases[i] = Complex(1.0);
}
double flops = vol * (11.0 * 8.0 + 6.0) * Nm*Nm;
double byte = vol * (12.0 * sizeof(Complex) ) * Nm*Nm;
std::vector<ComplexD> ip(nt);
std::vector<std::vector<ComplexD> > MesonFields (Nm*Nm);
std::vector<std::vector<ComplexD> > MesonFields4 (Nm*Nm*4);
std::vector<std::vector<ComplexD> > MesonFields16 (Nm*Nm*16);
std::vector<std::vector<ComplexD> > MesonFields161(Nm*Nm*16);
std::vector<std::vector<ComplexD> > MesonFields16mom (Nm*Nm*16*Nmom);
std::vector<std::vector<ComplexD> > MesonFieldsRef(Nm*Nm);
for(int i=0;i<MesonFields.size();i++ ) MesonFields [i].resize(nt);
for(int i=0;i<MesonFieldsRef.size();i++) MesonFieldsRef[i].resize(nt);
for(int i=0;i<MesonFields4.size();i++ ) MesonFields4 [i].resize(nt);
for(int i=0;i<MesonFields16.size();i++ ) MesonFields16 [i].resize(nt);
for(int i=0;i<MesonFields161.size();i++ ) MesonFields161[i].resize(nt);
for(int i=0;i<MesonFields16mom.size();i++ ) MesonFields16mom [i].resize(nt);
GridLogMessage.TimingMode(1);
std::cout<<GridLogMessage << "Running loop with sliceInnerProductVector"<<std::endl;
double t0 = usecond();
for(int i=0;i<Nm;i++) {
for(int j=0;j<Nm;j++) {
sliceInnerProductVector(ip, w[i],v[j],Tp);
for(int t=0;t<nt;t++){
MesonFieldsRef[i+j*Nm][t] = ip[t];
}
}}
double t1 = usecond();
std::cout<<GridLogMessage << "Done "<< (t1-t0) <<" usecond " <<std::endl;
std::cout<<GridLogMessage << "Done "<< flops/(t1-t0) <<" mflops " <<std::endl;
std::cout<<GridLogMessage << "Done "<< byte /(t1-t0) <<" MB/s " <<std::endl;
std::cout<<GridLogMessage << "Running loop with new code for Nt="<<nt<<std::endl;
t0 = usecond();
sliceInnerProductMesonField(MesonFields,w,v,Tp);
t1 = usecond();
std::cout<<GridLogMessage << "Done "<< (t1-t0) <<" usecond " <<std::endl;
std::cout<<GridLogMessage << "Done "<< flops/(t1-t0) <<" mflops " <<std::endl;
std::cout<<GridLogMessage << "Done "<< byte /(t1-t0) <<" MB/s " <<std::endl;
std::cout<<GridLogMessage << "Running loop with Four gammas code for Nt="<<nt<<std::endl;
flops = vol * (11.0 * 8.0 + 6.0) * Nm*Nm*4;
byte = vol * (12.0 * sizeof(Complex) ) * Nm*Nm
+ vol * ( 2.0 * sizeof(Complex) ) * Nm*Nm* 4;
t0 = usecond();
sliceInnerProductMesonFieldGamma(MesonFields4,w,v,Tp,Gmu4);
t1 = usecond();
std::cout<<GridLogMessage << "Done "<< (t1-t0) <<" usecond " <<std::endl;
std::cout<<GridLogMessage << "Done "<< flops/(t1-t0) <<" mflops " <<std::endl;
std::cout<<GridLogMessage << "Done "<< byte /(t1-t0) <<" MB/s " <<std::endl;
std::cout<<GridLogMessage << "Running loop with Sixteen gammas code for Nt="<<nt<<std::endl;
flops = vol * (11.0 * 8.0 + 6.0) * Nm*Nm*16;
byte = vol * (12.0 * sizeof(Complex) ) * Nm*Nm
+ vol * ( 2.0 * sizeof(Complex) ) * Nm*Nm* 16;
t0 = usecond();
sliceInnerProductMesonFieldGamma(MesonFields16,w,v,Tp,Gmu16);
t1 = usecond();
std::cout<<GridLogMessage << "Done "<< (t1-t0) <<" usecond " <<std::endl;
std::cout<<GridLogMessage << "Done "<< flops/(t1-t0) <<" mflops " <<std::endl;
std::cout<<GridLogMessage << "Done "<< byte /(t1-t0) <<" MB/s " <<std::endl;
std::cout<<GridLogMessage << "Running loop with Sixteen gammas code1 for Nt="<<nt<<std::endl;
flops = vol * ( 2 * 8.0 + 6.0) * Nm*Nm*16;
byte = vol * (12.0 * sizeof(Complex) ) * Nm*Nm
+ vol * ( 2.0 * sizeof(Complex) ) * Nm*Nm* 16;
t0 = usecond();
sliceInnerProductMesonFieldGamma1(MesonFields161, w, v, Tp, Gmu16);
t1 = usecond();
std::cout<<GridLogMessage << "Done "<< (t1-t0) <<" usecond " <<std::endl;
std::cout<<GridLogMessage << "Done "<< flops/(t1-t0) <<" mflops " <<std::endl;
std::cout<<GridLogMessage << "Done "<< byte /(t1-t0) <<" MB/s " <<std::endl;
std::cout<<GridLogMessage << "Running loop with Sixteen gammas "<<Nmom<<" momenta "<<std::endl;
flops = vol * ( 2 * 8.0 + 6.0 + 8.0*Nmom) * Nm*Nm*16;
byte = vol * (12.0 * sizeof(Complex) ) * Nm*Nm
+ vol * ( 2.0 * sizeof(Complex) *Nmom ) * Nm*Nm* 16;
t0 = usecond();
sliceInnerProductMesonFieldGammaMom(MesonFields16mom,w,v,Tp,Gmu16,phases);
t1 = usecond();
std::cout<<GridLogMessage << "Done "<< (t1-t0) <<" usecond " <<std::endl;
std::cout<<GridLogMessage << "Done "<< flops/(t1-t0) <<" mflops " <<std::endl;
std::cout<<GridLogMessage << "Done "<< byte /(t1-t0) <<" MB/s " <<std::endl;
RealD err = 0;
RealD err2 = 0;
ComplexD diff;
ComplexD diff2;
for(int i=0;i<Nm;i++) {
for(int j=0;j<Nm;j++) {
for(int t=0;t<nt;t++){
diff = MesonFields[i+Nm*j][t] - MesonFieldsRef[i+Nm*j][t];
err += real(diff*conj(diff));
}
}}
std::cout<<GridLogMessage << "Norm error "<< err <<std::endl;
err = err*0.;
diff = diff*0.;
for (int mu = 0; mu < 16; mu++){
for (int k = 0; k < gammaV.size(); k++){
gammaV[k] = Gamma(Gmu16[mu]) * v[k];
}
for (int i = 0; i < Nm; i++){
for (int j = 0; j < Nm; j++){
sliceInnerProductVector(ip, w[i], gammaV[j], Tp);
for (int t = 0; t < nt; t++){
MesonFields[i + j * Nm][t] = ip[t];
diff = MesonFields16[mu+i*16+Nm*16*j][t] - MesonFields161[mu+i*16+Nm*16*j][t];
diff2 = MesonFields[i+j*Nm][t] - MesonFields161[mu+i*16+Nm*16*j][t];
err += real(diff*conj(diff));
err2 += real(diff2*conj(diff2));
}
}
}
}
std::cout << GridLogMessage << "Norm error 16 gamma1/16 gamma naive " << err << std::endl;
std::cout << GridLogMessage << "Norm error 16 gamma1/sliceInnerProduct " << err2 << std::endl;
Grid_finalize();
}

View File

@@ -0,0 +1,222 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: ./benchmarks/Benchmark_dwf.cc
Copyright (C) 2015
Author: Peter Boyle <paboyle@ph.ed.ac.uk>
Author: paboyle <paboyle@ph.ed.ac.uk>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Grid.h>
using namespace std;
using namespace Grid;
using namespace Grid::QCD;
int main (int argc, char ** argv)
{
Grid_init(&argc,&argv);
int threads = GridThread::GetThreads();
std::cout<<GridLogMessage << "Grid is setup to use "<<threads<<" threads"<<std::endl;
std::vector<int> latt4 = GridDefaultLatt();
const int Ls=16;
GridCartesian * UGrid = SpaceTimeGrid::makeFourDimGrid(GridDefaultLatt(), GridDefaultSimd(Nd,vComplex::Nsimd()),GridDefaultMpi());
GridRedBlackCartesian * UrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(UGrid);
GridCartesian * FGrid = SpaceTimeGrid::makeFiveDimGrid(Ls,UGrid);
GridRedBlackCartesian * FrbGrid = SpaceTimeGrid::makeFiveDimRedBlackGrid(Ls,UGrid);
std::cout << GridLogMessage << "Making Vec5d innermost grids"<<std::endl;
GridCartesian * sUGrid = SpaceTimeGrid::makeFourDimDWFGrid(GridDefaultLatt(),GridDefaultMpi());
GridRedBlackCartesian * sUrbGrid = SpaceTimeGrid::makeFourDimRedBlackGrid(sUGrid);
GridCartesian * sFGrid = SpaceTimeGrid::makeFiveDimDWFGrid(Ls,UGrid);
GridRedBlackCartesian * sFrbGrid = SpaceTimeGrid::makeFiveDimDWFRedBlackGrid(Ls,UGrid);
std::vector<int> seeds4({1,2,3,4});
std::vector<int> seeds5({5,6,7,8});
GridParallelRNG RNG4(UGrid); RNG4.SeedFixedIntegers(seeds4);
std::cout << GridLogMessage << "Seeded"<<std::endl;
LatticeGaugeField Umu(UGrid); SU3::HotConfiguration(RNG4,Umu);
std::cout << GridLogMessage << "made random gauge fields"<<std::endl;
RealD mass=0.1;
RealD M5 =1.8;
RealD NP = UGrid->_Nprocessors;
if (1)
{
const int ncall=1000;
std::cout << GridLogMessage<< "*********************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Benchmarking DomainWallFermionR::Dhop "<<std::endl;
std::cout << GridLogMessage<< "*********************************************************" <<std::endl;
GridParallelRNG RNG5(FGrid);
LatticeFermion src(FGrid); random(RNG5,src);
LatticeFermion result(FGrid);
DomainWallFermionR Dw(Umu,*FGrid,*FrbGrid,*UGrid,*UrbGrid,mass,M5);
double t0,t1;
LatticeFermion r_eo(FGrid);
LatticeFermion src_e (FrbGrid);
LatticeFermion src_o (FrbGrid);
LatticeFermion r_e (FrbGrid);
LatticeFermion r_o (FrbGrid);
pickCheckerboard(Even,src_e,src);
pickCheckerboard(Odd,src_o,src);
setCheckerboard(r_eo,src_o);
setCheckerboard(r_eo,src_e);
r_e = zero;
r_o = zero;
#define BENCH_DW(A,in,out) \
Dw.CayleyZeroCounters(); \
Dw. A (in,out); \
FGrid->Barrier(); \
t0=usecond(); \
for(int i=0;i<ncall;i++){ \
Dw. A (in,out); \
} \
t1=usecond(); \
FGrid->Barrier(); \
Dw.CayleyReport(); \
std::cout<<GridLogMessage << "Called " #A " "<< (t1-t0)/ncall<<" us"<<std::endl;\
std::cout<<GridLogMessage << "******************"<<std::endl;
#define BENCH_ZDW(A,in,out) \
zDw.CayleyZeroCounters(); \
zDw. A (in,out); \
FGrid->Barrier(); \
t0=usecond(); \
for(int i=0;i<ncall;i++){ \
zDw. A (in,out); \
} \
t1=usecond(); \
FGrid->Barrier(); \
zDw.CayleyReport(); \
std::cout<<GridLogMessage << "Called ZDw " #A " "<< (t1-t0)/ncall<<" us"<<std::endl;\
std::cout<<GridLogMessage << "******************"<<std::endl;
#define BENCH_DW_SSC(A,in,out) \
Dw.CayleyZeroCounters(); \
Dw. A (in,out); \
FGrid->Barrier(); \
t0=usecond(); \
for(int i=0;i<ncall;i++){ \
__SSC_START ; \
Dw. A (in,out); \
__SSC_STOP ; \
} \
t1=usecond(); \
FGrid->Barrier(); \
Dw.CayleyReport(); \
std::cout<<GridLogMessage << "Called " #A " "<< (t1-t0)/ncall<<" us"<<std::endl;\
std::cout<<GridLogMessage << "******************"<<std::endl;
#define BENCH_DW_MEO(A,in,out) \
Dw.CayleyZeroCounters(); \
Dw. A (in,out,0); \
FGrid->Barrier(); \
t0=usecond(); \
for(int i=0;i<ncall;i++){ \
Dw. A (in,out,0); \
} \
t1=usecond(); \
FGrid->Barrier(); \
Dw.CayleyReport(); \
std::cout<<GridLogMessage << "Called " #A " "<< (t1-t0)/ncall<<" us"<<std::endl;\
std::cout<<GridLogMessage << "******************"<<std::endl;
BENCH_DW_MEO(Dhop ,src,result);
BENCH_DW_MEO(DhopEO ,src_o,r_e);
BENCH_DW(Meooe ,src_o,r_e);
BENCH_DW(Mooee ,src_o,r_o);
BENCH_DW(MooeeInv,src_o,r_o);
}
if (1)
{
const int ncall=1000;
std::cout << GridLogMessage<< "*********************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Benchmarking DomainWallFermionVec5dR::Dhop "<<std::endl;
std::cout << GridLogMessage<< "*********************************************************" <<std::endl;
GridParallelRNG RNG5(sFGrid);
LatticeFermion src(sFGrid); random(RNG5,src);
LatticeFermion sref(sFGrid);
LatticeFermion result(sFGrid);
std::cout<<GridLogMessage << "Constructing Vec5D Dw "<<std::endl;
DomainWallFermionVec5dR Dw(Umu,*sFGrid,*sFrbGrid,*sUGrid,*sUrbGrid,mass,M5);
RealD b=1.5;// Scale factor b+c=2, b-c=1
RealD c=0.5;
std::vector<ComplexD> gamma(Ls,std::complex<double>(1.0,0.0));
ZMobiusFermionVec5dR zDw(Umu,*sFGrid,*sFrbGrid,*sUGrid,*sUrbGrid,mass,M5,gamma,b,c);
std::cout<<GridLogMessage << "Calling Dhop "<<std::endl;
FGrid->Barrier();
double t0,t1;
LatticeFermion r_eo(sFGrid);
LatticeFermion src_e (sFrbGrid);
LatticeFermion src_o (sFrbGrid);
LatticeFermion r_e (sFrbGrid);
LatticeFermion r_o (sFrbGrid);
pickCheckerboard(Even,src_e,src);
pickCheckerboard(Odd,src_o,src);
setCheckerboard(r_eo,src_o);
setCheckerboard(r_eo,src_e);
r_e = zero;
r_o = zero;
BENCH_DW_MEO(Dhop ,src,result);
BENCH_DW_MEO(DhopEO ,src_o,r_e);
BENCH_DW_SSC(Meooe ,src_o,r_e);
BENCH_DW(Mooee ,src_o,r_o);
BENCH_DW(MooeeInv,src_o,r_o);
BENCH_ZDW(Mooee ,src_o,r_o);
BENCH_ZDW(MooeeInv,src_o,r_o);
}
Grid_finalize();
}

View File

@@ -0,0 +1,134 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: ./benchmarks/Benchmark_staggered.cc
Copyright (C) 2015
Author: Peter Boyle <paboyle@ph.ed.ac.uk>
Author: paboyle <paboyle@ph.ed.ac.uk>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Grid.h>
using namespace std;
using namespace Grid;
using namespace Grid::QCD;
int main (int argc, char ** argv)
{
Grid_init(&argc,&argv);
std::vector<int> latt_size = GridDefaultLatt();
std::vector<int> simd_layout = GridDefaultSimd(Nd,vComplex::Nsimd());
std::vector<int> mpi_layout = GridDefaultMpi();
GridCartesian Grid(latt_size,simd_layout,mpi_layout);
GridRedBlackCartesian RBGrid(&Grid);
int threads = GridThread::GetThreads();
std::cout<<GridLogMessage << "Grid is setup to use "<<threads<<" threads"<<std::endl;
std::cout<<GridLogMessage << "Grid floating point word size is REALF"<< sizeof(RealF)<<std::endl;
std::cout<<GridLogMessage << "Grid floating point word size is REALD"<< sizeof(RealD)<<std::endl;
std::cout<<GridLogMessage << "Grid floating point word size is REAL"<< sizeof(Real)<<std::endl;
std::vector<int> seeds({1,2,3,4});
GridParallelRNG pRNG(&Grid);
pRNG.SeedFixedIntegers(seeds);
// pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9});
typedef typename ImprovedStaggeredFermionR::FermionField FermionField;
typename ImprovedStaggeredFermionR::ImplParams params;
FermionField src (&Grid); random(pRNG,src);
FermionField result(&Grid); result=zero;
FermionField ref(&Grid); ref=zero;
FermionField tmp(&Grid); tmp=zero;
FermionField err(&Grid); tmp=zero;
LatticeGaugeField Umu(&Grid); random(pRNG,Umu);
std::vector<LatticeColourMatrix> U(4,&Grid);
double volume=1;
for(int mu=0;mu<Nd;mu++){
volume=volume*latt_size[mu];
}
// Only one non-zero (y)
#if 0
Umu=zero;
Complex cone(1.0,0.0);
for(int nn=0;nn<Nd;nn++){
random(pRNG,U[nn]);
if(1) {
if (nn!=2) { U[nn]=zero; std::cout<<GridLogMessage << "zeroing gauge field in dir "<<nn<<std::endl; }
// else { U[nn]= cone;std::cout<<GridLogMessage << "unit gauge field in dir "<<nn<<std::endl; }
else { std::cout<<GridLogMessage << "random gauge field in dir "<<nn<<std::endl; }
}
PokeIndex<LorentzIndex>(Umu,U[nn],nn);
}
#endif
for(int mu=0;mu<Nd;mu++){
U[mu] = PeekIndex<LorentzIndex>(Umu,mu);
}
ref = zero;
/*
{ // Naive wilson implementation
ref = zero;
for(int mu=0;mu<Nd;mu++){
// ref = src + Gamma(Gamma::GammaX)* src ; // 1-gamma_x
tmp = U[mu]*Cshift(src,mu,1);
for(int i=0;i<ref._odata.size();i++){
ref._odata[i]+= tmp._odata[i] - Gamma(Gmu[mu])*tmp._odata[i]; ;
}
tmp =adj(U[mu])*src;
tmp =Cshift(tmp,mu,-1);
for(int i=0;i<ref._odata.size();i++){
ref._odata[i]+= tmp._odata[i] + Gamma(Gmu[mu])*tmp._odata[i]; ;
}
}
}
ref = -0.5*ref;
*/
RealD mass=0.1;
RealD c1=9.0/8.0;
RealD c2=-1.0/24.0;
RealD u0=1.0;
ImprovedStaggeredFermionR Ds(Umu,Umu,Grid,RBGrid,mass,c1,c2,u0,params);
std::cout<<GridLogMessage << "Calling Ds"<<std::endl;
int ncall=1000;
double t0=usecond();
for(int i=0;i<ncall;i++){
Ds.Dhop(src,result,0);
}
double t1=usecond();
double flops=(16*(3*(6+8+8)) + 15*3*2)*volume*ncall; // == 66*16 + == 1146
std::cout<<GridLogMessage << "Called Ds"<<std::endl;
std::cout<<GridLogMessage << "norm result "<< norm2(result)<<std::endl;
std::cout<<GridLogMessage << "norm ref "<< norm2(ref)<<std::endl;
std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl;
err = ref-result;
std::cout<<GridLogMessage << "norm diff "<< norm2(err)<<std::endl;
Grid_finalize();
}

View File

@@ -35,13 +35,16 @@ using namespace Grid::QCD;
int main (int argc, char ** argv) int main (int argc, char ** argv)
{ {
Grid_init(&argc,&argv); Grid_init(&argc,&argv);
#define LMAX (32)
#define LMIN (16)
#define LINC (4)
int Nloop=1000; int64_t Nloop=2000;
std::vector<int> simd_layout = GridDefaultSimd(Nd,vComplex::Nsimd()); std::vector<int> simd_layout = GridDefaultSimd(Nd,vComplex::Nsimd());
std::vector<int> mpi_layout = GridDefaultMpi(); std::vector<int> mpi_layout = GridDefaultMpi();
int threads = GridThread::GetThreads(); int64_t threads = GridThread::GetThreads();
std::cout<<GridLogMessage << "Grid is setup to use "<<threads<<" threads"<<std::endl; std::cout<<GridLogMessage << "Grid is setup to use "<<threads<<" threads"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl; std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
@@ -50,19 +53,19 @@ int main (int argc, char ** argv)
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl; std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl; std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
for(int lat=2;lat<=32;lat+=2){ for(int lat=LMIN;lat<=LMAX;lat+=LINC){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]}); std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3]; int64_t vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
// GridParallelRNG pRNG(&Grid); pRNG.SeedRandomDevice(); GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeColourMatrix z(&Grid);// random(pRNG,z); LatticeColourMatrix z(&Grid); random(pRNG,z);
LatticeColourMatrix x(&Grid);// random(pRNG,x); LatticeColourMatrix x(&Grid); random(pRNG,x);
LatticeColourMatrix y(&Grid);// random(pRNG,y); LatticeColourMatrix y(&Grid); random(pRNG,y);
double start=usecond(); double start=usecond();
for(int i=0;i<Nloop;i++){ for(int64_t i=0;i<Nloop;i++){
x=x*y; x=x*y;
} }
double stop=usecond(); double stop=usecond();
@@ -82,20 +85,20 @@ int main (int argc, char ** argv)
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl; std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl; std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
for(int lat=2;lat<=32;lat+=2){ for(int lat=LMIN;lat<=LMAX;lat+=LINC){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]}); std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3]; int64_t vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
// GridParallelRNG pRNG(&Grid); pRNG.SeedRandomDevice(); GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeColourMatrix z(&Grid); //random(pRNG,z); LatticeColourMatrix z(&Grid); random(pRNG,z);
LatticeColourMatrix x(&Grid); //random(pRNG,x); LatticeColourMatrix x(&Grid); random(pRNG,x);
LatticeColourMatrix y(&Grid); //random(pRNG,y); LatticeColourMatrix y(&Grid); random(pRNG,y);
double start=usecond(); double start=usecond();
for(int i=0;i<Nloop;i++){ for(int64_t i=0;i<Nloop;i++){
z=x*y; z=x*y;
} }
double stop=usecond(); double stop=usecond();
@@ -113,20 +116,20 @@ int main (int argc, char ** argv)
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl; std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl; std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
for(int lat=2;lat<=32;lat+=2){ for(int lat=LMIN;lat<=LMAX;lat+=LINC){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]}); std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3]; int64_t vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
// GridParallelRNG pRNG(&Grid); pRNG.SeedRandomDevice(); GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeColourMatrix z(&Grid); //random(pRNG,z); LatticeColourMatrix z(&Grid); random(pRNG,z);
LatticeColourMatrix x(&Grid); //random(pRNG,x); LatticeColourMatrix x(&Grid); random(pRNG,x);
LatticeColourMatrix y(&Grid); //random(pRNG,y); LatticeColourMatrix y(&Grid); random(pRNG,y);
double start=usecond(); double start=usecond();
for(int i=0;i<Nloop;i++){ for(int64_t i=0;i<Nloop;i++){
mult(z,x,y); mult(z,x,y);
} }
double stop=usecond(); double stop=usecond();
@@ -144,30 +147,107 @@ int main (int argc, char ** argv)
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl; std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl; std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
for(int lat=2;lat<=32;lat+=2){ for(int lat=LMIN;lat<=LMAX;lat+=LINC){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int64_t vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
GridCartesian Grid(latt_size,simd_layout,mpi_layout);
GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeColourMatrix z(&Grid); random(pRNG,z);
LatticeColourMatrix x(&Grid); random(pRNG,x);
LatticeColourMatrix y(&Grid); random(pRNG,y);
double start=usecond();
for(int64_t i=0;i<Nloop;i++){
mac(z,x,y);
}
double stop=usecond();
double time = (stop-start)/Nloop*1000.0;
double bytes=3*vol*Nc*Nc*sizeof(Complex);
double flops=Nc*Nc*(6+8+8)*vol;
std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t" << flops/time<<std::endl;
}
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Benchmarking SU3xSU3 CovShiftForward(z,x,y)"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
for(int lat=LMIN;lat<=LMAX;lat+=LINC){
std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]}); std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
int vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3]; int64_t vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
// GridParallelRNG pRNG(&Grid); pRNG.SeedRandomDevice(); GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeColourMatrix z(&Grid); //random(pRNG,z); LatticeColourMatrix z(&Grid); random(pRNG,z);
LatticeColourMatrix x(&Grid); //random(pRNG,x); LatticeColourMatrix x(&Grid); random(pRNG,x);
LatticeColourMatrix y(&Grid); //random(pRNG,y); LatticeColourMatrix y(&Grid); random(pRNG,y);
double start=usecond(); for(int mu=0;mu<4;mu++){
for(int i=0;i<Nloop;i++){ double start=usecond();
mac(z,x,y); for(int64_t i=0;i<Nloop;i++){
z = PeriodicBC::CovShiftForward(x,mu,y);
}
double stop=usecond();
double time = (stop-start)/Nloop*1000.0;
double bytes=3*vol*Nc*Nc*sizeof(Complex);
double flops=Nc*Nc*(6+8+8)*vol;
std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t" << flops/time<<std::endl;
} }
double stop=usecond(); }
double time = (stop-start)/Nloop*1000.0; #if 1
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << "= Benchmarking SU3xSU3 z= x * Cshift(y)"<<std::endl;
std::cout<<GridLogMessage << "===================================================================================================="<<std::endl;
std::cout<<GridLogMessage << " L "<<"\t\t"<<"bytes"<<"\t\t\t"<<"GB/s\t\t GFlop/s"<<std::endl;
std::cout<<GridLogMessage << "----------------------------------------------------------"<<std::endl;
double bytes=3*vol*Nc*Nc*sizeof(Complex); for(int lat=LMIN;lat<=LMAX;lat+=LINC){
double flops=Nc*Nc*(8+8+8)*vol; std::vector<int> latt_size ({lat*mpi_layout[0],lat*mpi_layout[1],lat*mpi_layout[2],lat*mpi_layout[3]});
std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t" << flops/time<<std::endl; int64_t vol = latt_size[0]*latt_size[1]*latt_size[2]*latt_size[3];
GridCartesian Grid(latt_size,simd_layout,mpi_layout);
GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9}));
LatticeColourMatrix z(&Grid); random(pRNG,z);
LatticeColourMatrix x(&Grid); random(pRNG,x);
LatticeColourMatrix y(&Grid); random(pRNG,y);
LatticeColourMatrix tmp(&Grid);
for(int mu=0;mu<4;mu++){
double tshift=0;
double tmult =0;
double start=usecond();
for(int64_t i=0;i<Nloop;i++){
tshift-=usecond();
tmp = Cshift(y,mu,-1);
tshift+=usecond();
tmult-=usecond();
z = x*tmp;
tmult+=usecond();
}
double stop=usecond();
double time = (stop-start)/Nloop;
tshift = tshift/Nloop;
tmult = tmult /Nloop;
double bytes=3*vol*Nc*Nc*sizeof(Complex);
double flops=Nc*Nc*(6+8+8)*vol;
std::cout<<GridLogMessage<<std::setprecision(3) << "total us "<<time<<" shift "<<tshift <<" mult "<<tmult<<std::endl;
time = time * 1000; // convert to NS for GB/s
std::cout<<GridLogMessage<<std::setprecision(3) << lat<<"\t\t"<<bytes<<" \t\t"<<bytes/time<<"\t\t" << flops/time<<std::endl;
}
} }
#endif
Grid_finalize(); Grid_finalize();
} }

View File

@@ -4,7 +4,7 @@
Source file: ./benchmarks/Benchmark_wilson.cc Source file: ./benchmarks/Benchmark_wilson.cc
Copyright (C) 2015 Copyright (C) 2018
Author: Peter Boyle <paboyle@ph.ed.ac.uk> Author: Peter Boyle <paboyle@ph.ed.ac.uk>
Author: paboyle <paboyle@ph.ed.ac.uk> Author: paboyle <paboyle@ph.ed.ac.uk>
@@ -32,19 +32,23 @@ using namespace std;
using namespace Grid; using namespace Grid;
using namespace Grid::QCD; using namespace Grid::QCD;
#include "Grid/util/Profiling.h"
template<class d> template<class d>
struct scal { struct scal {
d internal; d internal;
}; };
Gamma::GammaMatrix Gmu [] = { Gamma::Algebra Gmu [] = {
Gamma::GammaX, Gamma::Algebra::GammaX,
Gamma::GammaY, Gamma::Algebra::GammaY,
Gamma::GammaZ, Gamma::Algebra::GammaZ,
Gamma::GammaT Gamma::Algebra::GammaT
}; };
bool overlapComms = false; bool overlapComms = false;
bool perfProfiling = false;
int main (int argc, char ** argv) int main (int argc, char ** argv)
{ {
@@ -53,23 +57,34 @@ int main (int argc, char ** argv)
if( GridCmdOptionExists(argv,argv+argc,"--asynch") ){ if( GridCmdOptionExists(argv,argv+argc,"--asynch") ){
overlapComms = true; overlapComms = true;
} }
if( GridCmdOptionExists(argv,argv+argc,"--perf") ){
perfProfiling = true;
}
long unsigned int single_site_flops = 8*QCD::Nc*(7+16*QCD::Nc);
std::vector<int> latt_size = GridDefaultLatt(); std::vector<int> latt_size = GridDefaultLatt();
std::vector<int> simd_layout = GridDefaultSimd(Nd,vComplex::Nsimd()); std::vector<int> simd_layout = GridDefaultSimd(Nd,vComplex::Nsimd());
std::vector<int> mpi_layout = GridDefaultMpi(); std::vector<int> mpi_layout = GridDefaultMpi();
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
GridRedBlackCartesian RBGrid(latt_size,simd_layout,mpi_layout); GridRedBlackCartesian RBGrid(&Grid);
int threads = GridThread::GetThreads(); int threads = GridThread::GetThreads();
std::cout<<GridLogMessage << "Grid is setup to use "<<threads<<" threads"<<std::endl;
GridLogLayout();
std::cout<<GridLogMessage << "Grid floating point word size is REALF"<< sizeof(RealF)<<std::endl; std::cout<<GridLogMessage << "Grid floating point word size is REALF"<< sizeof(RealF)<<std::endl;
std::cout<<GridLogMessage << "Grid floating point word size is REALD"<< sizeof(RealD)<<std::endl; std::cout<<GridLogMessage << "Grid floating point word size is REALD"<< sizeof(RealD)<<std::endl;
std::cout<<GridLogMessage << "Grid floating point word size is REAL"<< sizeof(Real)<<std::endl; std::cout<<GridLogMessage << "Grid floating point word size is REAL"<< sizeof(Real)<<std::endl;
std::cout<<GridLogMessage << "Grid number of colours : "<< QCD::Nc <<std::endl;
std::cout<<GridLogMessage << "Benchmarking Wilson operator in the fundamental representation" << std::endl;
std::vector<int> seeds({1,2,3,4}); std::vector<int> seeds({1,2,3,4});
GridParallelRNG pRNG(&Grid); GridParallelRNG pRNG(&Grid);
pRNG.SeedFixedIntegers(seeds); pRNG.SeedFixedIntegers(seeds);
// pRNG.SeedRandomDevice(); // pRNG.SeedFixedIntegers(std::vector<int>({45,12,81,9});
LatticeFermion src (&Grid); random(pRNG,src); LatticeFermion src (&Grid); random(pRNG,src);
LatticeFermion result(&Grid); result=zero; LatticeFermion result(&Grid); result=zero;
@@ -106,7 +121,7 @@ int main (int argc, char ** argv)
{ // Naive wilson implementation { // Naive wilson implementation
ref = zero; ref = zero;
for(int mu=0;mu<Nd;mu++){ for(int mu=0;mu<Nd;mu++){
// ref = src + Gamma(Gamma::GammaX)* src ; // 1-gamma_x // ref = src + Gamma(Gamma::Algebra::GammaX)* src ; // 1-gamma_x
tmp = U[mu]*Cshift(src,mu,1); tmp = U[mu]*Cshift(src,mu,1);
for(int i=0;i<ref._odata.size();i++){ for(int i=0;i<ref._odata.size();i++){
ref._odata[i]+= tmp._odata[i] - Gamma(Gmu[mu])*tmp._odata[i]; ; ref._odata[i]+= tmp._odata[i] - Gamma(Gmu[mu])*tmp._odata[i]; ;
@@ -134,9 +149,25 @@ int main (int argc, char ** argv)
Dw.Dhop(src,result,0); Dw.Dhop(src,result,0);
} }
double t1=usecond(); double t1=usecond();
double flops=1344*volume*ncall; double flops=single_site_flops*volume*ncall;
if (perfProfiling){
std::cout<<GridLogMessage << "Profiling Dw with perf"<<std::endl;
System::profile("kernel", [&]() {
for(int i=0;i<ncall;i++){
Dw.Dhop(src,result,0);
}
});
std::cout<<GridLogMessage << "Generated kernel.data"<<std::endl;
std::cout<<GridLogMessage << "Use with: perf report -i kernel.data"<<std::endl;
}
std::cout<<GridLogMessage << "Called Dw"<<std::endl; std::cout<<GridLogMessage << "Called Dw"<<std::endl;
std::cout<<GridLogMessage << "flops per site " << single_site_flops << std::endl;
std::cout<<GridLogMessage << "norm result "<< norm2(result)<<std::endl; std::cout<<GridLogMessage << "norm result "<< norm2(result)<<std::endl;
std::cout<<GridLogMessage << "norm ref "<< norm2(ref)<<std::endl; std::cout<<GridLogMessage << "norm ref "<< norm2(ref)<<std::endl;
std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl; std::cout<<GridLogMessage << "mflop/s = "<< flops/(t1-t0)<<std::endl;
@@ -159,7 +190,7 @@ int main (int argc, char ** argv)
ref = zero; ref = zero;
for(int mu=0;mu<Nd;mu++){ for(int mu=0;mu<Nd;mu++){
// ref = src - Gamma(Gamma::GammaX)* src ; // 1+gamma_x // ref = src - Gamma(Gamma::Algebra::GammaX)* src ; // 1+gamma_x
tmp = U[mu]*Cshift(src,mu,1); tmp = U[mu]*Cshift(src,mu,1);
for(int i=0;i<ref._odata.size();i++){ for(int i=0;i<ref._odata.size();i++){
ref._odata[i]+= tmp._odata[i] + Gamma(Gmu[mu])*tmp._odata[i]; ; ref._odata[i]+= tmp._odata[i] + Gamma(Gmu[mu])*tmp._odata[i]; ;

View File

@@ -30,11 +30,11 @@ struct scal {
d internal; d internal;
}; };
Gamma::GammaMatrix Gmu [] = { Gamma::Algebra Gmu [] = {
Gamma::GammaX, Gamma::Algebra::GammaX,
Gamma::GammaY, Gamma::Algebra::GammaY,
Gamma::GammaZ, Gamma::Algebra::GammaZ,
Gamma::GammaT Gamma::Algebra::GammaT
}; };
bool overlapComms = false; bool overlapComms = false;
@@ -62,6 +62,7 @@ int main (int argc, char ** argv)
std::cout << GridLogMessage<< "* Kernel options --dslash-generic, --dslash-unroll, --dslash-asm" <<std::endl; std::cout << GridLogMessage<< "* Kernel options --dslash-generic, --dslash-unroll, --dslash-asm" <<std::endl;
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl; std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl; std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
std::cout << GridLogMessage<< "* Number of colours "<< QCD::Nc <<std::endl;
std::cout << GridLogMessage<< "* Benchmarking WilsonFermionR::Dhop "<<std::endl; std::cout << GridLogMessage<< "* Benchmarking WilsonFermionR::Dhop "<<std::endl;
std::cout << GridLogMessage<< "* Vectorising space-time by "<<vComplex::Nsimd()<<std::endl; std::cout << GridLogMessage<< "* Vectorising space-time by "<<vComplex::Nsimd()<<std::endl;
if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl; if ( sizeof(Real)==4 ) std::cout << GridLogMessage<< "* SINGLE precision "<<std::endl;
@@ -69,13 +70,15 @@ int main (int argc, char ** argv)
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptGeneric ) std::cout << GridLogMessage<< "* Using GENERIC Nc WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptHandUnroll) std::cout << GridLogMessage<< "* Using Nc=3 WilsonKernels" <<std::endl;
if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl; if ( WilsonKernelsStatic::Opt == WilsonKernelsStatic::OptInlineAsm ) std::cout << GridLogMessage<< "* Using Asm Nc=3 WilsonKernels" <<std::endl;
std::cout << GridLogMessage << "* OpenMP threads : "<< GridThread::GetThreads() <<std::endl;
std::cout << GridLogMessage << "* MPI tasks : "<< GridCmdVectorIntToString(mpi_layout) << std::endl;
std::cout << GridLogMessage<< "*****************************************************************" <<std::endl; std::cout << GridLogMessage<< "*****************************************************************" <<std::endl;
std::cout<<GridLogMessage << "============================================================================="<< std::endl; std::cout<<GridLogMessage << "================================================================================================="<< std::endl;
std::cout<<GridLogMessage << "= Benchmarking Wilson" << std::endl; std::cout<<GridLogMessage << "= Benchmarking Wilson operator in the fundamental representation" << std::endl;
std::cout<<GridLogMessage << "============================================================================="<< std::endl; std::cout<<GridLogMessage << "================================================================================================="<< std::endl;
std::cout<<GridLogMessage << "Volume\t\t\tWilson/MFLOPs\tWilsonDag/MFLOPs" << std::endl; std::cout<<GridLogMessage << "Volume\t\t\tWilson/MFLOPs\tWilsonDag/MFLOPs\tWilsonEO/MFLOPs\tWilsonDagEO/MFLOPs" << std::endl;
std::cout<<GridLogMessage << "============================================================================="<< std::endl; std::cout<<GridLogMessage << "================================================================================================="<< std::endl;
int Lmax = 32; int Lmax = 32;
int dmin = 0; int dmin = 0;
@@ -93,17 +96,24 @@ int main (int argc, char ** argv)
std::cout << latt_size.back() << "\t\t"; std::cout << latt_size.back() << "\t\t";
GridCartesian Grid(latt_size,simd_layout,mpi_layout); GridCartesian Grid(latt_size,simd_layout,mpi_layout);
GridRedBlackCartesian RBGrid(latt_size,simd_layout,mpi_layout); GridRedBlackCartesian RBGrid(&Grid);
GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(seeds); GridParallelRNG pRNG(&Grid); pRNG.SeedFixedIntegers(seeds);
LatticeGaugeField Umu(&Grid); random(pRNG,Umu); LatticeGaugeField Umu(&Grid); random(pRNG,Umu);
LatticeFermion src(&Grid); random(pRNG,src); LatticeFermion src(&Grid); random(pRNG,src);
LatticeFermion result(&Grid); result=zero; LatticeFermion src_o(&RBGrid); pickCheckerboard(Odd,src_o,src);
LatticeFermion result(&Grid); result=zero;
LatticeFermion result_e(&RBGrid); result_e=zero;
double volume = std::accumulate(latt_size.begin(),latt_size.end(),1,std::multiplies<int>()); double volume = std::accumulate(latt_size.begin(),latt_size.end(),1,std::multiplies<int>());
WilsonFermionR Dw(Umu,Grid,RBGrid,mass,params); WilsonFermionR Dw(Umu,Grid,RBGrid,mass,params);
// Full operator
bench_wilson(src,result,Dw,volume,DaggerNo);
bench_wilson(src,result,Dw,volume,DaggerYes);
std::cout << "\t";
// EO
bench_wilson(src,result,Dw,volume,DaggerNo); bench_wilson(src,result,Dw,volume,DaggerNo);
bench_wilson(src,result,Dw,volume,DaggerYes); bench_wilson(src,result,Dw,volume,DaggerYes);
std::cout << std::endl; std::cout << std::endl;
@@ -122,9 +132,26 @@ void bench_wilson (
int const dag ) int const dag )
{ {
int ncall = 1000; int ncall = 1000;
long unsigned int single_site_flops = 8*QCD::Nc*(7+16*QCD::Nc);
double t0 = usecond(); double t0 = usecond();
for(int i=0; i<ncall; i++) { Dw.Dhop(src,result,dag); } for(int i=0; i<ncall; i++) { Dw.Dhop(src,result,dag); }
double t1 = usecond(); double t1 = usecond();
double flops = 1344 * volume * ncall; double flops = single_site_flops * volume * ncall;
std::cout << flops/(t1-t0) << "\t\t";
}
void bench_wilson_eo (
LatticeFermion & src,
LatticeFermion & result,
WilsonFermionR & Dw,
double const volume,
int const dag )
{
int ncall = 1000;
long unsigned int single_site_flops = 8*QCD::Nc*(7+16*QCD::Nc);
double t0 = usecond();
for(int i=0; i<ncall; i++) { Dw.DhopEO(src,result,dag); }
double t1 = usecond();
double flops = (single_site_flops * volume * ncall)/2.0;
std::cout << flops/(t1-t0) << "\t\t"; std::cout << flops/(t1-t0) << "\t\t";
} }

View File

@@ -1 +1,7 @@
include Make.inc include Make.inc
bench-local: all
./Benchmark_su3
./Benchmark_memory_bandwidth
./Benchmark_wilson
./Benchmark_dwf --dslash-unroll

View File

@@ -0,0 +1,11 @@
#include <Grid/Grid.h>
Grid::vRealD add(const Grid::vRealD &x, const Grid::vRealD &y)
{
return x+y;
}
Grid::vRealD sub(const Grid::vRealD &x, const Grid::vRealD &y)
{
return x-y;
}

View File

@@ -25,7 +25,7 @@ Author: Peter Boyle <paboyle@ph.ed.ac.uk>
See the full license in the file "LICENSE" in the top level distribution directory See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/ *************************************************************************************/
/* END LEGAL */ /* END LEGAL */
#include <Grid.h> #include <Grid/Grid.h>
using namespace std; using namespace std;
using namespace Grid; using namespace Grid;

View File

@@ -25,7 +25,7 @@ Author: Peter Boyle <paboyle@ph.ed.ac.uk>
See the full license in the file "LICENSE" in the top level distribution directory See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/ *************************************************************************************/
/* END LEGAL */ /* END LEGAL */
#include <Grid.h> #include <Grid/Grid.h>
using namespace std; using namespace std;
using namespace Grid; using namespace Grid;

View File

@@ -1,11 +1,9 @@
#!/usr/bin/env bash #!/usr/bin/env bash
EIGEN_URL='http://bitbucket.org/eigen/eigen/get/3.2.9.tar.bz2' EIGEN_URL='http://bitbucket.org/eigen/eigen/get/3.3.5.tar.bz2'
echo "-- deploying Eigen source..." echo "-- deploying Eigen source..."
wget ${EIGEN_URL} --no-check-certificate wget ${EIGEN_URL} --no-check-certificate && ./scripts/update_eigen.sh `basename ${EIGEN_URL}` && rm `basename ${EIGEN_URL}`
./scripts/update_eigen.sh `basename ${EIGEN_URL}`
rm `basename ${EIGEN_URL}`
echo '-- generating Make.inc files...' echo '-- generating Make.inc files...'
./scripts/filelist ./scripts/filelist

View File

@@ -1,16 +1,23 @@
AC_PREREQ([2.63]) AC_PREREQ([2.63])
AC_INIT([Grid], [0.6.0], [https://github.com/paboyle/Grid], [Grid]) AC_INIT([Grid], [0.7.0], [https://github.com/paboyle/Grid], [Grid])
AC_CANONICAL_BUILD AC_CANONICAL_BUILD
AC_CANONICAL_HOST AC_CANONICAL_HOST
AC_CANONICAL_TARGET AC_CANONICAL_TARGET
AM_INIT_AUTOMAKE(subdir-objects) AM_INIT_AUTOMAKE([subdir-objects 1.13])
AM_EXTRA_RECURSIVE_TARGETS([tests bench])
AC_CONFIG_MACRO_DIR([m4]) AC_CONFIG_MACRO_DIR([m4])
AC_CONFIG_SRCDIR([lib/Grid.h]) AC_CONFIG_SRCDIR([lib/Grid.h])
AC_CONFIG_HEADERS([lib/Config.h]) AC_CONFIG_HEADERS([lib/Config.h],[sed -i 's|PACKAGE_|GRID_|' lib/Config.h])
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])]) m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])
############### Checks for programs ################ Get git info
#AC_REVISION([m4_esyscmd_s([./scripts/configure.commit])])
################ Set flags
# do not move!
CXXFLAGS="-O3 $CXXFLAGS" CXXFLAGS="-O3 $CXXFLAGS"
############### Checks for programs
AC_PROG_CXX AC_PROG_CXX
AC_PROG_RANLIB AC_PROG_RANLIB
@@ -24,6 +31,8 @@ AX_GXX_VERSION
AC_DEFINE_UNQUOTED([GXX_VERSION],["$GXX_VERSION"], AC_DEFINE_UNQUOTED([GXX_VERSION],["$GXX_VERSION"],
[version of g++ that will compile the code]) [version of g++ that will compile the code])
############### Checks for typedefs, structures, and compiler characteristics ############### Checks for typedefs, structures, and compiler characteristics
AC_TYPE_SIZE_T AC_TYPE_SIZE_T
AC_TYPE_UINT32_T AC_TYPE_UINT32_T
@@ -45,9 +54,14 @@ AC_CHECK_HEADERS(malloc/malloc.h)
AC_CHECK_HEADERS(malloc.h) AC_CHECK_HEADERS(malloc.h)
AC_CHECK_HEADERS(endian.h) AC_CHECK_HEADERS(endian.h)
AC_CHECK_HEADERS(execinfo.h) AC_CHECK_HEADERS(execinfo.h)
AC_CHECK_HEADERS(numaif.h)
AC_CHECK_DECLS([ntohll],[], [], [[#include <arpa/inet.h>]]) AC_CHECK_DECLS([ntohll],[], [], [[#include <arpa/inet.h>]])
AC_CHECK_DECLS([be64toh],[], [], [[#include <arpa/inet.h>]]) AC_CHECK_DECLS([be64toh],[], [], [[#include <arpa/inet.h>]])
############## Standard libraries
AC_CHECK_LIB([m],[cos])
AC_CHECK_LIB([stdc++],[abort])
############### GMP and MPFR ############### GMP and MPFR
AC_ARG_WITH([gmp], AC_ARG_WITH([gmp],
[AS_HELP_STRING([--with-gmp=prefix], [AS_HELP_STRING([--with-gmp=prefix],
@@ -67,6 +81,13 @@ AC_ARG_WITH([fftw],
[AM_CXXFLAGS="-I$with_fftw/include $AM_CXXFLAGS"] [AM_CXXFLAGS="-I$with_fftw/include $AM_CXXFLAGS"]
[AM_LDFLAGS="-L$with_fftw/lib $AM_LDFLAGS"]) [AM_LDFLAGS="-L$with_fftw/lib $AM_LDFLAGS"])
############### LIME
AC_ARG_WITH([lime],
[AS_HELP_STRING([--with-lime=prefix],
[try this for a non-standard install prefix of the LIME library])],
[AM_CXXFLAGS="-I$with_lime/include $AM_CXXFLAGS"]
[AM_LDFLAGS="-L$with_lime/lib $AM_LDFLAGS"])
############### lapack ############### lapack
AC_ARG_ENABLE([lapack], AC_ARG_ENABLE([lapack],
[AC_HELP_STRING([--enable-lapack=yes|no|prefix], [enable LAPACK])], [AC_HELP_STRING([--enable-lapack=yes|no|prefix], [enable LAPACK])],
@@ -83,6 +104,18 @@ case ${ac_LAPACK} in
AC_DEFINE([USE_LAPACK],[1],[use LAPACK]);; AC_DEFINE([USE_LAPACK],[1],[use LAPACK]);;
esac esac
############### FP16 conversions
AC_ARG_ENABLE([sfw-fp16],
[AC_HELP_STRING([--enable-sfw-fp16=yes|no], [enable software fp16 comms])],
[ac_SFW_FP16=${enable_sfw_fp16}], [ac_SFW_FP16=yes])
case ${ac_SFW_FP16} in
yes)
AC_DEFINE([SFW_FP16],[1],[software conversion to fp16]);;
no);;
*)
AC_MSG_ERROR(["SFW FP16 option not supported ${ac_SFW_FP16}"]);;
esac
############### MKL ############### MKL
AC_ARG_ENABLE([mkl], AC_ARG_ENABLE([mkl],
[AC_HELP_STRING([--enable-mkl=yes|no|prefix], [enable Intel MKL for LAPACK & FFTW])], [AC_HELP_STRING([--enable-mkl=yes|no|prefix], [enable Intel MKL for LAPACK & FFTW])],
@@ -99,6 +132,13 @@ case ${ac_MKL} in
AC_DEFINE([USE_MKL], [1], [Define to 1 if you use the Intel MKL]);; AC_DEFINE([USE_MKL], [1], [Define to 1 if you use the Intel MKL]);;
esac esac
############### HDF5
AC_ARG_WITH([hdf5],
[AS_HELP_STRING([--with-hdf5=prefix],
[try this for a non-standard install prefix of the HDF5 library])],
[AM_CXXFLAGS="-I$with_hdf5/include $AM_CXXFLAGS"]
[AM_LDFLAGS="-L$with_hdf5/lib $AM_LDFLAGS"])
############### first-touch ############### first-touch
AC_ARG_ENABLE([numa], AC_ARG_ENABLE([numa],
[AC_HELP_STRING([--enable-numa=yes|no|prefix], [enable first touch numa opt])], [AC_HELP_STRING([--enable-numa=yes|no|prefix], [enable first touch numa opt])],
@@ -145,43 +185,89 @@ AC_SEARCH_LIBS([fftw_execute], [fftw3],
[AC_DEFINE([HAVE_FFTW], [1], [Define to 1 if you have the `FFTW' library])] [AC_DEFINE([HAVE_FFTW], [1], [Define to 1 if you have the `FFTW' library])]
[have_fftw=true]) [have_fftw=true])
AC_SEARCH_LIBS([limeCreateReader], [lime],
[AC_DEFINE([HAVE_LIME], [1], [Define to 1 if you have the `LIME' library])]
[have_lime=true],
[AC_MSG_WARN(C-LIME library was not found in your system.
In order to use ILGG file format please install or provide the correct path to your installation
Info at: http://usqcd.jlab.org/usqcd-docs/c-lime/)])
AC_SEARCH_LIBS([crc32], [z],
[AC_DEFINE([HAVE_ZLIB], [1], [Define to 1 if you have the `LIBZ' library])]
[have_zlib=true] [LIBS="${LIBS} -lz"],
[AC_MSG_ERROR(zlib library was not found in your system.)])
AC_SEARCH_LIBS([move_pages], [numa],
[AC_DEFINE([HAVE_LIBNUMA], [1], [Define to 1 if you have the `LIBNUMA' library])]
[have_libnuma=true] [LIBS="${LIBS} -lnuma"],
[AC_MSG_WARN(libnuma library was not found in your system. Some optimisations will not apply)])
AC_SEARCH_LIBS([H5Fopen], [hdf5_cpp],
[AC_DEFINE([HAVE_HDF5], [1], [Define to 1 if you have the `HDF5' library])]
[have_hdf5=true]
[LIBS="${LIBS} -lhdf5"], [], [-lhdf5])
AM_CONDITIONAL(BUILD_HDF5, [ test "${have_hdf5}X" == "trueX" ])
CXXFLAGS=$CXXFLAGS_CPY CXXFLAGS=$CXXFLAGS_CPY
LDFLAGS=$LDFLAGS_CPY LDFLAGS=$LDFLAGS_CPY
############### SIMD instruction selection ############### SIMD instruction selection
AC_ARG_ENABLE([simd],[AC_HELP_STRING([--enable-simd=<code>], AC_ARG_ENABLE([simd],[AC_HELP_STRING([--enable-simd=code],
[select SIMD target (cf. README.md)])], [ac_SIMD=${enable_simd}], [ac_SIMD=GEN]) [select SIMD target (cf. README.md)])], [ac_SIMD=${enable_simd}], [ac_SIMD=GEN])
AC_ARG_ENABLE([gen-simd-width],
[AS_HELP_STRING([--enable-gen-simd-width=size],
[size (in bytes) of the generic SIMD vectors (default: 32)])],
[ac_gen_simd_width=$enable_gen_simd_width],
[ac_gen_simd_width=32])
case ${ax_cv_cxx_compiler_vendor} in case ${ax_cv_cxx_compiler_vendor} in
clang|gnu) clang|gnu)
case ${ac_SIMD} in case ${ac_SIMD} in
SSE4) SSE4)
AC_DEFINE([SSE4],[1],[SSE4 intrinsics]) AC_DEFINE([SSE4],[1],[SSE4 intrinsics])
SIMD_FLAGS='-msse4.2';; case ${ac_SFW_FP16} in
yes)
SIMD_FLAGS='-msse4.2';;
no)
SIMD_FLAGS='-msse4.2 -mf16c';;
*)
AC_MSG_ERROR(["SFW_FP16 must be either yes or no value ${ac_SFW_FP16} "]);;
esac;;
AVX) AVX)
AC_DEFINE([AVX1],[1],[AVX intrinsics]) AC_DEFINE([AVX1],[1],[AVX intrinsics])
SIMD_FLAGS='-mavx';; SIMD_FLAGS='-mavx -mf16c';;
AVXFMA4) AVXFMA4)
AC_DEFINE([AVXFMA4],[1],[AVX intrinsics with FMA4]) AC_DEFINE([AVXFMA4],[1],[AVX intrinsics with FMA4])
SIMD_FLAGS='-mavx -mfma4';; SIMD_FLAGS='-mavx -mfma4 -mf16c';;
AVXFMA) AVXFMA)
AC_DEFINE([AVXFMA],[1],[AVX intrinsics with FMA3]) AC_DEFINE([AVXFMA],[1],[AVX intrinsics with FMA3])
SIMD_FLAGS='-mavx -mfma';; SIMD_FLAGS='-mavx -mfma -mf16c';;
AVX2) AVX2)
AC_DEFINE([AVX2],[1],[AVX2 intrinsics]) AC_DEFINE([AVX2],[1],[AVX2 intrinsics])
SIMD_FLAGS='-mavx2 -mfma';; SIMD_FLAGS='-mavx2 -mfma -mf16c';;
AVX512) AVX512)
AC_DEFINE([AVX512],[1],[AVX512 intrinsics]) AC_DEFINE([AVX512],[1],[AVX512 intrinsics])
SIMD_FLAGS='-mavx512f -mavx512pf -mavx512er -mavx512cd';; SIMD_FLAGS='-mavx512f -mavx512pf -mavx512er -mavx512cd';;
SKL)
AC_DEFINE([AVX512],[1],[AVX512 intrinsics for SkyLake Xeon])
SIMD_FLAGS='-march=skylake-avx512';;
KNC) KNC)
AC_DEFINE([IMCI],[1],[IMCI intrinsics for Knights Corner]) AC_DEFINE([IMCI],[1],[IMCI intrinsics for Knights Corner])
SIMD_FLAGS='';; SIMD_FLAGS='';;
KNL) KNL)
AC_DEFINE([AVX512],[1],[AVX512 intrinsics]) AC_DEFINE([AVX512],[1],[AVX512 intrinsics])
AC_DEFINE([KNL],[1],[Knights landing processor])
SIMD_FLAGS='-march=knl';; SIMD_FLAGS='-march=knl';;
GEN) GEN)
AC_DEFINE([GENERIC_VEC],[1],[generic vector code]) AC_DEFINE([GEN],[1],[generic vector code])
AC_DEFINE_UNQUOTED([GEN_SIMD_WIDTH],[$ac_gen_simd_width],
[generic SIMD vector width (in bytes)])
SIMD_GEN_WIDTH_MSG=" (width= $ac_gen_simd_width)"
SIMD_FLAGS='';; SIMD_FLAGS='';;
NEONv8)
AC_DEFINE([NEONV8],[1],[ARMv8 NEON])
SIMD_FLAGS='-march=armv8-a';;
QPX|BGQ) QPX|BGQ)
AC_DEFINE([QPX],[1],[QPX intrinsics for BG/Q]) AC_DEFINE([QPX],[1],[QPX intrinsics for BG/Q])
SIMD_FLAGS='';; SIMD_FLAGS='';;
@@ -197,8 +283,8 @@ case ${ax_cv_cxx_compiler_vendor} in
AC_DEFINE([AVX1],[1],[AVX intrinsics]) AC_DEFINE([AVX1],[1],[AVX intrinsics])
SIMD_FLAGS='-mavx -xavx';; SIMD_FLAGS='-mavx -xavx';;
AVXFMA) AVXFMA)
AC_DEFINE([AVXFMA],[1],[AVX intrinsics with FMA4]) AC_DEFINE([AVXFMA],[1],[AVX intrinsics with FMA3])
SIMD_FLAGS='-mavx -mfma';; SIMD_FLAGS='-mavx -fma';;
AVX2) AVX2)
AC_DEFINE([AVX2],[1],[AVX2 intrinsics]) AC_DEFINE([AVX2],[1],[AVX2 intrinsics])
SIMD_FLAGS='-march=core-avx2 -xcore-avx2';; SIMD_FLAGS='-march=core-avx2 -xcore-avx2';;
@@ -210,9 +296,13 @@ case ${ax_cv_cxx_compiler_vendor} in
SIMD_FLAGS='';; SIMD_FLAGS='';;
KNL) KNL)
AC_DEFINE([AVX512],[1],[AVX512 intrinsics for Knights Landing]) AC_DEFINE([AVX512],[1],[AVX512 intrinsics for Knights Landing])
AC_DEFINE([KNL],[1],[Knights landing processor])
SIMD_FLAGS='-xmic-avx512';; SIMD_FLAGS='-xmic-avx512';;
GEN) GEN)
AC_DEFINE([GENERIC_VEC],[1],[generic vector code]) AC_DEFINE([GEN],[1],[generic vector code])
AC_DEFINE_UNQUOTED([GEN_SIMD_WIDTH],[$ac_gen_simd_width],
[generic SIMD vector width (in bytes)])
SIMD_GEN_WIDTH_MSG=" (width= $ac_gen_simd_width)"
SIMD_FLAGS='';; SIMD_FLAGS='';;
*) *)
AC_MSG_ERROR(["SIMD option ${ac_SIMD} not supported by the Intel compiler"]);; AC_MSG_ERROR(["SIMD option ${ac_SIMD} not supported by the Intel compiler"]);;
@@ -244,10 +334,47 @@ case ${ac_PRECISION} in
double) double)
AC_DEFINE([GRID_DEFAULT_PRECISION_DOUBLE],[1],[GRID_DEFAULT_PRECISION is DOUBLE] ) AC_DEFINE([GRID_DEFAULT_PRECISION_DOUBLE],[1],[GRID_DEFAULT_PRECISION is DOUBLE] )
;; ;;
*)
AC_MSG_ERROR([${ac_PRECISION} unsupported --enable-precision option]);
;;
esac esac
###################### Shared memory allocation technique under MPI3
AC_ARG_ENABLE([shm],[AC_HELP_STRING([--enable-shm=shmopen|shmget|hugetlbfs|shmnone],
[Select SHM allocation technique])],[ac_SHM=${enable_shm}],[ac_SHM=shmopen])
case ${ac_SHM} in
shmopen)
AC_DEFINE([GRID_MPI3_SHMOPEN],[1],[GRID_MPI3_SHMOPEN] )
;;
shmget)
AC_DEFINE([GRID_MPI3_SHMGET],[1],[GRID_MPI3_SHMGET] )
;;
shmnone)
AC_DEFINE([GRID_MPI3_SHM_NONE],[1],[GRID_MPI3_SHM_NONE] )
;;
hugetlbfs)
AC_DEFINE([GRID_MPI3_SHMMMAP],[1],[GRID_MPI3_SHMMMAP] )
;;
*)
AC_MSG_ERROR([${ac_SHM} unsupported --enable-shm option]);
;;
esac
###################### Shared base path for SHMMMAP
AC_ARG_ENABLE([shmpath],[AC_HELP_STRING([--enable-shmpath=path],
[Select SHM mmap base path for hugetlbfs])],
[ac_SHMPATH=${enable_shmpath}],
[ac_SHMPATH=/var/lib/hugetlbfs/global/pagesize-2MB/])
AC_DEFINE_UNQUOTED([GRID_SHM_PATH],["$ac_SHMPATH"],[Path to a hugetlbfs filesystem for MMAPing])
############### communication type selection ############### communication type selection
AC_ARG_ENABLE([comms],[AC_HELP_STRING([--enable-comms=none|mpi|mpi-auto|mpi3|mpi3-auto|shmem], AC_ARG_ENABLE([comms],[AC_HELP_STRING([--enable-comms=none|mpi|mpi-auto],
[Select communications])],[ac_COMMS=${enable_comms}],[ac_COMMS=none]) [Select communications])],[ac_COMMS=${enable_comms}],[ac_COMMS=none])
case ${ac_COMMS} in case ${ac_COMMS} in
@@ -255,22 +382,10 @@ case ${ac_COMMS} in
AC_DEFINE([GRID_COMMS_NONE],[1],[GRID_COMMS_NONE] ) AC_DEFINE([GRID_COMMS_NONE],[1],[GRID_COMMS_NONE] )
comms_type='none' comms_type='none'
;; ;;
mpi3l*) mpi*)
AC_DEFINE([GRID_COMMS_MPI3L],[1],[GRID_COMMS_MPI3L] )
comms_type='mpi3l'
;;
mpi3*)
AC_DEFINE([GRID_COMMS_MPI3],[1],[GRID_COMMS_MPI3] ) AC_DEFINE([GRID_COMMS_MPI3],[1],[GRID_COMMS_MPI3] )
comms_type='mpi3' comms_type='mpi3'
;; ;;
mpi*)
AC_DEFINE([GRID_COMMS_MPI],[1],[GRID_COMMS_MPI] )
comms_type='mpi'
;;
shmem)
AC_DEFINE([GRID_COMMS_SHMEM],[1],[GRID_COMMS_SHMEM] )
comms_type='shmem'
;;
*) *)
AC_MSG_ERROR([${ac_COMMS} unsupported --enable-comms option]); AC_MSG_ERROR([${ac_COMMS} unsupported --enable-comms option]);
;; ;;
@@ -278,7 +393,7 @@ esac
case ${ac_COMMS} in case ${ac_COMMS} in
*-auto) *-auto)
LX_FIND_MPI LX_FIND_MPI
if test "x$have_CXX_mpi" = 'xno'; then AC_MSG_ERROR(["MPI not found"]); fi if test "x$have_CXX_mpi" = 'xno'; then AC_MSG_ERROR(["The configure could not find the MPI compilation flags. N.B. The -auto mode is not supported by Cray wrappers. Use the non -auto version in this case."]); fi
AM_CXXFLAGS="$MPI_CXXFLAGS $AM_CXXFLAGS" AM_CXXFLAGS="$MPI_CXXFLAGS $AM_CXXFLAGS"
AM_CFLAGS="$MPI_CFLAGS $AM_CFLAGS" AM_CFLAGS="$MPI_CFLAGS $AM_CFLAGS"
AM_LDFLAGS="`echo $MPI_CXXLDFLAGS | sed -E 's/-l@<:@^ @:>@+//g'` $AM_LDFLAGS" AM_LDFLAGS="`echo $MPI_CXXLDFLAGS | sed -E 's/-l@<:@^ @:>@+//g'` $AM_LDFLAGS"
@@ -290,13 +405,13 @@ esac
AM_CONDITIONAL(BUILD_COMMS_SHMEM, [ test "${comms_type}X" == "shmemX" ]) AM_CONDITIONAL(BUILD_COMMS_SHMEM, [ test "${comms_type}X" == "shmemX" ])
AM_CONDITIONAL(BUILD_COMMS_MPI, [ test "${comms_type}X" == "mpiX" ]) AM_CONDITIONAL(BUILD_COMMS_MPI, [ test "${comms_type}X" == "mpiX" ])
AM_CONDITIONAL(BUILD_COMMS_MPI3, [ test "${comms_type}X" == "mpi3X" ] ) AM_CONDITIONAL(BUILD_COMMS_MPI3, [ test "${comms_type}X" == "mpi3X" ] )
AM_CONDITIONAL(BUILD_COMMS_MPI3L, [ test "${comms_type}X" == "mpi3lX" ] ) AM_CONDITIONAL(BUILD_COMMS_MPIT, [ test "${comms_type}X" == "mpitX" ] )
AM_CONDITIONAL(BUILD_COMMS_NONE, [ test "${comms_type}X" == "noneX" ]) AM_CONDITIONAL(BUILD_COMMS_NONE, [ test "${comms_type}X" == "noneX" ])
############### RNG selection ############### RNG selection
AC_ARG_ENABLE([rng],[AC_HELP_STRING([--enable-rng=ranlux48|mt19937],\ AC_ARG_ENABLE([rng],[AC_HELP_STRING([--enable-rng=ranlux48|mt19937|sitmo],\
[Select Random Number Generator to be used])],\ [Select Random Number Generator to be used])],\
[ac_RNG=${enable_rng}],[ac_RNG=ranlux48]) [ac_RNG=${enable_rng}],[ac_RNG=sitmo])
case ${ac_RNG} in case ${ac_RNG} in
ranlux48) ranlux48)
@@ -305,6 +420,9 @@ case ${ac_RNG} in
mt19937) mt19937)
AC_DEFINE([RNG_MT19937],[1],[RNG_MT19937] ) AC_DEFINE([RNG_MT19937],[1],[RNG_MT19937] )
;; ;;
sitmo)
AC_DEFINE([RNG_SITMO],[1],[RNG_SITMO] )
;;
*) *)
AC_MSG_ERROR([${ac_RNG} unsupported --enable-rng option]); AC_MSG_ERROR([${ac_RNG} unsupported --enable-rng option]);
;; ;;
@@ -342,38 +460,45 @@ esac
AM_CONDITIONAL(BUILD_CHROMA_REGRESSION,[ test "X${ac_CHROMA}X" == "XyesX" ]) AM_CONDITIONAL(BUILD_CHROMA_REGRESSION,[ test "X${ac_CHROMA}X" == "XyesX" ])
############### Doxygen ############### Doxygen
AC_PROG_DOXYGEN DX_DOXYGEN_FEATURE([OFF])
DX_DOT_FEATURE([OFF])
if test -n "$DOXYGEN" DX_HTML_FEATURE([ON])
then DX_CHM_FEATURE([OFF])
AC_CONFIG_FILES([docs/doxy.cfg]) DX_CHI_FEATURE([OFF])
fi DX_MAN_FEATURE([OFF])
DX_RTF_FEATURE([OFF])
DX_XML_FEATURE([OFF])
DX_PDF_FEATURE([OFF])
DX_PS_FEATURE([OFF])
DX_INIT_DOXYGEN([$PACKAGE_NAME], [doxygen.cfg])
############### Ouput ############### Ouput
cwd=`pwd -P`; cd ${srcdir}; abs_srcdir=`pwd -P`; cd ${cwd} cwd=`pwd -P`; cd ${srcdir}; abs_srcdir=`pwd -P`; cd ${cwd}
AM_CXXFLAGS="-I${abs_srcdir}/include $AM_CXXFLAGS" GRID_CXXFLAGS="$AM_CXXFLAGS $CXXFLAGS"
AM_CFLAGS="-I${abs_srcdir}/include $AM_CFLAGS" GRID_LDFLAGS="$AM_LDFLAGS $LDFLAGS"
GRID_LIBS=$LIBS
GRID_SHORT_SHA=`git rev-parse --short HEAD`
GRID_SHA=`git rev-parse HEAD`
GRID_BRANCH=`git rev-parse --abbrev-ref HEAD`
AM_CXXFLAGS="-I${abs_srcdir}/include -I${abs_srcdir}/Eigen/ -I${abs_srcdir}/Eigen/unsupported $AM_CXXFLAGS"
AM_CFLAGS="-I${abs_srcdir}/include -I${abs_srcdir}/Eigen/ -I${abs_srcdir}/Eigen/unsupported $AM_CFLAGS"
AM_LDFLAGS="-L${cwd}/lib $AM_LDFLAGS" AM_LDFLAGS="-L${cwd}/lib $AM_LDFLAGS"
AC_SUBST([AM_CFLAGS]) AC_SUBST([AM_CFLAGS])
AC_SUBST([AM_CXXFLAGS]) AC_SUBST([AM_CXXFLAGS])
AC_SUBST([AM_LDFLAGS]) AC_SUBST([AM_LDFLAGS])
AC_CONFIG_FILES(Makefile) AC_SUBST([GRID_CXXFLAGS])
AC_CONFIG_FILES(lib/Makefile) AC_SUBST([GRID_LDFLAGS])
AC_CONFIG_FILES(tests/Makefile) AC_SUBST([GRID_LIBS])
AC_CONFIG_FILES(tests/IO/Makefile) AC_SUBST([GRID_SHA])
AC_CONFIG_FILES(tests/core/Makefile) AC_SUBST([GRID_BRANCH])
AC_CONFIG_FILES(tests/debug/Makefile)
AC_CONFIG_FILES(tests/forces/Makefile) git_commit=`cd $srcdir && ./scripts/configure.commit`
AC_CONFIG_FILES(tests/hmc/Makefile)
AC_CONFIG_FILES(tests/solver/Makefile)
AC_CONFIG_FILES(tests/qdpxx/Makefile)
AC_CONFIG_FILES(benchmarks/Makefile)
AC_OUTPUT
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Summary of configuration for $PACKAGE v$VERSION Summary of configuration for $PACKAGE v$VERSION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
----- GIT VERSION -------------------------------------
$git_commit
----- PLATFORM ---------------------------------------- ----- PLATFORM ----------------------------------------
architecture (build) : $build_cpu architecture (build) : $build_cpu
os (build) : $build_os os (build) : $build_os
@@ -382,16 +507,20 @@ os (target) : $target_os
compiler vendor : ${ax_cv_cxx_compiler_vendor} compiler vendor : ${ax_cv_cxx_compiler_vendor}
compiler version : ${ax_cv_gxx_version} compiler version : ${ax_cv_gxx_version}
----- BUILD OPTIONS ----------------------------------- ----- BUILD OPTIONS -----------------------------------
SIMD : ${ac_SIMD} SIMD : ${ac_SIMD}${SIMD_GEN_WIDTH_MSG}
Threading : ${ac_openmp} Threading : ${ac_openmp}
Communications type : ${comms_type} Communications type : ${comms_type}
Shared memory allocator : ${ac_SHM}
Shared memory mmap path : ${ac_SHMPATH}
Default precision : ${ac_PRECISION} Default precision : ${ac_PRECISION}
Software FP16 conversion : ${ac_SFW_FP16}
RNG choice : ${ac_RNG} RNG choice : ${ac_RNG}
GMP : `if test "x$have_gmp" = xtrue; then echo yes; else echo no; fi` GMP : `if test "x$have_gmp" = xtrue; then echo yes; else echo no; fi`
LAPACK : ${ac_LAPACK} LAPACK : ${ac_LAPACK}
FFTW : `if test "x$have_fftw" = xtrue; then echo yes; else echo no; fi` FFTW : `if test "x$have_fftw" = xtrue; then echo yes; else echo no; fi`
build DOXYGEN documentation : `if test "x$enable_doc" = xyes; then echo yes; else echo no; fi` LIME (ILDG support) : `if test "x$have_lime" = xtrue; then echo yes; else echo no; fi`
graphs and diagrams : `if test "x$enable_dot" = xyes; then echo yes; else echo no; fi` HDF5 : `if test "x$have_hdf5" = xtrue; then echo yes; else echo no; fi`
build DOXYGEN documentation : `if test "$DX_FLAG_doc" = '1'; then echo yes; else echo no; fi`
----- BUILD FLAGS ------------------------------------- ----- BUILD FLAGS -------------------------------------
CXXFLAGS: CXXFLAGS:
`echo ${AM_CXXFLAGS} ${CXXFLAGS} | tr ' ' '\n' | sed 's/^-/ -/g'` `echo ${AM_CXXFLAGS} ${CXXFLAGS} | tr ' ' '\n' | sed 's/^-/ -/g'`
@@ -399,7 +528,33 @@ LDFLAGS:
`echo ${AM_LDFLAGS} ${LDFLAGS} | tr ' ' '\n' | sed 's/^-/ -/g'` `echo ${AM_LDFLAGS} ${LDFLAGS} | tr ' ' '\n' | sed 's/^-/ -/g'`
LIBS: LIBS:
`echo ${LIBS} | tr ' ' '\n' | sed 's/^-/ -/g'` `echo ${LIBS} | tr ' ' '\n' | sed 's/^-/ -/g'`
-------------------------------------------------------" > config.summary -------------------------------------------------------" > grid.configure.summary
GRID_SUMMARY="`cat grid.configure.summary`"
AM_SUBST_NOTMAKE([GRID_SUMMARY])
AC_SUBST([GRID_SUMMARY])
AC_CONFIG_FILES([grid-config], [chmod +x grid-config])
AC_CONFIG_FILES(Makefile)
AC_CONFIG_FILES(lib/Makefile)
AC_CONFIG_FILES(tests/Makefile)
AC_CONFIG_FILES(tests/IO/Makefile)
AC_CONFIG_FILES(tests/core/Makefile)
AC_CONFIG_FILES(tests/debug/Makefile)
AC_CONFIG_FILES(tests/forces/Makefile)
AC_CONFIG_FILES(tests/hadrons/Makefile)
AC_CONFIG_FILES(tests/hmc/Makefile)
AC_CONFIG_FILES(tests/solver/Makefile)
AC_CONFIG_FILES(tests/lanczos/Makefile)
AC_CONFIG_FILES(tests/smearing/Makefile)
AC_CONFIG_FILES(tests/qdpxx/Makefile)
AC_CONFIG_FILES(tests/testu01/Makefile)
AC_CONFIG_FILES(benchmarks/Makefile)
AC_CONFIG_FILES(extras/Makefile)
AC_CONFIG_FILES(extras/Hadrons/Makefile)
AC_OUTPUT
echo "" echo ""
cat config.summary cat grid.configure.summary
echo "" echo ""

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

184
doxygen.inc Normal file
View File

@@ -0,0 +1,184 @@
# Copyright (C) 2004 Oren Ben-Kiki
# This file is distributed under the same terms as the Automake macro files.
# Generate automatic documentation using Doxygen. Goals and variables values
# are controlled by the various DX_COND_??? conditionals set by autoconf.
#
# The provided goals are:
# doxygen-doc: Generate all doxygen documentation.
# doxygen-run: Run doxygen, which will generate some of the documentation
# (HTML, CHM, CHI, MAN, RTF, XML) but will not do the post
# processing required for the rest of it (PS, PDF, and some MAN).
# doxygen-man: Rename some doxygen generated man pages.
# doxygen-ps: Generate doxygen PostScript documentation.
# doxygen-pdf: Generate doxygen PDF documentation.
#
# Note that by default these are not integrated into the automake goals. If
# doxygen is used to generate man pages, you can achieve this integration by
# setting man3_MANS to the list of man pages generated and then adding the
# dependency:
#
# $(man3_MANS): doxygen-doc
#
# This will cause make to run doxygen and generate all the documentation.
#
# The following variable is intended for use in Makefile.am:
#
# DX_CLEANFILES = everything to clean.
#
# This is usually added to MOSTLYCLEANFILES.
## --------------------------------- ##
## Format-independent Doxygen rules. ##
## --------------------------------- ##
if DX_COND_doc
## ------------------------------- ##
## Rules specific for HTML output. ##
## ------------------------------- ##
if DX_COND_html
DX_CLEAN_HTML = @DX_DOCDIR@/html
endif DX_COND_html
## ------------------------------ ##
## Rules specific for CHM output. ##
## ------------------------------ ##
if DX_COND_chm
DX_CLEAN_CHM = @DX_DOCDIR@/chm
if DX_COND_chi
DX_CLEAN_CHI = @DX_DOCDIR@/@PACKAGE@.chi
endif DX_COND_chi
endif DX_COND_chm
## ------------------------------ ##
## Rules specific for MAN output. ##
## ------------------------------ ##
if DX_COND_man
DX_CLEAN_MAN = @DX_DOCDIR@/man
endif DX_COND_man
## ------------------------------ ##
## Rules specific for RTF output. ##
## ------------------------------ ##
if DX_COND_rtf
DX_CLEAN_RTF = @DX_DOCDIR@/rtf
endif DX_COND_rtf
## ------------------------------ ##
## Rules specific for XML output. ##
## ------------------------------ ##
if DX_COND_xml
DX_CLEAN_XML = @DX_DOCDIR@/xml
endif DX_COND_xml
## ----------------------------- ##
## Rules specific for PS output. ##
## ----------------------------- ##
if DX_COND_ps
DX_CLEAN_PS = @DX_DOCDIR@/@PACKAGE@.ps
DX_PS_GOAL = doxygen-ps
doxygen-ps: @DX_DOCDIR@/@PACKAGE@.ps
@DX_DOCDIR@/@PACKAGE@.ps: @DX_DOCDIR@/@PACKAGE@.tag
cd @DX_DOCDIR@/latex; \
rm -f *.aux *.toc *.idx *.ind *.ilg *.log *.out; \
$(DX_LATEX) refman.tex; \
$(MAKEINDEX_PATH) refman.idx; \
$(DX_LATEX) refman.tex; \
countdown=5; \
while $(DX_EGREP) 'Rerun (LaTeX|to get cross-references right)' \
refman.log > /dev/null 2>&1 \
&& test $$countdown -gt 0; do \
$(DX_LATEX) refman.tex; \
countdown=`expr $$countdown - 1`; \
done; \
$(DX_DVIPS) -o ../@PACKAGE@.ps refman.dvi
endif DX_COND_ps
## ------------------------------ ##
## Rules specific for PDF output. ##
## ------------------------------ ##
if DX_COND_pdf
DX_CLEAN_PDF = @DX_DOCDIR@/@PACKAGE@.pdf
DX_PDF_GOAL = doxygen-pdf
doxygen-pdf: @DX_DOCDIR@/@PACKAGE@.pdf
@DX_DOCDIR@/@PACKAGE@.pdf: @DX_DOCDIR@/@PACKAGE@.tag
cd @DX_DOCDIR@/latex; \
rm -f *.aux *.toc *.idx *.ind *.ilg *.log *.out; \
$(DX_PDFLATEX) refman.tex; \
$(DX_MAKEINDEX) refman.idx; \
$(DX_PDFLATEX) refman.tex; \
countdown=5; \
while $(DX_EGREP) 'Rerun (LaTeX|to get cross-references right)' \
refman.log > /dev/null 2>&1 \
&& test $$countdown -gt 0; do \
$(DX_PDFLATEX) refman.tex; \
countdown=`expr $$countdown - 1`; \
done; \
mv refman.pdf ../@PACKAGE@.pdf
endif DX_COND_pdf
## ------------------------------------------------- ##
## Rules specific for LaTeX (shared for PS and PDF). ##
## ------------------------------------------------- ##
if DX_COND_latex
DX_CLEAN_LATEX = @DX_DOCDIR@/latex
endif DX_COND_latex
.INTERMEDIATE: doxygen-run $(DX_PS_GOAL) $(DX_PDF_GOAL)
doxygen-run: @DX_DOCDIR@/@PACKAGE@.tag
doxygen-doc: doxygen-run $(DX_PS_GOAL) $(DX_PDF_GOAL)
@DX_DOCDIR@/@PACKAGE@.tag: $(DX_CONFIG) $(pkginclude_HEADERS)
rm -rf @DX_DOCDIR@
$(DX_ENV) $(DX_DOXYGEN) $(srcdir)/$(DX_CONFIG)
DX_CLEANFILES = \
@DX_DOCDIR@/@PACKAGE@.tag \
-r \
$(DX_CLEAN_HTML) \
$(DX_CLEAN_CHM) \
$(DX_CLEAN_CHI) \
$(DX_CLEAN_MAN) \
$(DX_CLEAN_RTF) \
$(DX_CLEAN_XML) \
$(DX_CLEAN_PS) \
$(DX_CLEAN_PDF) \
$(DX_CLEAN_LATEX)
endif DX_COND_doc

View File

@@ -0,0 +1,146 @@
#ifndef A2A_Reduction_hpp_
#define A2A_Reduction_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Environment.hpp>
#include <Grid/Hadrons/Solver.hpp>
BEGIN_HADRONS_NAMESPACE
////////////////////////////////////////////
// A2A Meson Field Inner Product
////////////////////////////////////////////
template <class FermionField>
void sliceInnerProductMesonField(std::vector<std::vector<ComplexD>> &mat,
const std::vector<Lattice<FermionField>> &lhs,
const std::vector<Lattice<FermionField>> &rhs,
int orthogdim)
{
typedef typename FermionField::scalar_type scalar_type;
typedef typename FermionField::vector_type vector_type;
int Lblock = lhs.size();
int Rblock = rhs.size();
GridBase *grid = lhs[0]._grid;
const int Nd = grid->_ndimension;
const int Nsimd = grid->Nsimd();
int Nt = grid->GlobalDimensions()[orthogdim];
assert(mat.size() == Lblock * Rblock);
for (int t = 0; t < mat.size(); t++)
{
assert(mat[t].size() == Nt);
}
int fd = grid->_fdimensions[orthogdim];
int ld = grid->_ldimensions[orthogdim];
int rd = grid->_rdimensions[orthogdim];
// will locally sum vectors first
// sum across these down to scalars
// splitting the SIMD
std::vector<vector_type, alignedAllocator<vector_type>> lvSum(rd * Lblock * Rblock);
for(int r=0;r<rd * Lblock * Rblock;r++)
{
lvSum[r]=zero;
}
std::vector<scalar_type> lsSum(ld * Lblock * Rblock, scalar_type(0.0));
int e1 = grid->_slice_nblock[orthogdim];
int e2 = grid->_slice_block[orthogdim];
int stride = grid->_slice_stride[orthogdim];
// std::cout << GridLogMessage << " Entering first parallel loop " << std::endl;
// Parallelise over t-direction doesn't expose as much parallelism as needed for KNL
parallel_for(int r = 0; r < rd; r++)
{
int so = r * grid->_ostride[orthogdim]; // base offset for start of plane
for (int n = 0; n < e1; n++)
{
for (int b = 0; b < e2; b++)
{
int ss = so + n * stride + b;
for (int i = 0; i < Lblock; i++)
{
auto left = conjugate(lhs[i]._odata[ss]);
for (int j = 0; j < Rblock; j++)
{
int idx = i + Lblock * j + Lblock * Rblock * r;
auto right = rhs[j]._odata[ss];
vector_type vv = left()(0)(0) * right()(0)(0)
+ left()(0)(1) * right()(0)(1)
+ left()(0)(2) * right()(0)(2)
+ left()(1)(0) * right()(1)(0)
+ left()(1)(1) * right()(1)(1)
+ left()(1)(2) * right()(1)(2)
+ left()(2)(0) * right()(2)(0)
+ left()(2)(1) * right()(2)(1)
+ left()(2)(2) * right()(2)(2)
+ left()(3)(0) * right()(3)(0)
+ left()(3)(1) * right()(3)(1)
+ left()(3)(2) * right()(3)(2);
lvSum[idx] = lvSum[idx] + vv;
}
}
}
}
}
// std::cout << GridLogMessage << " Entering second parallel loop " << std::endl;
// Sum across simd lanes in the plane, breaking out orthog dir.
parallel_for(int rt = 0; rt < rd; rt++)
{
std::vector<int> icoor(Nd);
for (int i = 0; i < Lblock; i++)
{
for (int j = 0; j < Rblock; j++)
{
iScalar<vector_type> temp;
std::vector<iScalar<scalar_type>> extracted(Nsimd);
temp._internal = lvSum[i + Lblock * j + Lblock * Rblock * rt];
extract(temp, extracted);
for (int idx = 0; idx < Nsimd; idx++)
{
grid->iCoorFromIindex(icoor, idx);
int ldx = rt + icoor[orthogdim] * rd;
int ij_dx = i + Lblock * j + Lblock * Rblock * ldx;
lsSum[ij_dx] = lsSum[ij_dx] + extracted[idx]._internal;
}
}
}
}
// std::cout << GridLogMessage << " Entering non parallel loop " << std::endl;
for (int t = 0; t < fd; t++)
{
int pt = t/ld; // processor plane
int lt = t%ld;
for (int i = 0; i < Lblock; i++)
{
for (int j = 0; j < Rblock; j++)
{
if (pt == grid->_processor_coor[orthogdim])
{
int ij_dx = i + Lblock * j + Lblock * Rblock * lt;
mat[i + j * Lblock][t] = lsSum[ij_dx];
}
else
{
mat[i + j * Lblock][t] = scalar_type(0.0);
}
}
}
}
// std::cout << GridLogMessage << " Done " << std::endl;
// defer sum over nodes.
return;
}
END_HADRONS_NAMESPACE
#endif // A2A_Reduction_hpp_

View File

@@ -0,0 +1,210 @@
#ifndef A2A_Vectors_hpp_
#define A2A_Vectors_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Environment.hpp>
#include <Grid/Hadrons/Solver.hpp>
BEGIN_HADRONS_NAMESPACE
////////////////////////////////
// A2A Modes
////////////////////////////////
template <class Field, class Matrix, class Solver>
class A2AModesSchurDiagTwo
{
private:
const std::vector<Field> *evec;
const std::vector<RealD> *eval;
Matrix &action;
Solver &solver;
std::vector<Field> w_high_5d, v_high_5d, w_high_4d, v_high_4d;
const int Nl, Nh;
const bool return_5d;
public:
int getNl (void ) {return Nl;}
int getNh (void ) {return Nh;}
int getN (void ) {return Nh+Nl;}
A2AModesSchurDiagTwo(const std::vector<Field> *_evec, const std::vector<RealD> *_eval,
Matrix &_action,
Solver &_solver,
std::vector<Field> _w_high_5d, std::vector<Field> _v_high_5d,
std::vector<Field> _w_high_4d, std::vector<Field> _v_high_4d,
const int _Nl, const int _Nh,
const bool _return_5d)
: evec(_evec), eval(_eval),
action(_action),
solver(_solver),
w_high_5d(_w_high_5d), v_high_5d(_v_high_5d),
w_high_4d(_w_high_4d), v_high_4d(_v_high_4d),
Nl(_Nl), Nh(_Nh),
return_5d(_return_5d){};
void high_modes(Field &source_5d, Field &w_source_5d, Field &source_4d, int i)
{
int i5d;
LOG(Message) << "A2A high modes for i = " << i << std::endl;
i5d = 0;
if (return_5d) i5d = i;
this->high_mode_v(action, solver, source_5d, v_high_5d[i5d], v_high_4d[i]);
this->high_mode_w(w_source_5d, source_4d, w_high_5d[i5d], w_high_4d[i]);
}
void return_v(int i, Field &vout_5d, Field &vout_4d)
{
if (i < Nl)
{
this->low_mode_v(action, evec->at(i), eval->at(i), vout_5d, vout_4d);
}
else
{
vout_4d = v_high_4d[i - Nl];
if (!(return_5d)) i = Nl;
vout_5d = v_high_5d[i - Nl];
}
}
void return_w(int i, Field &wout_5d, Field &wout_4d)
{
if (i < Nl)
{
this->low_mode_w(action, evec->at(i), eval->at(i), wout_5d, wout_4d);
}
else
{
wout_4d = w_high_4d[i - Nl];
if (!(return_5d)) i = Nl;
wout_5d = w_high_5d[i - Nl];
}
}
void low_mode_v(Matrix &action, const Field &evec, const RealD &eval, Field &vout_5d, Field &vout_4d)
{
GridBase *grid = action.RedBlackGrid();
Field src_o(grid);
Field sol_e(grid);
Field sol_o(grid);
Field tmp(grid);
src_o = evec;
src_o.checkerboard = Odd;
pickCheckerboard(Even, sol_e, vout_5d);
pickCheckerboard(Odd, sol_o, vout_5d);
/////////////////////////////////////////////////////
// v_ie = -(1/eval_i) * MeeInv Meo MooInv evec_i
/////////////////////////////////////////////////////
action.MooeeInv(src_o, tmp);
assert(tmp.checkerboard == Odd);
action.Meooe(tmp, sol_e);
assert(sol_e.checkerboard == Even);
action.MooeeInv(sol_e, tmp);
assert(tmp.checkerboard == Even);
sol_e = (-1.0 / eval) * tmp;
assert(sol_e.checkerboard == Even);
/////////////////////////////////////////////////////
// v_io = (1/eval_i) * MooInv evec_i
/////////////////////////////////////////////////////
action.MooeeInv(src_o, tmp);
assert(tmp.checkerboard == Odd);
sol_o = (1.0 / eval) * tmp;
assert(sol_o.checkerboard == Odd);
setCheckerboard(vout_5d, sol_e);
assert(sol_e.checkerboard == Even);
setCheckerboard(vout_5d, sol_o);
assert(sol_o.checkerboard == Odd);
action.ExportPhysicalFermionSolution(vout_5d, vout_4d);
}
void low_mode_w(Matrix &action, const Field &evec, const RealD &eval, Field &wout_5d, Field &wout_4d)
{
GridBase *grid = action.RedBlackGrid();
SchurDiagTwoOperator<Matrix, Field> _HermOpEO(action);
Field src_o(grid);
Field sol_e(grid);
Field sol_o(grid);
Field tmp(grid);
GridBase *fgrid = action.Grid();
Field tmp_wout(fgrid);
src_o = evec;
src_o.checkerboard = Odd;
pickCheckerboard(Even, sol_e, tmp_wout);
pickCheckerboard(Odd, sol_o, tmp_wout);
/////////////////////////////////////////////////////
// w_ie = - MeeInvDag MoeDag Doo evec_i
/////////////////////////////////////////////////////
_HermOpEO.Mpc(src_o, tmp);
assert(tmp.checkerboard == Odd);
action.MeooeDag(tmp, sol_e);
assert(sol_e.checkerboard == Even);
action.MooeeInvDag(sol_e, tmp);
assert(tmp.checkerboard == Even);
sol_e = (-1.0) * tmp;
/////////////////////////////////////////////////////
// w_io = Doo evec_i
/////////////////////////////////////////////////////
_HermOpEO.Mpc(src_o, sol_o);
assert(sol_o.checkerboard == Odd);
setCheckerboard(tmp_wout, sol_e);
assert(sol_e.checkerboard == Even);
setCheckerboard(tmp_wout, sol_o);
assert(sol_o.checkerboard == Odd);
action.DminusDag(tmp_wout, wout_5d);
action.ExportPhysicalFermionSource(wout_5d, wout_4d);
}
void high_mode_v(Matrix &action, Solver &solver, const Field &source, Field &vout_5d, Field &vout_4d)
{
GridBase *fgrid = action.Grid();
solver(vout_5d, source); // Note: solver is solver(out, in)
action.ExportPhysicalFermionSolution(vout_5d, vout_4d);
}
void high_mode_w(const Field &w_source_5d, const Field &source_4d, Field &wout_5d, Field &wout_4d)
{
wout_5d = w_source_5d;
wout_4d = source_4d;
}
};
// TODO: A2A for coarse eigenvectors
// template <class FineField, class CoarseField, class Matrix, class Solver>
// class A2ALMSchurDiagTwoCoarse : public A2AModesSchurDiagTwo<FineField, Matrix, Solver>
// {
// private:
// const std::vector<FineField> &subspace;
// const std::vector<CoarseField> &evec_coarse;
// const std::vector<RealD> &eval_coarse;
// Matrix &action;
// public:
// A2ALMSchurDiagTwoCoarse(const std::vector<FineField> &_subspace, const std::vector<CoarseField> &_evec_coarse, const std::vector<RealD> &_eval_coarse, Matrix &_action)
// : subspace(_subspace), evec_coarse(_evec_coarse), eval_coarse(_eval_coarse), action(_action){};
// void operator()(int i, FineField &vout, FineField &wout)
// {
// FineField prom_evec(subspace[0]._grid);
// blockPromote(evec_coarse[i], prom_evec, subspace);
// this->low_mode_v(action, prom_evec, eval_coarse[i], vout);
// this->low_mode_w(action, prom_evec, eval_coarse[i], wout);
// }
// };
END_HADRONS_NAMESPACE
#endif // A2A_Vectors_hpp_

View File

@@ -0,0 +1,253 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Application.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Application.hpp>
#include <Grid/Hadrons/GeneticScheduler.hpp>
#include <Grid/Hadrons/Modules.hpp>
using namespace Grid;
using namespace QCD;
using namespace Hadrons;
#define BIG_SEP "==============="
#define SEP "---------------"
/******************************************************************************
* Application implementation *
******************************************************************************/
// constructors ////////////////////////////////////////////////////////////////
#define MACOUT(macro) macro << " (" << #macro << ")"
#define MACOUTS(macro) HADRONS_STR(macro) << " (" << #macro << ")"
Application::Application(void)
{
initLogger();
auto dim = GridDefaultLatt(), mpi = GridDefaultMpi(), loc(dim);
locVol_ = 1;
for (unsigned int d = 0; d < dim.size(); ++d)
{
loc[d] /= mpi[d];
locVol_ *= loc[d];
}
LOG(Message) << "====== HADRONS APPLICATION STARTING ======" << std::endl;
LOG(Message) << "** Dimensions" << std::endl;
LOG(Message) << "Global lattice : " << dim << std::endl;
LOG(Message) << "MPI partition : " << mpi << std::endl;
LOG(Message) << "Local lattice : " << loc << std::endl;
LOG(Message) << std::endl;
LOG(Message) << "** Default parameters (and associated C macro)" << std::endl;
LOG(Message) << "ASCII output precision : " << MACOUT(DEFAULT_ASCII_PREC) << std::endl;
LOG(Message) << "Fermion implementation : " << MACOUTS(FIMPL) << std::endl;
LOG(Message) << "z-Fermion implementation: " << MACOUTS(ZFIMPL) << std::endl;
LOG(Message) << "Scalar implementation : " << MACOUTS(SIMPL) << std::endl;
LOG(Message) << "Gauge implementation : " << MACOUTS(GIMPL) << std::endl;
LOG(Message) << "Eigenvector base size : "
<< MACOUT(HADRONS_DEFAULT_LANCZOS_NBASIS) << std::endl;
LOG(Message) << "Schur decomposition : " << MACOUTS(HADRONS_DEFAULT_SCHUR) << std::endl;
LOG(Message) << std::endl;
}
Application::Application(const Application::GlobalPar &par)
: Application()
{
setPar(par);
}
Application::Application(const std::string parameterFileName)
: Application()
{
parameterFileName_ = parameterFileName;
}
// access //////////////////////////////////////////////////////////////////////
void Application::setPar(const Application::GlobalPar &par)
{
par_ = par;
env().setSeed(strToVec<int>(par_.seed));
}
const Application::GlobalPar & Application::getPar(void)
{
return par_;
}
// execute /////////////////////////////////////////////////////////////////////
void Application::run(void)
{
if (!parameterFileName_.empty() and (vm().getNModule() == 0))
{
parseParameterFile(parameterFileName_);
}
vm().printContent();
env().printContent();
schedule();
printSchedule();
configLoop();
}
// parse parameter file ////////////////////////////////////////////////////////
class ObjectId: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(ObjectId,
std::string, name,
std::string, type);
};
void Application::parseParameterFile(const std::string parameterFileName)
{
XmlReader reader(parameterFileName);
GlobalPar par;
ObjectId id;
LOG(Message) << "Building application from '" << parameterFileName << "'..." << std::endl;
read(reader, "parameters", par);
setPar(par);
if (!push(reader, "modules"))
{
HADRONS_ERROR(Parsing, "Cannot open node 'modules' in parameter file '"
+ parameterFileName + "'");
}
if (!push(reader, "module"))
{
HADRONS_ERROR(Parsing, "Cannot open node 'modules/module' in parameter file '"
+ parameterFileName + "'");
}
do
{
read(reader, "id", id);
vm().createModule(id.name, id.type, reader);
} while (reader.nextElement("module"));
pop(reader);
pop(reader);
}
void Application::saveParameterFile(const std::string parameterFileName)
{
LOG(Message) << "Saving application to '" << parameterFileName << "'..." << std::endl;
if (env().getGrid()->IsBoss())
{
XmlWriter writer(parameterFileName);
ObjectId id;
const unsigned int nMod = vm().getNModule();
write(writer, "parameters", getPar());
push(writer, "modules");
for (unsigned int i = 0; i < nMod; ++i)
{
push(writer, "module");
id.name = vm().getModuleName(i);
id.type = vm().getModule(i)->getRegisteredName();
write(writer, "id", id);
vm().getModule(i)->saveParameters(writer, "options");
pop(writer);
}
pop(writer);
pop(writer);
}
}
// schedule computation ////////////////////////////////////////////////////////
void Application::schedule(void)
{
if (!scheduled_ and !loadedSchedule_)
{
program_ = vm().schedule(par_.genetic);
scheduled_ = true;
}
}
void Application::saveSchedule(const std::string filename)
{
LOG(Message) << "Saving current schedule to '" << filename << "'..."
<< std::endl;
if (env().getGrid()->IsBoss())
{
TextWriter writer(filename);
std::vector<std::string> program;
if (!scheduled_)
{
HADRONS_ERROR(Definition, "Computation not scheduled");
}
for (auto address: program_)
{
program.push_back(vm().getModuleName(address));
}
write(writer, "schedule", program);
}
}
void Application::loadSchedule(const std::string filename)
{
TextReader reader(filename);
std::vector<std::string> program;
LOG(Message) << "Loading schedule from '" << filename << "'..."
<< std::endl;
read(reader, "schedule", program);
program_.clear();
for (auto &name: program)
{
program_.push_back(vm().getModuleAddress(name));
}
loadedSchedule_ = true;
}
void Application::printSchedule(void)
{
if (!scheduled_)
{
HADRONS_ERROR(Definition, "Computation not scheduled");
}
auto peak = vm().memoryNeeded(program_);
LOG(Message) << "Schedule (memory needed: " << sizeString(peak) << "):"
<< std::endl;
for (unsigned int i = 0; i < program_.size(); ++i)
{
LOG(Message) << std::setw(4) << i + 1 << ": "
<< vm().getModuleName(program_[i]) << std::endl;
}
}
// loop on configurations //////////////////////////////////////////////////////
void Application::configLoop(void)
{
auto range = par_.trajCounter;
for (unsigned int t = range.start; t < range.end; t += range.step)
{
LOG(Message) << BIG_SEP << " Starting measurement for trajectory " << t
<< " " << BIG_SEP << std::endl;
vm().setTrajectory(t);
vm().executeProgram(program_);
}
LOG(Message) << BIG_SEP << " End of measurement " << BIG_SEP << std::endl;
env().freeAll();
}

View File

@@ -0,0 +1,121 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Application.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_Application_hpp_
#define Hadrons_Application_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/VirtualMachine.hpp>
#include <Grid/Hadrons/Module.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Main program manager *
******************************************************************************/
class Application
{
public:
class TrajRange: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(TrajRange,
unsigned int, start,
unsigned int, end,
unsigned int, step);
};
class GlobalPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(GlobalPar,
TrajRange, trajCounter,
VirtualMachine::GeneticPar, genetic,
std::string, seed);
};
public:
// constructors
Application(void);
Application(const GlobalPar &par);
Application(const std::string parameterFileName);
// destructor
virtual ~Application(void) = default;
// access
void setPar(const GlobalPar &par);
const GlobalPar & getPar(void);
// module creation
template <typename M>
void createModule(const std::string name);
template <typename M>
void createModule(const std::string name, const typename M::Par &par);
// execute
void run(void);
// XML parameter file I/O
void parseParameterFile(const std::string parameterFileName);
void saveParameterFile(const std::string parameterFileName);
// schedule computation
void schedule(void);
void saveSchedule(const std::string filename);
void loadSchedule(const std::string filename);
void printSchedule(void);
// loop on configurations
void configLoop(void);
private:
// environment shortcut
DEFINE_ENV_ALIAS;
// virtual machine shortcut
DEFINE_VM_ALIAS;
private:
long unsigned int locVol_;
std::string parameterFileName_{""};
GlobalPar par_;
VirtualMachine::Program program_;
bool scheduled_{false}, loadedSchedule_{false};
};
/******************************************************************************
* Application template implementation *
******************************************************************************/
// module creation /////////////////////////////////////////////////////////////
template <typename M>
void Application::createModule(const std::string name)
{
vm().createModule<M>(name);
scheduled_ = false;
}
template <typename M>
void Application::createModule(const std::string name,
const typename M::Par &par)
{
vm().createModule<M>(name, par);
scheduled_ = false;
}
END_HADRONS_NAMESPACE
#endif // Hadrons_Application_hpp_

View File

@@ -0,0 +1,323 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/EigenPack.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_EigenPack_hpp_
#define Hadrons_EigenPack_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/algorithms/iterative/Deflation.h>
#include <Grid/algorithms/iterative/LocalCoherenceLanczos.h>
BEGIN_HADRONS_NAMESPACE
// Lanczos type
#ifndef HADRONS_DEFAULT_LANCZOS_NBASIS
#define HADRONS_DEFAULT_LANCZOS_NBASIS 60
#endif
template <typename F>
class EigenPack
{
public:
typedef F Field;
struct PackRecord
{
std::string operatorXml, solverXml;
};
struct VecRecord: Serializable
{
GRID_SERIALIZABLE_CLASS_MEMBERS(VecRecord,
unsigned int, index,
double, eval);
VecRecord(void): index(0), eval(0.) {}
};
public:
std::vector<RealD> eval;
std::vector<F> evec;
PackRecord record;
public:
EigenPack(void) = default;
virtual ~EigenPack(void) = default;
EigenPack(const size_t size, GridBase *grid)
{
resize(size, grid);
}
void resize(const size_t size, GridBase *grid)
{
eval.resize(size);
evec.resize(size, grid);
}
virtual void read(const std::string fileStem, const bool multiFile, const int traj = -1)
{
if (multiFile)
{
for(int k = 0; k < evec.size(); ++k)
{
basicReadSingle(evec[k], eval[k], evecFilename(fileStem, k, traj), k);
}
}
else
{
basicRead(evec, eval, evecFilename(fileStem, -1, traj), evec.size());
}
}
virtual void write(const std::string fileStem, const bool multiFile, const int traj = -1)
{
if (multiFile)
{
for(int k = 0; k < evec.size(); ++k)
{
basicWriteSingle(evecFilename(fileStem, k, traj), evec[k], eval[k], k);
}
}
else
{
basicWrite(evecFilename(fileStem, -1, traj), evec, eval, evec.size());
}
}
protected:
std::string evecFilename(const std::string stem, const int vec, const int traj)
{
std::string t = (traj < 0) ? "" : ("." + std::to_string(traj));
if (vec == -1)
{
return stem + t + ".bin";
}
else
{
return stem + t + "/v" + std::to_string(vec) + ".bin";
}
}
template <typename T>
void basicRead(std::vector<T> &evec, std::vector<double> &eval,
const std::string filename, const unsigned int size)
{
ScidacReader binReader;
binReader.open(filename);
binReader.skipPastObjectRecord(SCIDAC_FILE_XML);
for(int k = 0; k < size; ++k)
{
VecRecord vecRecord;
LOG(Message) << "Reading eigenvector " << k << std::endl;
binReader.readScidacFieldRecord(evec[k], vecRecord);
if (vecRecord.index != k)
{
HADRONS_ERROR(Io, "Eigenvector " + std::to_string(k) + " has a"
+ " wrong index (expected " + std::to_string(vecRecord.index)
+ ") in file '" + filename + "'");
}
eval[k] = vecRecord.eval;
}
binReader.close();
}
template <typename T>
void basicReadSingle(T &evec, double &eval, const std::string filename,
const unsigned int index)
{
ScidacReader binReader;
VecRecord vecRecord;
binReader.open(filename);
binReader.skipPastObjectRecord(SCIDAC_FILE_XML);
LOG(Message) << "Reading eigenvector " << index << std::endl;
binReader.readScidacFieldRecord(evec, vecRecord);
if (vecRecord.index != index)
{
HADRONS_ERROR(Io, "Eigenvector " + std::to_string(index) + " has a"
+ " wrong index (expected " + std::to_string(vecRecord.index)
+ ") in file '" + filename + "'");
}
eval = vecRecord.eval;
binReader.close();
}
template <typename T>
void basicWrite(const std::string filename, std::vector<T> &evec,
const std::vector<double> &eval, const unsigned int size)
{
ScidacWriter binWriter(evec[0]._grid->IsBoss());
XmlWriter xmlWriter("", "eigenPackPar");
makeFileDir(filename, evec[0]._grid);
xmlWriter.pushXmlString(record.operatorXml);
xmlWriter.pushXmlString(record.solverXml);
binWriter.open(filename);
binWriter.writeLimeObject(1, 1, xmlWriter, "parameters", SCIDAC_FILE_XML);
for(int k = 0; k < size; ++k)
{
VecRecord vecRecord;
vecRecord.index = k;
vecRecord.eval = eval[k];
LOG(Message) << "Writing eigenvector " << k << std::endl;
binWriter.writeScidacFieldRecord(evec[k], vecRecord, DEFAULT_ASCII_PREC);
}
binWriter.close();
}
template <typename T>
void basicWriteSingle(const std::string filename, T &evec,
const double eval, const unsigned int index)
{
ScidacWriter binWriter(evec._grid->IsBoss());
XmlWriter xmlWriter("", "eigenPackPar");
VecRecord vecRecord;
makeFileDir(filename, evec._grid);
xmlWriter.pushXmlString(record.operatorXml);
xmlWriter.pushXmlString(record.solverXml);
binWriter.open(filename);
binWriter.writeLimeObject(1, 1, xmlWriter, "parameters", SCIDAC_FILE_XML);
vecRecord.index = index;
vecRecord.eval = eval;
LOG(Message) << "Writing eigenvector " << index << std::endl;
binWriter.writeScidacFieldRecord(evec, vecRecord, DEFAULT_ASCII_PREC);
binWriter.close();
}
};
template <typename FineF, typename CoarseF>
class CoarseEigenPack: public EigenPack<FineF>
{
public:
typedef CoarseF CoarseField;
public:
std::vector<RealD> evalCoarse;
std::vector<CoarseF> evecCoarse;
public:
CoarseEigenPack(void) = default;
virtual ~CoarseEigenPack(void) = default;
CoarseEigenPack(const size_t sizeFine, const size_t sizeCoarse,
GridBase *gridFine, GridBase *gridCoarse)
{
resize(sizeFine, sizeCoarse, gridFine, gridCoarse);
}
void resize(const size_t sizeFine, const size_t sizeCoarse,
GridBase *gridFine, GridBase *gridCoarse)
{
EigenPack<FineF>::resize(sizeFine, gridFine);
evalCoarse.resize(sizeCoarse);
evecCoarse.resize(sizeCoarse, gridCoarse);
}
void readFine(const std::string fileStem, const bool multiFile, const int traj = -1)
{
if (multiFile)
{
for(int k = 0; k < this->evec.size(); ++k)
{
this->basicReadSingle(this->evec[k], this->eval[k], this->evecFilename(fileStem + "_fine", k, traj), k);
}
}
else
{
this->basicRead(this->evec, this->eval, this->evecFilename(fileStem + "_fine", -1, traj), this->evec.size());
}
}
void readCoarse(const std::string fileStem, const bool multiFile, const int traj = -1)
{
if (multiFile)
{
for(int k = 0; k < evecCoarse.size(); ++k)
{
this->basicReadSingle(evecCoarse[k], evalCoarse[k], this->evecFilename(fileStem + "_coarse", k, traj), k);
}
}
else
{
this->basicRead(evecCoarse, evalCoarse, this->evecFilename(fileStem + "_coarse", -1, traj), evecCoarse.size());
}
}
virtual void read(const std::string fileStem, const bool multiFile, const int traj = -1)
{
readFine(fileStem, multiFile, traj);
readCoarse(fileStem, multiFile, traj);
}
void writeFine(const std::string fileStem, const bool multiFile, const int traj = -1)
{
if (multiFile)
{
for(int k = 0; k < this->evec.size(); ++k)
{
this->basicWriteSingle(this->evecFilename(fileStem + "_fine", k, traj), this->evec[k], this->eval[k], k);
}
}
else
{
this->basicWrite(this->evecFilename(fileStem + "_fine", -1, traj), this->evec, this->eval, this->evec.size());
}
}
void writeCoarse(const std::string fileStem, const bool multiFile, const int traj = -1)
{
if (multiFile)
{
for(int k = 0; k < evecCoarse.size(); ++k)
{
this->basicWriteSingle(this->evecFilename(fileStem + "_coarse", k, traj), evecCoarse[k], evalCoarse[k], k);
}
}
else
{
this->basicWrite(this->evecFilename(fileStem + "_coarse", -1, traj), evecCoarse, evalCoarse, evecCoarse.size());
}
}
virtual void write(const std::string fileStem, const bool multiFile, const int traj = -1)
{
writeFine(fileStem, multiFile, traj);
writeCoarse(fileStem, multiFile, traj);
}
};
template <typename FImpl>
using FermionEigenPack = EigenPack<typename FImpl::FermionField>;
template <typename FImpl, int nBasis>
using CoarseFermionEigenPack = CoarseEigenPack<
typename FImpl::FermionField,
typename LocalCoherenceLanczos<typename FImpl::SiteSpinor,
typename FImpl::SiteComplex,
nBasis>::CoarseField>;
END_HADRONS_NAMESPACE
#endif // Hadrons_EigenPack_hpp_

View File

@@ -0,0 +1,457 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Environment.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Environment.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
using namespace Grid;
using namespace QCD;
using namespace Hadrons;
#define ERROR_NO_ADDRESS(address)\
HADRONS_ERROR(Definition, "no object with address " + std::to_string(address));
/******************************************************************************
* Environment implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
Environment::Environment(void)
{
dim_ = GridDefaultLatt();
nd_ = dim_.size();
grid4d_.reset(SpaceTimeGrid::makeFourDimGrid(
dim_, GridDefaultSimd(nd_, vComplex::Nsimd()),
GridDefaultMpi()));
gridRb4d_.reset(SpaceTimeGrid::makeFourDimRedBlackGrid(grid4d_.get()));
vol_ = 1.;
for (auto d: dim_)
{
vol_ *= d;
}
rng4d_.reset(new GridParallelRNG(grid4d_.get()));
}
// grids ///////////////////////////////////////////////////////////////////////
void Environment::createGrid(const unsigned int Ls)
{
if ((Ls > 1) and (grid5d_.find(Ls) == grid5d_.end()))
{
auto g = getGrid();
grid5d_[Ls].reset(SpaceTimeGrid::makeFiveDimGrid(Ls, g));
gridRb5d_[Ls].reset(SpaceTimeGrid::makeFiveDimRedBlackGrid(Ls, g));
}
}
void Environment::createCoarseGrid(const std::vector<int> &blockSize,
const unsigned int Ls)
{
int nd = getNd();
std::vector<int> fineDim = getDim(), coarseDim;
unsigned int cLs;
auto key4d = blockSize, key5d = blockSize;
createGrid(Ls);
coarseDim.resize(nd);
for (int d = 0; d < coarseDim.size(); d++)
{
coarseDim[d] = fineDim[d]/blockSize[d];
if (coarseDim[d]*blockSize[d] != fineDim[d])
{
HADRONS_ERROR(Size, "Fine dimension " + std::to_string(d)
+ " (" + std::to_string(fineDim[d])
+ ") not divisible by coarse dimension ("
+ std::to_string(coarseDim[d]) + ")");
}
}
if (blockSize.size() > nd)
{
cLs = Ls/blockSize[nd];
if (cLs*blockSize[nd] != Ls)
{
HADRONS_ERROR(Size, "Fine Ls (" + std::to_string(Ls)
+ ") not divisible by coarse Ls ("
+ std::to_string(cLs) + ")");
}
key4d.resize(nd);
key5d.push_back(Ls);
}
gridCoarse4d_[key4d].reset(
SpaceTimeGrid::makeFourDimGrid(coarseDim,
GridDefaultSimd(nd, vComplex::Nsimd()), GridDefaultMpi()));
if (Ls > 1)
{
gridCoarse5d_[key5d].reset(
SpaceTimeGrid::makeFiveDimGrid(cLs, gridCoarse4d_[key4d].get()));
}
}
GridCartesian * Environment::getGrid(const unsigned int Ls) const
{
try
{
if (Ls == 1)
{
return grid4d_.get();
}
else
{
return grid5d_.at(Ls).get();
}
}
catch(std::out_of_range &)
{
HADRONS_ERROR(Definition, "no grid with Ls= " + std::to_string(Ls));
}
}
GridRedBlackCartesian * Environment::getRbGrid(const unsigned int Ls) const
{
try
{
if (Ls == 1)
{
return gridRb4d_.get();
}
else
{
return gridRb5d_.at(Ls).get();
}
}
catch(std::out_of_range &)
{
HADRONS_ERROR(Definition, "no red-black grid with Ls= " + std::to_string(Ls));
}
}
GridCartesian * Environment::getCoarseGrid(
const std::vector<int> &blockSize, const unsigned int Ls) const
{
auto key = blockSize;
try
{
if (Ls == 1)
{
key.resize(getNd());
return gridCoarse4d_.at(key).get();
}
else
{
key.push_back(Ls);
return gridCoarse5d_.at(key).get();
}
}
catch(std::out_of_range &)
{
HADRONS_ERROR(Definition, "no coarse grid with Ls= " + std::to_string(Ls));
}
}
unsigned int Environment::getNd(void) const
{
return nd_;
}
std::vector<int> Environment::getDim(void) const
{
return dim_;
}
int Environment::getDim(const unsigned int mu) const
{
return dim_[mu];
}
double Environment::getVolume(void) const
{
return vol_;
}
// random number generator /////////////////////////////////////////////////////
void Environment::setSeed(const std::vector<int> &seed)
{
rng4d_->SeedFixedIntegers(seed);
}
GridParallelRNG * Environment::get4dRng(void) const
{
return rng4d_.get();
}
// general memory management ///////////////////////////////////////////////////
void Environment::addObject(const std::string name, const int moduleAddress)
{
if (!hasObject(name))
{
ObjInfo info;
info.name = name;
info.module = moduleAddress;
info.data = nullptr;
object_.push_back(std::move(info));
objectAddress_[name] = static_cast<unsigned int>(object_.size() - 1);
}
else
{
HADRONS_ERROR(Definition, "object '" + name + "' already exists");
}
}
void Environment::setObjectModule(const unsigned int objAddress,
const int modAddress)
{
object_[objAddress].module = modAddress;
}
unsigned int Environment::getMaxAddress(void) const
{
return object_.size();
}
unsigned int Environment::getObjectAddress(const std::string name) const
{
if (hasObject(name))
{
return objectAddress_.at(name);
}
else
{
HADRONS_ERROR(Definition, "no object with name '" + name + "'");
}
}
std::string Environment::getObjectName(const unsigned int address) const
{
if (hasObject(address))
{
return object_[address].name;
}
else
{
ERROR_NO_ADDRESS(address);
}
}
std::string Environment::getObjectType(const unsigned int address) const
{
if (hasObject(address))
{
if (object_[address].type)
{
return typeName(object_[address].type);
}
else
{
return "<no type>";
}
}
else
{
ERROR_NO_ADDRESS(address);
}
}
std::string Environment::getObjectType(const std::string name) const
{
return getObjectType(getObjectAddress(name));
}
Environment::Size Environment::getObjectSize(const unsigned int address) const
{
if (hasObject(address))
{
return object_[address].size;
}
else
{
ERROR_NO_ADDRESS(address);
}
}
Environment::Size Environment::getObjectSize(const std::string name) const
{
return getObjectSize(getObjectAddress(name));
}
Environment::Storage Environment::getObjectStorage(const unsigned int address) const
{
if (hasObject(address))
{
return object_[address].storage;
}
else
{
ERROR_NO_ADDRESS(address);
}
}
Environment::Storage Environment::getObjectStorage(const std::string name) const
{
return getObjectStorage(getObjectAddress(name));
}
int Environment::getObjectModule(const unsigned int address) const
{
if (hasObject(address))
{
return object_[address].module;
}
else
{
ERROR_NO_ADDRESS(address);
}
}
int Environment::getObjectModule(const std::string name) const
{
return getObjectModule(getObjectAddress(name));
}
unsigned int Environment::getObjectLs(const unsigned int address) const
{
if (hasCreatedObject(address))
{
return object_[address].Ls;
}
else
{
ERROR_NO_ADDRESS(address);
}
}
unsigned int Environment::getObjectLs(const std::string name) const
{
return getObjectLs(getObjectAddress(name));
}
bool Environment::hasObject(const unsigned int address) const
{
return (address < object_.size());
}
bool Environment::hasObject(const std::string name) const
{
auto it = objectAddress_.find(name);
return ((it != objectAddress_.end()) and hasObject(it->second));
}
bool Environment::hasCreatedObject(const unsigned int address) const
{
if (hasObject(address))
{
return (object_[address].data != nullptr);
}
else
{
return false;
}
}
bool Environment::hasCreatedObject(const std::string name) const
{
if (hasObject(name))
{
return hasCreatedObject(getObjectAddress(name));
}
else
{
return false;
}
}
bool Environment::isObject5d(const unsigned int address) const
{
return (getObjectLs(address) > 1);
}
bool Environment::isObject5d(const std::string name) const
{
return (getObjectLs(name) > 1);
}
Environment::Size Environment::getTotalSize(void) const
{
Environment::Size size = 0;
for (auto &o: object_)
{
size += o.size;
}
return size;
}
void Environment::freeObject(const unsigned int address)
{
if (hasCreatedObject(address))
{
LOG(Message) << "Destroying object '" << object_[address].name
<< "'" << std::endl;
}
object_[address].size = 0;
object_[address].type = nullptr;
object_[address].data.reset(nullptr);
}
void Environment::freeObject(const std::string name)
{
freeObject(getObjectAddress(name));
}
void Environment::freeAll(void)
{
for (unsigned int i = 0; i < object_.size(); ++i)
{
freeObject(i);
}
}
void Environment::protectObjects(const bool protect)
{
protect_ = protect;
}
bool Environment::objectsProtected(void) const
{
return protect_;
}
// print environment content ///////////////////////////////////////////////////
void Environment::printContent(void) const
{
LOG(Debug) << "Objects: " << std::endl;
for (unsigned int i = 0; i < object_.size(); ++i)
{
LOG(Debug) << std::setw(4) << i << ": "
<< getObjectName(i) << " ("
<< sizeString(getObjectSize(i)) << ")" << std::endl;
}
}

View File

@@ -0,0 +1,353 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Environment.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_Environment_hpp_
#define Hadrons_Environment_hpp_
#include <Grid/Hadrons/Global.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Global environment *
******************************************************************************/
class Object
{
public:
Object(void) = default;
virtual ~Object(void) = default;
};
template <typename T>
class Holder: public Object
{
public:
Holder(void) = default;
Holder(T *pt);
virtual ~Holder(void) = default;
T & get(void) const;
T * getPt(void) const;
void reset(T *pt);
private:
std::unique_ptr<T> objPt_{nullptr};
};
#define DEFINE_ENV_ALIAS \
inline Environment & env(void) const\
{\
return Environment::getInstance();\
}
class Environment
{
SINGLETON(Environment);
public:
typedef SITE_SIZE_TYPE Size;
typedef std::unique_ptr<GridCartesian> GridPt;
typedef std::unique_ptr<GridRedBlackCartesian> GridRbPt;
typedef std::unique_ptr<GridParallelRNG> RngPt;
enum class Storage {object, cache, temporary};
private:
struct ObjInfo
{
Size size{0};
Storage storage{Storage::object};
unsigned int Ls{0};
const std::type_info *type{nullptr}, *derivedType{nullptr};
std::string name;
int module{-1};
std::unique_ptr<Object> data{nullptr};
};
public:
// grids
void createGrid(const unsigned int Ls);
void createCoarseGrid(const std::vector<int> &blockSize,
const unsigned int Ls = 1);
GridCartesian * getGrid(const unsigned int Ls = 1) const;
GridRedBlackCartesian * getRbGrid(const unsigned int Ls = 1) const;
GridCartesian * getCoarseGrid(const std::vector<int> &blockSize,
const unsigned int Ls = 1) const;
std::vector<int> getDim(void) const;
int getDim(const unsigned int mu) const;
unsigned int getNd(void) const;
double getVolume(void) const;
// random number generator
void setSeed(const std::vector<int> &seed);
GridParallelRNG * get4dRng(void) const;
// general memory management
void addObject(const std::string name,
const int moduleAddress = -1);
template <typename B, typename T, typename ... Ts>
void createDerivedObject(const std::string name,
const Environment::Storage storage,
const unsigned int Ls,
Ts && ... args);
template <typename T, typename ... Ts>
void createObject(const std::string name,
const Environment::Storage storage,
const unsigned int Ls,
Ts && ... args);
void setObjectModule(const unsigned int objAddress,
const int modAddress);
template <typename B, typename T>
T * getDerivedObject(const unsigned int address) const;
template <typename B, typename T>
T * getDerivedObject(const std::string name) const;
template <typename T>
T * getObject(const unsigned int address) const;
template <typename T>
T * getObject(const std::string name) const;
unsigned int getMaxAddress(void) const;
unsigned int getObjectAddress(const std::string name) const;
std::string getObjectName(const unsigned int address) const;
std::string getObjectType(const unsigned int address) const;
std::string getObjectType(const std::string name) const;
Size getObjectSize(const unsigned int address) const;
Size getObjectSize(const std::string name) const;
Storage getObjectStorage(const unsigned int address) const;
Storage getObjectStorage(const std::string name) const;
int getObjectModule(const unsigned int address) const;
int getObjectModule(const std::string name) const;
unsigned int getObjectLs(const unsigned int address) const;
unsigned int getObjectLs(const std::string name) const;
bool hasObject(const unsigned int address) const;
bool hasObject(const std::string name) const;
bool hasCreatedObject(const unsigned int address) const;
bool hasCreatedObject(const std::string name) const;
bool isObject5d(const unsigned int address) const;
bool isObject5d(const std::string name) const;
template <typename T>
bool isObjectOfType(const unsigned int address) const;
template <typename T>
bool isObjectOfType(const std::string name) const;
Environment::Size getTotalSize(void) const;
void freeObject(const unsigned int address);
void freeObject(const std::string name);
void freeAll(void);
void protectObjects(const bool protect);
bool objectsProtected(void) const;
// print environment content
void printContent(void) const;
private:
// general
double vol_;
bool protect_{true};
// grids
std::vector<int> dim_;
GridPt grid4d_;
std::map<unsigned int, GridPt> grid5d_;
GridRbPt gridRb4d_;
std::map<unsigned int, GridRbPt> gridRb5d_;
std::map<std::vector<int>, GridPt> gridCoarse4d_;
std::map<std::vector<int>, GridPt> gridCoarse5d_;
unsigned int nd_;
// random number generator
RngPt rng4d_;
// object store
std::vector<ObjInfo> object_;
std::map<std::string, unsigned int> objectAddress_;
};
/******************************************************************************
* Holder template implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename T>
Holder<T>::Holder(T *pt)
: objPt_(pt)
{}
// access //////////////////////////////////////////////////////////////////////
template <typename T>
T & Holder<T>::get(void) const
{
return *objPt_.get();
}
template <typename T>
T * Holder<T>::getPt(void) const
{
return objPt_.get();
}
template <typename T>
void Holder<T>::reset(T *pt)
{
objPt_.reset(pt);
}
/******************************************************************************
* Environment template implementation *
******************************************************************************/
// general memory management ///////////////////////////////////////////////////
template <typename B, typename T, typename ... Ts>
void Environment::createDerivedObject(const std::string name,
const Environment::Storage storage,
const unsigned int Ls,
Ts && ... args)
{
if (!hasObject(name))
{
addObject(name);
}
unsigned int address = getObjectAddress(name);
if (!object_[address].data or !objectsProtected())
{
MemoryStats memStats;
if (!MemoryProfiler::stats)
{
MemoryProfiler::stats = &memStats;
}
size_t initMem = MemoryProfiler::stats->currentlyAllocated;
object_[address].storage = storage;
object_[address].Ls = Ls;
object_[address].data.reset(new Holder<B>(new T(std::forward<Ts>(args)...)));
object_[address].size = MemoryProfiler::stats->maxAllocated - initMem;
object_[address].type = &typeid(B);
object_[address].derivedType = &typeid(T);
if (MemoryProfiler::stats == &memStats)
{
MemoryProfiler::stats = nullptr;
}
}
// object already exists, no error if it is a cache, error otherwise
else if ((object_[address].storage != Storage::cache) or
(object_[address].storage != storage) or
(object_[address].name != name) or
(object_[address].type != &typeid(B)) or
(object_[address].derivedType != &typeid(T)))
{
HADRONS_ERROR(Definition, "object '" + name + "' already allocated");
}
}
template <typename T, typename ... Ts>
void Environment::createObject(const std::string name,
const Environment::Storage storage,
const unsigned int Ls,
Ts && ... args)
{
createDerivedObject<T, T>(name, storage, Ls, std::forward<Ts>(args)...);
}
template <typename B, typename T>
T * Environment::getDerivedObject(const unsigned int address) const
{
if (hasObject(address))
{
if (hasCreatedObject(address))
{
if (auto h = dynamic_cast<Holder<B> *>(object_[address].data.get()))
{
if (&typeid(T) == &typeid(B))
{
return dynamic_cast<T *>(h->getPt());
}
else
{
if (auto hder = dynamic_cast<T *>(h->getPt()))
{
return hder;
}
else
{
HADRONS_ERROR(Definition, "object with address " + std::to_string(address) +
" cannot be casted to '" + typeName(&typeid(T)) +
"' (has type '" + typeName(&typeid(h->get())) + "')");
}
}
}
else
{
HADRONS_ERROR(Definition, "object with address " + std::to_string(address) +
" does not have type '" + typeName(&typeid(B)) +
"' (has type '" + getObjectType(address) + "')");
}
}
else
{
HADRONS_ERROR(Definition, "object with address " + std::to_string(address) +
" is empty");
}
}
else
{
HADRONS_ERROR(Definition, "no object with address " + std::to_string(address));
}
}
template <typename B, typename T>
T * Environment::getDerivedObject(const std::string name) const
{
return getDerivedObject<B, T>(getObjectAddress(name));
}
template <typename T>
T * Environment::getObject(const unsigned int address) const
{
return getDerivedObject<T, T>(address);
}
template <typename T>
T * Environment::getObject(const std::string name) const
{
return getObject<T>(getObjectAddress(name));
}
template <typename T>
bool Environment::isObjectOfType(const unsigned int address) const
{
if (hasObject(address))
{
if (auto h = dynamic_cast<Holder<T> *>(object_[address].data.get()))
{
return true;
}
else
{
return false;
}
}
else
{
HADRONS_ERROR(Definition, "no object with address " + std::to_string(address));
}
}
template <typename T>
bool Environment::isObjectOfType(const std::string name) const
{
return isObjectOfType<T>(getObjectAddress(name));
}
END_HADRONS_NAMESPACE
#endif // Hadrons_Environment_hpp_

View File

@@ -0,0 +1,81 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Exceptions.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Exceptions.hpp>
#include <Grid/Hadrons/VirtualMachine.hpp>
#include <Grid/Hadrons/Module.hpp>
#ifndef ERR_SUFF
#define ERR_SUFF " (" + loc + ")"
#endif
#define CONST_EXC(name, init) \
name::name(std::string msg, std::string loc)\
:init\
{}
using namespace Grid;
using namespace Hadrons;
using namespace Exceptions;
// logic errors
CONST_EXC(Logic, logic_error(msg + ERR_SUFF))
CONST_EXC(Definition, Logic("definition error: " + msg, loc))
CONST_EXC(Implementation, Logic("implementation error: " + msg, loc))
CONST_EXC(Range, Logic("range error: " + msg, loc))
CONST_EXC(Size, Logic("size error: " + msg, loc))
// runtime errors
CONST_EXC(Runtime, runtime_error(msg + ERR_SUFF))
CONST_EXC(Argument, Runtime("argument error: " + msg, loc))
CONST_EXC(Io, Runtime("IO error: " + msg, loc))
CONST_EXC(Memory, Runtime("memory error: " + msg, loc))
CONST_EXC(Parsing, Runtime("parsing error: " + msg, loc))
CONST_EXC(Program, Runtime("program error: " + msg, loc))
CONST_EXC(System, Runtime("system error: " + msg, loc))
// abort functions
void Grid::Hadrons::Exceptions::abort(const std::exception& e)
{
auto &vm = VirtualMachine::getInstance();
int mod = vm.getCurrentModule();
LOG(Error) << "FATAL ERROR -- Exception " << typeName(&typeid(e))
<< std::endl;
if (mod >= 0)
{
LOG(Error) << "During execution of module '"
<< vm.getModuleName(mod) << "' (address " << mod << ")"
<< std::endl;
}
LOG(Error) << e.what() << std::endl;
LOG(Error) << "Aborting program" << std::endl;
Grid_finalize();
exit(EXIT_FAILURE);
}

View File

@@ -0,0 +1,75 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Exceptions.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_Exceptions_hpp_
#define Hadrons_Exceptions_hpp_
#include <stdexcept>
#ifndef Hadrons_Global_hpp_
#include <Grid/Hadrons/Global.hpp>
#endif
#define HADRONS_SRC_LOC std::string(__FUNCTION__) + " at " \
+ std::string(__FILE__) + ":" + std::to_string(__LINE__)
#define HADRONS_ERROR(exc, msg)\
throw(Exceptions::exc(msg, HADRONS_SRC_LOC));
#define DECL_EXC(name, base) \
class name: public base\
{\
public:\
name(std::string msg, std::string loc);\
}
BEGIN_HADRONS_NAMESPACE
namespace Exceptions
{
// logic errors
DECL_EXC(Logic, std::logic_error);
DECL_EXC(Definition, Logic);
DECL_EXC(Implementation, Logic);
DECL_EXC(Range, Logic);
DECL_EXC(Size, Logic);
// runtime errors
DECL_EXC(Runtime, std::runtime_error);
DECL_EXC(Argument, Runtime);
DECL_EXC(Io, Runtime);
DECL_EXC(Memory, Runtime);
DECL_EXC(Parsing, Runtime);
DECL_EXC(Program, Runtime);
DECL_EXC(System, Runtime);
// abort functions
void abort(const std::exception& e);
}
END_HADRONS_NAMESPACE
#endif // Hadrons_Exceptions_hpp_

105
extras/Hadrons/Factory.hpp Normal file
View File

@@ -0,0 +1,105 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Factory.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_Factory_hpp_
#define Hadrons_Factory_hpp_
#include <Grid/Hadrons/Global.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* abstract factory class *
******************************************************************************/
template <typename T>
class Factory
{
public:
typedef std::function<std::unique_ptr<T>(const std::string)> Func;
public:
// constructor
Factory(void) = default;
// destructor
virtual ~Factory(void) = default;
// registration
void registerBuilder(const std::string type, const Func &f);
// get builder list
std::vector<std::string> getBuilderList(void) const;
// factory
std::unique_ptr<T> create(const std::string type,
const std::string name) const;
private:
std::map<std::string, Func> builder_;
};
/******************************************************************************
* template implementation *
******************************************************************************/
// registration ////////////////////////////////////////////////////////////////
template <typename T>
void Factory<T>::registerBuilder(const std::string type, const Func &f)
{
builder_[type] = f;
}
// get module list /////////////////////////////////////////////////////////////
template <typename T>
std::vector<std::string> Factory<T>::getBuilderList(void) const
{
std::vector<std::string> list;
for (auto &b: builder_)
{
list.push_back(b.first);
}
return list;
}
// factory /////////////////////////////////////////////////////////////////////
template <typename T>
std::unique_ptr<T> Factory<T>::create(const std::string type,
const std::string name) const
{
Func func;
try
{
func = builder_.at(type);
}
catch (std::out_of_range &)
{
HADRONS_ERROR(Argument, "object of type '" + type + "' unknown");
}
return func(name);
}
END_HADRONS_NAMESPACE
#endif // Hadrons_Factory_hpp_

View File

@@ -0,0 +1,323 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/GeneticScheduler.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_GeneticScheduler_hpp_
#define Hadrons_GeneticScheduler_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Graph.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Scheduler based on a genetic algorithm *
******************************************************************************/
template <typename V, typename T>
class GeneticScheduler
{
public:
typedef std::vector<T> Gene;
typedef std::pair<Gene *, Gene *> GenePair;
typedef std::function<V(const Gene &)> ObjFunc;
struct Parameters
{
double mutationRate;
unsigned int popSize, seed;
};
public:
// constructor
GeneticScheduler(Graph<T> &graph, const ObjFunc &func,
const Parameters &par);
// destructor
virtual ~GeneticScheduler(void) = default;
// access
const Gene & getMinSchedule(void);
V getMinValue(void);
// reset population
void initPopulation(void);
// breed a new generation
void nextGeneration(void);
// heuristic benchmarks
void benchmarkCrossover(const unsigned int nIt);
// print population
friend std::ostream & operator<<(std::ostream &out,
const GeneticScheduler<V, T> &s)
{
out << "[";
for (auto &p: s.population_)
{
out << p.first << ", ";
}
out << "\b\b]";
return out;
}
private:
void doCrossover(void);
void doMutation(void);
// genetic operators
GenePair selectPair(void);
void crossover(Gene &c1, Gene &c2, const Gene &p1, const Gene &p2);
void mutation(Gene &m, const Gene &c);
private:
Graph<T> &graph_;
const ObjFunc &func_;
const Parameters par_;
std::multimap<V, Gene> population_;
std::mt19937 gen_;
};
/******************************************************************************
* template implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename V, typename T>
GeneticScheduler<V, T>::GeneticScheduler(Graph<T> &graph, const ObjFunc &func,
const Parameters &par)
: graph_(graph)
, func_(func)
, par_(par)
{
gen_.seed(par_.seed);
}
// access //////////////////////////////////////////////////////////////////////
template <typename V, typename T>
const typename GeneticScheduler<V, T>::Gene &
GeneticScheduler<V, T>::getMinSchedule(void)
{
return population_.begin()->second;
}
template <typename V, typename T>
V GeneticScheduler<V, T>::getMinValue(void)
{
return population_.begin()->first;
}
// breed a new generation //////////////////////////////////////////////////////
template <typename V, typename T>
void GeneticScheduler<V, T>::nextGeneration(void)
{
// random initialization of the population if necessary
if (population_.size() != par_.popSize)
{
initPopulation();
}
//LOG(Debug) << "Starting population:\n" << *this << std::endl;
// random mutations
//PARALLEL_FOR_LOOP
for (unsigned int i = 0; i < par_.popSize; ++i)
{
doMutation();
}
//LOG(Debug) << "After mutations:\n" << *this << std::endl;
// mating
//PARALLEL_FOR_LOOP
for (unsigned int i = 0; i < par_.popSize/2; ++i)
{
doCrossover();
}
//LOG(Debug) << "After mating:\n" << *this << std::endl;
// grim reaper
auto it = population_.begin();
std::advance(it, par_.popSize);
population_.erase(it, population_.end());
//LOG(Debug) << "After grim reaper:\n" << *this << std::endl;
}
// evolution steps /////////////////////////////////////////////////////////////
template <typename V, typename T>
void GeneticScheduler<V, T>::initPopulation(void)
{
population_.clear();
for (unsigned int i = 0; i < par_.popSize; ++i)
{
auto p = graph_.topoSort(gen_);
population_.insert(std::make_pair(func_(p), p));
}
}
template <typename V, typename T>
void GeneticScheduler<V, T>::doCrossover(void)
{
auto p = selectPair();
Gene &p1 = *(p.first), &p2 = *(p.second);
Gene c1, c2;
crossover(c1, c2, p1, p2);
PARALLEL_CRITICAL
{
population_.insert(std::make_pair(func_(c1), c1));
population_.insert(std::make_pair(func_(c2), c2));
}
}
template <typename V, typename T>
void GeneticScheduler<V, T>::doMutation(void)
{
std::uniform_real_distribution<double> mdis(0., 1.);
std::uniform_int_distribution<unsigned int> pdis(0, population_.size() - 1);
if (mdis(gen_) < par_.mutationRate)
{
Gene m;
auto it = population_.begin();
std::advance(it, pdis(gen_));
mutation(m, it->second);
PARALLEL_CRITICAL
{
population_.insert(std::make_pair(func_(m), m));
}
}
}
// genetic operators ///////////////////////////////////////////////////////////
template <typename V, typename T>
typename GeneticScheduler<V, T>::GenePair GeneticScheduler<V, T>::selectPair(void)
{
std::vector<double> prob;
unsigned int ind;
Gene *p1, *p2;
const double max = population_.rbegin()->first;
for (auto &c: population_)
{
prob.push_back(std::exp((c.first-1.)/max));
}
std::discrete_distribution<unsigned int> dis1(prob.begin(), prob.end());
auto rIt = population_.begin();
ind = dis1(gen_);
std::advance(rIt, ind);
p1 = &(rIt->second);
prob[ind] = 0.;
std::discrete_distribution<unsigned int> dis2(prob.begin(), prob.end());
rIt = population_.begin();
std::advance(rIt, dis2(gen_));
p2 = &(rIt->second);
return std::make_pair(p1, p2);
}
template <typename V, typename T>
void GeneticScheduler<V, T>::crossover(Gene &c1, Gene &c2, const Gene &p1,
const Gene &p2)
{
Gene buf;
std::uniform_int_distribution<unsigned int> dis(0, p1.size() - 1);
unsigned int cut = dis(gen_);
c1.clear();
buf = p2;
for (unsigned int i = 0; i < cut; ++i)
{
c1.push_back(p1[i]);
buf.erase(std::find(buf.begin(), buf.end(), p1[i]));
}
for (unsigned int i = 0; i < buf.size(); ++i)
{
c1.push_back(buf[i]);
}
c2.clear();
buf = p2;
for (unsigned int i = cut; i < p1.size(); ++i)
{
buf.erase(std::find(buf.begin(), buf.end(), p1[i]));
}
for (unsigned int i = 0; i < buf.size(); ++i)
{
c2.push_back(buf[i]);
}
for (unsigned int i = cut; i < p1.size(); ++i)
{
c2.push_back(p1[i]);
}
}
template <typename V, typename T>
void GeneticScheduler<V, T>::mutation(Gene &m, const Gene &c)
{
Gene buf;
std::uniform_int_distribution<unsigned int> dis(0, c.size() - 1);
unsigned int cut = dis(gen_);
Graph<T> g1 = graph_, g2 = graph_;
for (unsigned int i = 0; i < cut; ++i)
{
g1.removeVertex(c[i]);
}
for (unsigned int i = cut; i < c.size(); ++i)
{
g2.removeVertex(c[i]);
}
if (g1.size() > 0)
{
buf = g1.topoSort(gen_);
}
if (g2.size() > 0)
{
m = g2.topoSort(gen_);
}
for (unsigned int i = cut; i < c.size(); ++i)
{
m.push_back(buf[i - cut]);
}
}
template <typename V, typename T>
void GeneticScheduler<V, T>::benchmarkCrossover(const unsigned int nIt)
{
Gene p1, p2, c1, c2;
double neg = 0., eq = 0., pos = 0., total;
int improvement;
LOG(Message) << "Benchmarking crossover..." << std::endl;
for (unsigned int i = 0; i < nIt; ++i)
{
p1 = graph_.topoSort(gen_);
p2 = graph_.topoSort(gen_);
crossover(c1, c2, p1, p2);
improvement = (func_(c1) + func_(c2) - func_(p1) - func_(p2))/2;
if (improvement < 0) neg++; else if (improvement == 0) eq++; else pos++;
}
total = neg + eq + pos;
LOG(Message) << " -: " << neg/total << " =: " << eq/total
<< " +: " << pos/total << std::endl;
}
END_HADRONS_NAMESPACE
#endif // Hadrons_GeneticScheduler_hpp_

175
extras/Hadrons/Global.cc Normal file
View File

@@ -0,0 +1,175 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Global.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Global.hpp>
using namespace Grid;
using namespace QCD;
using namespace Hadrons;
HadronsLogger Hadrons::HadronsLogError(1,"Error");
HadronsLogger Hadrons::HadronsLogWarning(1,"Warning");
HadronsLogger Hadrons::HadronsLogMessage(1,"Message");
HadronsLogger Hadrons::HadronsLogIterative(1,"Iterative");
HadronsLogger Hadrons::HadronsLogDebug(1,"Debug");
HadronsLogger Hadrons::HadronsLogIRL(1,"IRL");
void Hadrons::initLogger(void)
{
auto w = std::string("Hadrons").length();
int cw = 8;
GridLogError.setTopWidth(w);
GridLogWarning.setTopWidth(w);
GridLogMessage.setTopWidth(w);
GridLogIterative.setTopWidth(w);
GridLogDebug.setTopWidth(w);
GridLogIRL.setTopWidth(w);
GridLogError.setChanWidth(cw);
GridLogWarning.setChanWidth(cw);
GridLogMessage.setChanWidth(cw);
GridLogIterative.setChanWidth(cw);
GridLogDebug.setChanWidth(cw);
GridLogIRL.setChanWidth(cw);
HadronsLogError.Active(true);
HadronsLogWarning.Active(true);
HadronsLogMessage.Active(GridLogMessage.isActive());
HadronsLogIterative.Active(GridLogIterative.isActive());
HadronsLogDebug.Active(GridLogDebug.isActive());
HadronsLogIRL.Active(GridLogIRL.isActive());
HadronsLogError.setChanWidth(cw);
HadronsLogWarning.setChanWidth(cw);
HadronsLogMessage.setChanWidth(cw);
HadronsLogIterative.setChanWidth(cw);
HadronsLogDebug.setChanWidth(cw);
HadronsLogIRL.setChanWidth(cw);
}
// type utilities //////////////////////////////////////////////////////////////
constexpr unsigned int maxNameSize = 1024u;
std::string Hadrons::typeName(const std::type_info *info)
{
char *buf;
std::string name;
buf = abi::__cxa_demangle(info->name(), nullptr, nullptr, nullptr);
name = buf;
free(buf);
return name;
}
// default writers/readers /////////////////////////////////////////////////////
#ifdef HAVE_HDF5
const std::string Hadrons::resultFileExt = "h5";
#else
const std::string Hadrons::resultFileExt = "xml";
#endif
// recursive mkdir /////////////////////////////////////////////////////////////
int Hadrons::mkdir(const std::string dirName)
{
if (!dirName.empty() and access(dirName.c_str(), R_OK|W_OK|X_OK))
{
mode_t mode755;
char tmp[MAX_PATH_LENGTH];
char *p = NULL;
size_t len;
mode755 = S_IRWXU|S_IRGRP|S_IXGRP|S_IROTH|S_IXOTH;
snprintf(tmp, sizeof(tmp), "%s", dirName.c_str());
len = strlen(tmp);
if(tmp[len - 1] == '/')
{
tmp[len - 1] = 0;
}
for(p = tmp + 1; *p; p++)
{
if(*p == '/')
{
*p = 0;
::mkdir(tmp, mode755);
*p = '/';
}
}
return ::mkdir(tmp, mode755);
}
else
{
return 0;
}
}
std::string Hadrons::basename(const std::string &s)
{
constexpr char sep = '/';
size_t i = s.rfind(sep, s.length());
if (i != std::string::npos)
{
return s.substr(i+1, s.length() - i);
}
else
{
return s;
}
}
std::string Hadrons::dirname(const std::string &s)
{
constexpr char sep = '/';
size_t i = s.rfind(sep, s.length());
if (i != std::string::npos)
{
return s.substr(0, i);
}
else
{
return "";
}
}
void Hadrons::makeFileDir(const std::string filename, GridBase *g)
{
if (g->IsBoss())
{
std::string dir = dirname(filename);
int status = mkdir(dir);
if (status)
{
HADRONS_ERROR(Io, "cannot create directory '" + dir
+ "' ( " + std::strerror(errno) + ")");
}
}
}

218
extras/Hadrons/Global.hpp Normal file
View File

@@ -0,0 +1,218 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Global.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_Global_hpp_
#define Hadrons_Global_hpp_
#include <set>
#include <stack>
#include <Grid/Grid.h>
#include <cxxabi.h>
#ifndef SITE_SIZE_TYPE
#define SITE_SIZE_TYPE size_t
#endif
#ifndef DEFAULT_ASCII_PREC
#define DEFAULT_ASCII_PREC 16
#endif
/* the 'using Grid::operator<<;' statement prevents a very nasty compilation
* error with GCC 5 (clang & GCC 6 compile fine without it).
*/
#define BEGIN_HADRONS_NAMESPACE \
namespace Grid {\
using namespace QCD;\
namespace Hadrons {\
using Grid::operator<<;\
using Grid::operator>>;
#define END_HADRONS_NAMESPACE }}
#define BEGIN_MODULE_NAMESPACE(name)\
namespace name {\
using Grid::operator<<;\
using Grid::operator>>;
#define END_MODULE_NAMESPACE }
#ifndef FIMPL
#define FIMPL WilsonImplR
#endif
#ifndef ZFIMPL
#define ZFIMPL ZWilsonImplR
#endif
#ifndef SIMPL
#define SIMPL ScalarImplCR
#endif
#ifndef GIMPL
#define GIMPL PeriodicGimplR
#endif
BEGIN_HADRONS_NAMESPACE
// type aliases
#define FERM_TYPE_ALIASES(FImpl, suffix)\
typedef FermionOperator<FImpl> FMat##suffix; \
typedef typename FImpl::FermionField FermionField##suffix; \
typedef typename FImpl::PropagatorField PropagatorField##suffix; \
typedef typename FImpl::SitePropagator::scalar_object SitePropagator##suffix; \
typedef std::vector<SitePropagator##suffix> SlicedPropagator##suffix;
#define GAUGE_TYPE_ALIASES(FImpl, suffix)\
typedef typename FImpl::DoubledGaugeField DoubledGaugeField##suffix;
#define SCALAR_TYPE_ALIASES(SImpl, suffix)\
typedef typename SImpl::Field ScalarField##suffix;\
typedef typename SImpl::Field PropagatorField##suffix;
#define SOLVER_TYPE_ALIASES(FImpl, suffix)\
typedef Solver<FImpl> Solver##suffix;
#define SINK_TYPE_ALIASES(suffix)\
typedef std::function<SlicedPropagator##suffix\
(const PropagatorField##suffix &)> SinkFn##suffix;
#define FG_TYPE_ALIASES(FImpl, suffix)\
FERM_TYPE_ALIASES(FImpl, suffix)\
GAUGE_TYPE_ALIASES(FImpl, suffix)
// logger
class HadronsLogger: public Logger
{
public:
HadronsLogger(int on, std::string nm): Logger("Hadrons", on, nm,
GridLogColours, "BLACK"){};
};
#define LOG(channel) std::cout << HadronsLog##channel
#define HADRONS_DEBUG_VAR(var) LOG(Debug) << #var << "= " << (var) << std::endl;
extern HadronsLogger HadronsLogError;
extern HadronsLogger HadronsLogWarning;
extern HadronsLogger HadronsLogMessage;
extern HadronsLogger HadronsLogIterative;
extern HadronsLogger HadronsLogDebug;
extern HadronsLogger HadronsLogIRL;
void initLogger(void);
// singleton pattern
#define SINGLETON(name)\
public:\
name(const name &e) = delete;\
void operator=(const name &e) = delete;\
static name & getInstance(void)\
{\
static name e;\
return e;\
}\
private:\
name(void);
#define SINGLETON_DEFCTOR(name)\
public:\
name(const name &e) = delete;\
void operator=(const name &e) = delete;\
static name & getInstance(void)\
{\
static name e;\
return e;\
}\
private:\
name(void) = default;
// type utilities
template <typename T>
const std::type_info * typeIdPt(const T &x)
{
return &typeid(x);
}
std::string typeName(const std::type_info *info);
template <typename T>
const std::type_info * typeIdPt(void)
{
return &typeid(T);
}
template <typename T>
std::string typeName(const T &x)
{
return typeName(typeIdPt(x));
}
template <typename T>
std::string typeName(void)
{
return typeName(typeIdPt<T>());
}
// default writers/readers
extern const std::string resultFileExt;
#ifdef HAVE_HDF5
typedef Hdf5Reader ResultReader;
typedef Hdf5Writer ResultWriter;
#else
typedef XmlReader ResultReader;
typedef XmlWriter ResultWriter;
#endif
#define RESULT_FILE_NAME(name) \
name + "." + std::to_string(vm().getTrajectory()) + "." + resultFileExt
// recursive mkdir
#define MAX_PATH_LENGTH 512u
int mkdir(const std::string dirName);
std::string basename(const std::string &s);
std::string dirname(const std::string &s);
void makeFileDir(const std::string filename, GridBase *g);
// default Schur convention
#ifndef HADRONS_DEFAULT_SCHUR
#define HADRONS_DEFAULT_SCHUR DiagTwo
#endif
#define _HADRONS_SCHUR_OP_(conv) Schur##conv##Operator
#define HADRONS_SCHUR_OP(conv) _HADRONS_SCHUR_OP_(conv)
#define HADRONS_DEFAULT_SCHUR_OP HADRONS_SCHUR_OP(HADRONS_DEFAULT_SCHUR)
#define _HADRONS_SCHUR_SOLVE_(conv) SchurRedBlack##conv##Solve
#define HADRONS_SCHUR_SOLVE(conv) _HADRONS_SCHUR_SOLVE_(conv)
#define HADRONS_DEFAULT_SCHUR_SOLVE HADRONS_SCHUR_SOLVE(HADRONS_DEFAULT_SCHUR)
// stringify macro
#define _HADRONS_STR(x) #x
#define HADRONS_STR(x) _HADRONS_STR(x)
END_HADRONS_NAMESPACE
#include <Grid/Hadrons/Exceptions.hpp>
#endif // Hadrons_Global_hpp_

759
extras/Hadrons/Graph.hpp Normal file
View File

@@ -0,0 +1,759 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Graph.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_Graph_hpp_
#define Hadrons_Graph_hpp_
#include <Grid/Hadrons/Global.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Oriented graph class *
******************************************************************************/
// I/O for edges
template <typename T>
std::ostream & operator<<(std::ostream &out, const std::pair<T, T> &e)
{
out << "\"" << e.first << "\" -> \"" << e.second << "\"";
return out;
}
// main class
template <typename T>
class Graph
{
public:
typedef std::pair<T, T> Edge;
public:
// constructor
Graph(void);
// destructor
virtual ~Graph(void) = default;
// access
void addVertex(const T &value);
void addEdge(const Edge &e);
void addEdge(const T &start, const T &end);
std::vector<T> getVertices(void) const;
void removeVertex(const T &value);
void removeEdge(const Edge &e);
void removeEdge(const T &start, const T &end);
unsigned int size(void) const;
// tests
bool gotValue(const T &value) const;
// graph topological manipulations
std::vector<T> getAdjacentVertices(const T &value) const;
std::vector<T> getChildren(const T &value) const;
std::vector<T> getParents(const T &value) const;
std::vector<T> getRoots(void) const;
std::vector<Graph<T>> getConnectedComponents(void) const;
std::vector<T> topoSort(void);
template <typename Gen>
std::vector<T> topoSort(Gen &gen);
std::vector<std::vector<T>> allTopoSort(void);
// I/O
friend std::ostream & operator<<(std::ostream &out, const Graph<T> &g)
{
out << "{";
for (auto &e: g.edgeSet_)
{
out << e << ", ";
}
if (g.edgeSet_.size() != 0)
{
out << "\b\b";
}
out << "}";
return out;
}
private:
// vertex marking
void mark(const T &value, const bool doMark = true);
void markAll(const bool doMark = true);
void unmark(const T &value);
void unmarkAll(void);
bool isMarked(const T &value) const;
const T * getFirstMarked(const bool isMarked = true) const;
template <typename Gen>
const T * getRandomMarked(const bool isMarked, Gen &gen);
const T * getFirstUnmarked(void) const;
template <typename Gen>
const T * getRandomUnmarked(Gen &gen);
// prune marked/unmarked vertices
void removeMarked(const bool isMarked = true);
void removeUnmarked(void);
// depth-first search marking
void depthFirstSearch(void);
void depthFirstSearch(const T &root);
private:
std::map<T, bool> isMarked_;
std::set<Edge> edgeSet_;
};
// build depedency matrix from topological sorts
template <typename T>
std::map<T, std::map<T, bool>>
makeDependencyMatrix(const std::vector<std::vector<T>> &topSort);
/******************************************************************************
* template implementation *
******************************************************************************
* in all the following V is the number of vertex and E is the number of edge
* in the worst case E = V^2
*/
// constructor /////////////////////////////////////////////////////////////////
template <typename T>
Graph<T>::Graph(void)
{}
// access //////////////////////////////////////////////////////////////////////
// complexity: log(V)
template <typename T>
void Graph<T>::addVertex(const T &value)
{
isMarked_[value] = false;
}
// complexity: O(log(V))
template <typename T>
void Graph<T>::addEdge(const Edge &e)
{
addVertex(e.first);
addVertex(e.second);
edgeSet_.insert(e);
}
// complexity: O(log(V))
template <typename T>
void Graph<T>::addEdge(const T &start, const T &end)
{
addEdge(Edge(start, end));
}
template <typename T>
std::vector<T> Graph<T>::getVertices(void) const
{
std::vector<T> vertex;
for (auto &v: isMarked_)
{
vertex.push_back(v.first);
}
return vertex;
}
// complexity: O(V*log(V))
template <typename T>
void Graph<T>::removeVertex(const T &value)
{
// remove vertex from the mark table
auto vIt = isMarked_.find(value);
if (vIt != isMarked_.end())
{
isMarked_.erase(vIt);
}
else
{
HADRONS_ERROR(Range, "vertex does not exists");
}
// remove all edges containing the vertex
auto pred = [&value](const Edge &e)
{
return ((e.first == value) or (e.second == value));
};
auto eIt = find_if(edgeSet_.begin(), edgeSet_.end(), pred);
while (eIt != edgeSet_.end())
{
edgeSet_.erase(eIt);
eIt = find_if(edgeSet_.begin(), edgeSet_.end(), pred);
}
}
// complexity: O(log(V))
template <typename T>
void Graph<T>::removeEdge(const Edge &e)
{
auto eIt = edgeSet_.find(e);
if (eIt != edgeSet_.end())
{
edgeSet_.erase(eIt);
}
else
{
HADRONS_ERROR(Range, "edge does not exists");
}
}
// complexity: O(log(V))
template <typename T>
void Graph<T>::removeEdge(const T &start, const T &end)
{
removeEdge(Edge(start, end));
}
// complexity: O(1)
template <typename T>
unsigned int Graph<T>::size(void) const
{
return isMarked_.size();
}
// tests ///////////////////////////////////////////////////////////////////////
// complexity: O(log(V))
template <typename T>
bool Graph<T>::gotValue(const T &value) const
{
auto it = isMarked_.find(value);
if (it == isMarked_.end())
{
return false;
}
else
{
return true;
}
}
// vertex marking //////////////////////////////////////////////////////////////
// complexity: O(log(V))
template <typename T>
void Graph<T>::mark(const T &value, const bool doMark)
{
if (gotValue(value))
{
isMarked_[value] = doMark;
}
else
{
HADRONS_ERROR(Range, "vertex does not exists");
}
}
// complexity: O(V*log(V))
template <typename T>
void Graph<T>::markAll(const bool doMark)
{
for (auto &v: isMarked_)
{
mark(v.first, doMark);
}
}
// complexity: O(log(V))
template <typename T>
void Graph<T>::unmark(const T &value)
{
mark(value, false);
}
// complexity: O(V*log(V))
template <typename T>
void Graph<T>::unmarkAll(void)
{
markAll(false);
}
// complexity: O(log(V))
template <typename T>
bool Graph<T>::isMarked(const T &value) const
{
if (gotValue(value))
{
return isMarked_.at(value);
}
else
{
HADRONS_ERROR(Range, "vertex does not exists");
return false;
}
}
// complexity: O(log(V))
template <typename T>
const T * Graph<T>::getFirstMarked(const bool isMarked) const
{
auto pred = [&isMarked](const std::pair<T, bool> &v)
{
return (v.second == isMarked);
};
auto vIt = std::find_if(isMarked_.begin(), isMarked_.end(), pred);
if (vIt != isMarked_.end())
{
return &(vIt->first);
}
else
{
return nullptr;
}
}
// complexity: O(log(V))
template <typename T>
template <typename Gen>
const T * Graph<T>::getRandomMarked(const bool isMarked, Gen &gen)
{
auto pred = [&isMarked](const std::pair<T, bool> &v)
{
return (v.second == isMarked);
};
std::uniform_int_distribution<unsigned int> dis(0, size() - 1);
auto rIt = isMarked_.begin();
std::advance(rIt, dis(gen));
auto vIt = std::find_if(rIt, isMarked_.end(), pred);
if (vIt != isMarked_.end())
{
return &(vIt->first);
}
else
{
vIt = std::find_if(isMarked_.begin(), rIt, pred);
if (vIt != rIt)
{
return &(vIt->first);
}
else
{
return nullptr;
}
}
}
// complexity: O(log(V))
template <typename T>
const T * Graph<T>::getFirstUnmarked(void) const
{
return getFirstMarked(false);
}
// complexity: O(log(V))
template <typename T>
template <typename Gen>
const T * Graph<T>::getRandomUnmarked(Gen &gen)
{
return getRandomMarked(false, gen);
}
// prune marked/unmarked vertices //////////////////////////////////////////////
// complexity: O(V^2*log(V))
template <typename T>
void Graph<T>::removeMarked(const bool isMarked)
{
auto isMarkedCopy = isMarked_;
for (auto &v: isMarkedCopy)
{
if (v.second == isMarked)
{
removeVertex(v.first);
}
}
}
// complexity: O(V^2*log(V))
template <typename T>
void Graph<T>::removeUnmarked(void)
{
removeMarked(false);
}
// depth-first search marking //////////////////////////////////////////////////
// complexity: O(V*log(V))
template <typename T>
void Graph<T>::depthFirstSearch(void)
{
depthFirstSearch(isMarked_.begin()->first);
}
// complexity: O(V*log(V))
template <typename T>
void Graph<T>::depthFirstSearch(const T &root)
{
std::vector<T> adjacentVertex;
mark(root);
adjacentVertex = getAdjacentVertices(root);
for (auto &v: adjacentVertex)
{
if (!isMarked(v))
{
depthFirstSearch(v);
}
}
}
// graph topological manipulations /////////////////////////////////////////////
// complexity: O(V*log(V))
template <typename T>
std::vector<T> Graph<T>::getAdjacentVertices(const T &value) const
{
std::vector<T> adjacentVertex;
auto pred = [&value](const Edge &e)
{
return ((e.first == value) or (e.second == value));
};
auto eIt = std::find_if(edgeSet_.begin(), edgeSet_.end(), pred);
while (eIt != edgeSet_.end())
{
if (eIt->first == value)
{
adjacentVertex.push_back((*eIt).second);
}
else if (eIt->second == value)
{
adjacentVertex.push_back((*eIt).first);
}
eIt = std::find_if(++eIt, edgeSet_.end(), pred);
}
return adjacentVertex;
}
// complexity: O(V*log(V))
template <typename T>
std::vector<T> Graph<T>::getChildren(const T &value) const
{
std::vector<T> child;
auto pred = [&value](const Edge &e)
{
return (e.first == value);
};
auto eIt = std::find_if(edgeSet_.begin(), edgeSet_.end(), pred);
while (eIt != edgeSet_.end())
{
child.push_back((*eIt).second);
eIt = std::find_if(++eIt, edgeSet_.end(), pred);
}
return child;
}
// complexity: O(V*log(V))
template <typename T>
std::vector<T> Graph<T>::getParents(const T &value) const
{
std::vector<T> parent;
auto pred = [&value](const Edge &e)
{
return (e.second == value);
};
auto eIt = std::find_if(edgeSet_.begin(), edgeSet_.end(), pred);
while (eIt != edgeSet_.end())
{
parent.push_back((*eIt).first);
eIt = std::find_if(++eIt, edgeSet_.end(), pred);
}
return parent;
}
// complexity: O(V^2*log(V))
template <typename T>
std::vector<T> Graph<T>::getRoots(void) const
{
std::vector<T> root;
for (auto &v: isMarked_)
{
auto parent = getParents(v.first);
if (parent.size() == 0)
{
root.push_back(v.first);
}
}
return root;
}
// complexity: O(V^2*log(V))
template <typename T>
std::vector<Graph<T>> Graph<T>::getConnectedComponents(void) const
{
std::vector<Graph<T>> res;
Graph<T> copy(*this);
while (copy.size() > 0)
{
copy.depthFirstSearch();
res.push_back(copy);
res.back().removeUnmarked();
res.back().unmarkAll();
copy.removeMarked();
copy.unmarkAll();
}
return res;
}
// topological sort using a directed DFS algorithm
// complexity: O(V*log(V))
template <typename T>
std::vector<T> Graph<T>::topoSort(void)
{
std::stack<T> buf;
std::vector<T> res;
const T *vPt;
std::map<T, bool> tmpMarked(isMarked_);
// visit function
std::function<void(const T &)> visit = [&](const T &v)
{
if (tmpMarked.at(v))
{
HADRONS_ERROR(Range, "cannot topologically sort a cyclic graph");
}
if (!isMarked(v))
{
std::vector<T> child = getChildren(v);
tmpMarked[v] = true;
for (auto &c: child)
{
visit(c);
}
mark(v);
tmpMarked[v] = false;
buf.push(v);
}
};
// reset temporary marks
for (auto &v: tmpMarked)
{
tmpMarked.at(v.first) = false;
}
// loop on unmarked vertices
unmarkAll();
vPt = getFirstUnmarked();
while (vPt)
{
visit(*vPt);
vPt = getFirstUnmarked();
}
unmarkAll();
// create result vector
while (!buf.empty())
{
res.push_back(buf.top());
buf.pop();
}
return res;
}
// random version of the topological sort
// complexity: O(V*log(V))
template <typename T>
template <typename Gen>
std::vector<T> Graph<T>::topoSort(Gen &gen)
{
std::stack<T> buf;
std::vector<T> res;
const T *vPt;
std::map<T, bool> tmpMarked(isMarked_);
// visit function
std::function<void(const T &)> visit = [&](const T &v)
{
if (tmpMarked.at(v))
{
HADRONS_ERROR(Range, "cannot topologically sort a cyclic graph");
}
if (!isMarked(v))
{
std::vector<T> child = getChildren(v);
tmpMarked[v] = true;
std::shuffle(child.begin(), child.end(), gen);
for (auto &c: child)
{
visit(c);
}
mark(v);
tmpMarked[v] = false;
buf.push(v);
}
};
// reset temporary marks
for (auto &v: tmpMarked)
{
tmpMarked.at(v.first) = false;
}
// loop on unmarked vertices
unmarkAll();
vPt = getRandomUnmarked(gen);
while (vPt)
{
visit(*vPt);
vPt = getRandomUnmarked(gen);
}
unmarkAll();
// create result vector
while (!buf.empty())
{
res.push_back(buf.top());
buf.pop();
}
return res;
}
// generate all possible topological sorts
// Y. L. Varol & D. Rotem, Comput. J. 24(1), pp. 8384, 1981
// http://comjnl.oupjournals.org/cgi/doi/10.1093/comjnl/24.1.83
// complexity: O(V*log(V)) (from the paper, but really ?)
template <typename T>
std::vector<std::vector<T>> Graph<T>::allTopoSort(void)
{
std::vector<std::vector<T>> res;
std::map<T, std::map<T, bool>> iMat;
// create incidence matrix
for (auto &v1: isMarked_)
for (auto &v2: isMarked_)
{
iMat[v1.first][v2.first] = false;
}
for (auto &v: isMarked_)
{
auto cVec = getChildren(v.first);
for (auto &c: cVec)
{
iMat[v.first][c] = true;
}
}
// generate initial topological sort
res.push_back(topoSort());
// generate all other topological sorts by permutation
std::vector<T> p = res[0];
const unsigned int n = size();
std::vector<unsigned int> loc(n);
unsigned int i, k, k1;
T obj_k, obj_k1;
bool isFinal;
for (unsigned int j = 0; j < n; ++j)
{
loc[j] = j;
}
i = 0;
while (i < n-1)
{
k = loc[i];
k1 = k + 1;
obj_k = p[k];
if (k1 >= n)
{
isFinal = true;
obj_k1 = obj_k;
}
else
{
isFinal = false;
obj_k1 = p[k1];
}
if (iMat[res[0][i]][obj_k1] or isFinal)
{
for (unsigned int l = k; l >= i + 1; --l)
{
p[l] = p[l-1];
}
p[i] = obj_k;
loc[i] = i;
i++;
}
else
{
p[k] = obj_k1;
p[k1] = obj_k;
loc[i] = k1;
i = 0;
res.push_back(p);
}
}
return res;
}
// build depedency matrix from topological sorts ///////////////////////////////
// complexity: something like O(V^2*log(V!))
template <typename T>
std::map<T, std::map<T, bool>>
makeDependencyMatrix(const std::vector<std::vector<T>> &topSort)
{
std::map<T, std::map<T, bool>> m;
const std::vector<T> &vList = topSort[0];
for (auto &v1: vList)
for (auto &v2: vList)
{
bool dep = true;
for (auto &t: topSort)
{
auto i1 = std::find(t.begin(), t.end(), v1);
auto i2 = std::find(t.begin(), t.end(), v2);
dep = dep and (i1 - i2 > 0);
if (!dep) break;
}
m[v1][v2] = dep;
}
return m;
}
END_HADRONS_NAMESPACE
#endif // Hadrons_Graph_hpp_

View File

@@ -0,0 +1,80 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/HadronsXmlRun.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Application.hpp>
using namespace Grid;
using namespace QCD;
using namespace Hadrons;
int main(int argc, char *argv[])
{
// parse command line
std::string parameterFileName, scheduleFileName = "";
if (argc < 2)
{
std::cerr << "usage: " << argv[0] << " <parameter file> [<precomputed schedule>] [Grid options]";
std::cerr << std::endl;
std::exit(EXIT_FAILURE);
}
parameterFileName = argv[1];
if (argc > 2)
{
if (argv[2][0] != '-')
{
scheduleFileName = argv[2];
}
}
// initialization
Grid_init(&argc, &argv);
// execution
try
{
Application application(parameterFileName);
application.parseParameterFile(parameterFileName);
if (!scheduleFileName.empty())
{
application.loadSchedule(scheduleFileName);
}
application.run();
}
catch (const std::exception& e)
{
Exceptions::abort(e);
}
// epilogue
LOG(Message) << "Grid is finalizing now" << std::endl;
Grid_finalize();
return EXIT_SUCCESS;
}

View File

@@ -0,0 +1,34 @@
lib_LIBRARIES = libHadrons.a
bin_PROGRAMS = HadronsXmlRun
include modules.inc
libHadrons_a_SOURCES = \
$(modules_cc) \
Application.cc \
Environment.cc \
Exceptions.cc \
Global.cc \
Module.cc \
VirtualMachine.cc
libHadrons_adir = $(pkgincludedir)/Hadrons
nobase_libHadrons_a_HEADERS = \
$(modules_hpp) \
AllToAllVectors.hpp \
AllToAllReduction.hpp \
Application.hpp \
EigenPack.hpp \
Environment.hpp \
Exceptions.hpp \
Factory.hpp \
GeneticScheduler.hpp \
Global.hpp \
Graph.hpp \
Module.hpp \
Modules.hpp \
ModuleFactory.hpp \
Solver.hpp \
VirtualMachine.hpp
HadronsXmlRun_SOURCES = HadronsXmlRun.cc
HadronsXmlRun_LDADD = libHadrons.a -lGrid

61
extras/Hadrons/Module.cc Normal file
View File

@@ -0,0 +1,61 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Module.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Module.hpp>
using namespace Grid;
using namespace QCD;
using namespace Hadrons;
/******************************************************************************
* ModuleBase implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
ModuleBase::ModuleBase(const std::string name)
: name_(name)
{}
// access //////////////////////////////////////////////////////////////////////
std::string ModuleBase::getName(void) const
{
return name_;
}
// get factory registration name if available
std::string ModuleBase::getRegisteredName(void)
{
HADRONS_ERROR(Definition, "module '" + getName() + "' has no registered type"
+ " in the factory");
}
// execution ///////////////////////////////////////////////////////////////////
void ModuleBase::operator()(void)
{
setup();
execute();
}

262
extras/Hadrons/Module.hpp Normal file
View File

@@ -0,0 +1,262 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Module.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_Module_hpp_
#define Hadrons_Module_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/VirtualMachine.hpp>
BEGIN_HADRONS_NAMESPACE
// module registration macros
#define MODULE_REGISTER(mod, base, ns)\
class mod: public base\
{\
public:\
typedef base Base;\
using Base::Base;\
virtual std::string getRegisteredName(void)\
{\
return std::string(#ns "::" #mod);\
}\
};\
class ns##mod##ModuleRegistrar\
{\
public:\
ns##mod##ModuleRegistrar(void)\
{\
ModuleFactory &modFac = ModuleFactory::getInstance();\
modFac.registerBuilder(#ns "::" #mod, [&](const std::string name)\
{\
return std::unique_ptr<ns::mod>(new ns::mod(name));\
});\
}\
};\
static ns##mod##ModuleRegistrar ns##mod##ModuleRegistrarInstance;
#define MODULE_REGISTER_TMP(mod, base, ns)\
extern template class base;\
MODULE_REGISTER(mod, ARG(base), ns);
#define ARG(...) __VA_ARGS__
#define MACRO_REDIRECT(arg1, arg2, arg3, macro, ...) macro
#define envGet(type, name)\
*env().template getObject<type>(name)
#define envGetDerived(base, type, name)\
*env().template getDerivedObject<base, type>(name)
#define envGetTmp(type, var)\
type &var = *env().template getObject<type>(getName() + "_tmp_" + #var)
#define envHasType(type, name)\
env().template isObjectOfType<type>(name)
#define envCreate(type, name, Ls, ...)\
env().template createObject<type>(name, Environment::Storage::object, Ls, __VA_ARGS__)
#define envCreateDerived(base, type, name, Ls, ...)\
env().template createDerivedObject<base, type>(name, Environment::Storage::object, Ls, __VA_ARGS__)
#define envCreateLat4(type, name)\
envCreate(type, name, 1, env().getGrid())
#define envCreateLat5(type, name, Ls)\
envCreate(type, name, Ls, env().getGrid(Ls))
#define envCreateLat(...)\
MACRO_REDIRECT(__VA_ARGS__, envCreateLat5, envCreateLat4)(__VA_ARGS__)
#define envCache(type, name, Ls, ...)\
env().template createObject<type>(name, Environment::Storage::cache, Ls, __VA_ARGS__)
#define envCacheLat4(type, name)\
envCache(type, name, 1, env().getGrid())
#define envCacheLat5(type, name, Ls)\
envCache(type, name, Ls, env().getGrid(Ls))
#define envCacheLat(...)\
MACRO_REDIRECT(__VA_ARGS__, envCacheLat5, envCacheLat4)(__VA_ARGS__)
#define envTmp(type, name, Ls, ...)\
env().template createObject<type>(getName() + "_tmp_" + name, \
Environment::Storage::temporary, Ls, __VA_ARGS__)
#define envTmpLat4(type, name)\
envTmp(type, name, 1, env().getGrid())
#define envTmpLat5(type, name, Ls)\
envTmp(type, name, Ls, env().getGrid(Ls))
#define envTmpLat(...)\
MACRO_REDIRECT(__VA_ARGS__, envTmpLat5, envTmpLat4)(__VA_ARGS__)
#define saveResult(ioStem, name, result)\
if (env().getGrid()->IsBoss() and !ioStem.empty())\
{\
makeFileDir(ioStem, env().getGrid());\
{\
ResultWriter _writer(RESULT_FILE_NAME(ioStem));\
write(_writer, name, result);\
}\
}
/******************************************************************************
* Module class *
******************************************************************************/
// base class
class ModuleBase
{
public:
// constructor
ModuleBase(const std::string name);
// destructor
virtual ~ModuleBase(void) = default;
// access
std::string getName(void) const;
// get factory registration name if available
virtual std::string getRegisteredName(void);
// dependencies/products
virtual std::vector<std::string> getInput(void) = 0;
virtual std::vector<std::string> getReference(void)
{
return std::vector<std::string>(0);
};
virtual std::vector<std::string> getOutput(void) = 0;
// parse parameters
virtual void parseParameters(XmlReader &reader, const std::string name) = 0;
virtual void saveParameters(XmlWriter &writer, const std::string name) = 0;
// parameter string
virtual std::string parString(void) const = 0;
// setup
virtual void setup(void) {};
virtual void execute(void) = 0;
// execution
void operator()(void);
protected:
// environment shortcut
DEFINE_ENV_ALIAS;
// virtual machine shortcut
DEFINE_VM_ALIAS;
private:
std::string name_;
};
// derived class, templating the parameter class
template <typename P>
class Module: public ModuleBase
{
public:
typedef P Par;
public:
// constructor
Module(const std::string name);
// destructor
virtual ~Module(void) = default;
// parse parameters
virtual void parseParameters(XmlReader &reader, const std::string name);
virtual void saveParameters(XmlWriter &writer, const std::string name);
// parameter string
virtual std::string parString(void) const;
// parameter access
const P & par(void) const;
void setPar(const P &par);
private:
P par_;
};
// no parameter type
class NoPar {};
template <>
class Module<NoPar>: public ModuleBase
{
public:
// constructor
Module(const std::string name): ModuleBase(name) {};
// destructor
virtual ~Module(void) = default;
// parse parameters (do nothing)
virtual void parseParameters(XmlReader &reader, const std::string name) {};
virtual void saveParameters(XmlWriter &writer, const std::string name)
{
push(writer, "options");
pop(writer);
};
// parameter string (empty)
virtual std::string parString(void) const {return "";};
};
/******************************************************************************
* Template implementation *
******************************************************************************/
template <typename P>
Module<P>::Module(const std::string name)
: ModuleBase(name)
{}
template <typename P>
void Module<P>::parseParameters(XmlReader &reader, const std::string name)
{
read(reader, name, par_);
}
template <typename P>
void Module<P>::saveParameters(XmlWriter &writer, const std::string name)
{
write(writer, name, par_);
}
template <typename P>
std::string Module<P>::parString(void) const
{
XmlWriter writer("", "");
write(writer, par_.SerialisableClassName(), par_);
return writer.string();
}
template <typename P>
const P & Module<P>::par(void) const
{
return par_;
}
template <typename P>
void Module<P>::setPar(const P &par)
{
par_ = par;
}
END_HADRONS_NAMESPACE
#endif // Hadrons_Module_hpp_

View File

@@ -0,0 +1,48 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/ModuleFactory.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_ModuleFactory_hpp_
#define Hadrons_ModuleFactory_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Factory.hpp>
#include <Grid/Hadrons/Module.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* ModuleFactory *
******************************************************************************/
class ModuleFactory: public Factory<ModuleBase>
{
SINGLETON_DEFCTOR(ModuleFactory)
};
END_HADRONS_NAMESPACE
#endif // Hadrons_ModuleFactory_hpp_

View File

@@ -0,0 +1,62 @@
#include <Grid/Hadrons/Modules/MScalarSUN/TrKinetic.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/TimeMomProbe.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/StochFreeField.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/TwoPointNPR.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/Grad.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/TransProj.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/Div.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/TrMag.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/ShiftProbe.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/Utils.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/EMT.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/TwoPoint.hpp>
#include <Grid/Hadrons/Modules/MScalarSUN/TrPhi.hpp>
#include <Grid/Hadrons/Modules/MScalar/FreeProp.hpp>
#include <Grid/Hadrons/Modules/MScalar/Scalar.hpp>
#include <Grid/Hadrons/Modules/MScalar/ScalarVP.hpp>
#include <Grid/Hadrons/Modules/MScalar/ChargedProp.hpp>
#include <Grid/Hadrons/Modules/MScalar/VPCounterTerms.hpp>
#include <Grid/Hadrons/Modules/MLoop/NoiseLoop.hpp>
#include <Grid/Hadrons/Modules/MIO/LoadEigenPack.hpp>
#include <Grid/Hadrons/Modules/MIO/LoadCoarseEigenPack.hpp>
#include <Grid/Hadrons/Modules/MIO/LoadBinary.hpp>
#include <Grid/Hadrons/Modules/MIO/LoadNersc.hpp>
#include <Grid/Hadrons/Modules/MSink/Smear.hpp>
#include <Grid/Hadrons/Modules/MSink/Point.hpp>
#include <Grid/Hadrons/Modules/MFermion/FreeProp.hpp>
#include <Grid/Hadrons/Modules/MFermion/GaugeProp.hpp>
#include <Grid/Hadrons/Modules/MGauge/FundtoHirep.hpp>
#include <Grid/Hadrons/Modules/MGauge/Random.hpp>
#include <Grid/Hadrons/Modules/MGauge/StoutSmearing.hpp>
#include <Grid/Hadrons/Modules/MGauge/Unit.hpp>
#include <Grid/Hadrons/Modules/MGauge/StochEm.hpp>
#include <Grid/Hadrons/Modules/MGauge/UnitEm.hpp>
#include <Grid/Hadrons/Modules/MUtilities/TestSeqGamma.hpp>
#include <Grid/Hadrons/Modules/MUtilities/TestSeqConserved.hpp>
#include <Grid/Hadrons/Modules/MSource/SeqConserved.hpp>
#include <Grid/Hadrons/Modules/MSource/Z2.hpp>
#include <Grid/Hadrons/Modules/MSource/Wall.hpp>
#include <Grid/Hadrons/Modules/MSource/SeqGamma.hpp>
#include <Grid/Hadrons/Modules/MSource/Point.hpp>
#include <Grid/Hadrons/Modules/MContraction/MesonFieldGamma.hpp>
#include <Grid/Hadrons/Modules/MContraction/WeakHamiltonianEye.hpp>
#include <Grid/Hadrons/Modules/MContraction/Baryon.hpp>
#include <Grid/Hadrons/Modules/MContraction/A2APionField.hpp>
#include <Grid/Hadrons/Modules/MContraction/Meson.hpp>
#include <Grid/Hadrons/Modules/MContraction/WeakHamiltonian.hpp>
#include <Grid/Hadrons/Modules/MContraction/WeakNeutral4ptDisc.hpp>
#include <Grid/Hadrons/Modules/MContraction/Gamma3pt.hpp>
#include <Grid/Hadrons/Modules/MContraction/DiscLoop.hpp>
#include <Grid/Hadrons/Modules/MContraction/WeakHamiltonianNonEye.hpp>
#include <Grid/Hadrons/Modules/MContraction/A2AMeson.hpp>
#include <Grid/Hadrons/Modules/MContraction/WardIdentity.hpp>
#include <Grid/Hadrons/Modules/MContraction/A2AMesonField.hpp>
#include <Grid/Hadrons/Modules/MAction/WilsonClover.hpp>
#include <Grid/Hadrons/Modules/MAction/ScaledDWF.hpp>
#include <Grid/Hadrons/Modules/MAction/MobiusDWF.hpp>
#include <Grid/Hadrons/Modules/MAction/Wilson.hpp>
#include <Grid/Hadrons/Modules/MAction/DWF.hpp>
#include <Grid/Hadrons/Modules/MAction/ZMobiusDWF.hpp>
#include <Grid/Hadrons/Modules/MSolver/RBPrecCG.hpp>
#include <Grid/Hadrons/Modules/MSolver/LocalCoherenceLanczos.hpp>
#include <Grid/Hadrons/Modules/MSolver/A2AVectors.hpp>

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MAction/DWF.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MAction/DWF.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MAction;
template class Grid::Hadrons::MAction::TDWF<FIMPL>;

View File

@@ -0,0 +1,136 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MAction/DWF.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MAction_DWF_hpp_
#define Hadrons_MAction_DWF_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Domain wall quark action *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MAction)
class DWFPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(DWFPar,
std::string, gauge,
unsigned int, Ls,
double , mass,
double , M5,
std::string , boundary);
};
template <typename FImpl>
class TDWF: public Module<DWFPar>
{
public:
FG_TYPE_ALIASES(FImpl,);
public:
// constructor
TDWF(const std::string name);
// destructor
virtual ~TDWF(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
extern template class TDWF<FIMPL>;
MODULE_REGISTER_TMP(DWF, TDWF<FIMPL>, MAction);
/******************************************************************************
* DWF template implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TDWF<FImpl>::TDWF(const std::string name)
: Module<DWFPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TDWF<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().gauge};
return in;
}
template <typename FImpl>
std::vector<std::string> TDWF<FImpl>::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TDWF<FImpl>::setup(void)
{
LOG(Message) << "Setting up domain wall fermion matrix with m= "
<< par().mass << ", M5= " << par().M5 << " and Ls= "
<< par().Ls << " using gauge field '" << par().gauge << "'"
<< std::endl;
LOG(Message) << "Fermion boundary conditions: " << par().boundary
<< std::endl;
env().createGrid(par().Ls);
auto &U = envGet(LatticeGaugeField, par().gauge);
auto &g4 = *env().getGrid();
auto &grb4 = *env().getRbGrid();
auto &g5 = *env().getGrid(par().Ls);
auto &grb5 = *env().getRbGrid(par().Ls);
std::vector<Complex> boundary = strToVec<Complex>(par().boundary);
typename DomainWallFermion<FImpl>::ImplParams implParams(boundary);
envCreateDerived(FMat, DomainWallFermion<FImpl>, getName(), par().Ls, U, g5,
grb5, g4, grb4, par().mass, par().M5, implParams);
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TDWF<FImpl>::execute(void)
{}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MAction_DWF_hpp_

View File

@@ -0,0 +1,7 @@
#include <Grid/Hadrons/Modules/MAction/MobiusDWF.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MAction;
template class Grid::Hadrons::MAction::TMobiusDWF<FIMPL>;

View File

@@ -0,0 +1,109 @@
#ifndef Hadrons_MAction_MobiusDWF_hpp_
#define Hadrons_MAction_MobiusDWF_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Mobius domain-wall fermion action *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MAction)
class MobiusDWFPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(MobiusDWFPar,
std::string , gauge,
unsigned int, Ls,
double , mass,
double , M5,
double , b,
double , c,
std::string , boundary);
};
template <typename FImpl>
class TMobiusDWF: public Module<MobiusDWFPar>
{
public:
FG_TYPE_ALIASES(FImpl,);
public:
// constructor
TMobiusDWF(const std::string name);
// destructor
virtual ~TMobiusDWF(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(MobiusDWF, TMobiusDWF<FIMPL>, MAction);
/******************************************************************************
* TMobiusDWF implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TMobiusDWF<FImpl>::TMobiusDWF(const std::string name)
: Module<MobiusDWFPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TMobiusDWF<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().gauge};
return in;
}
template <typename FImpl>
std::vector<std::string> TMobiusDWF<FImpl>::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TMobiusDWF<FImpl>::setup(void)
{
LOG(Message) << "Setting up Mobius domain wall fermion matrix with m= "
<< par().mass << ", M5= " << par().M5 << ", Ls= " << par().Ls
<< ", b= " << par().b << ", c= " << par().c
<< " using gauge field '" << par().gauge << "'"
<< std::endl;
LOG(Message) << "Fermion boundary conditions: " << par().boundary
<< std::endl;
env().createGrid(par().Ls);
auto &U = envGet(LatticeGaugeField, par().gauge);
auto &g4 = *env().getGrid();
auto &grb4 = *env().getRbGrid();
auto &g5 = *env().getGrid(par().Ls);
auto &grb5 = *env().getRbGrid(par().Ls);
std::vector<Complex> boundary = strToVec<Complex>(par().boundary);
typename MobiusFermion<FImpl>::ImplParams implParams(boundary);
envCreateDerived(FMat, MobiusFermion<FImpl>, getName(), par().Ls, U, g5,
grb5, g4, grb4, par().mass, par().M5, par().b, par().c,
implParams);
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TMobiusDWF<FImpl>::execute(void)
{}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MAction_MobiusDWF_hpp_

View File

@@ -0,0 +1,7 @@
#include <Grid/Hadrons/Modules/MAction/ScaledDWF.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MAction;
template class Grid::Hadrons::MAction::TScaledDWF<FIMPL>;

View File

@@ -0,0 +1,108 @@
#ifndef Hadrons_MAction_ScaledDWF_hpp_
#define Hadrons_MAction_ScaledDWF_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Scaled domain wall fermion *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MAction)
class ScaledDWFPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(ScaledDWFPar,
std::string , gauge,
unsigned int, Ls,
double , mass,
double , M5,
double , scale,
std::string , boundary);
};
template <typename FImpl>
class TScaledDWF: public Module<ScaledDWFPar>
{
public:
FG_TYPE_ALIASES(FImpl,);
public:
// constructor
TScaledDWF(const std::string name);
// destructor
virtual ~TScaledDWF(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(ScaledDWF, TScaledDWF<FIMPL>, MAction);
/******************************************************************************
* TScaledDWF implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TScaledDWF<FImpl>::TScaledDWF(const std::string name)
: Module<ScaledDWFPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TScaledDWF<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().gauge};
return in;
}
template <typename FImpl>
std::vector<std::string> TScaledDWF<FImpl>::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TScaledDWF<FImpl>::setup(void)
{
LOG(Message) << "Setting up scaled domain wall fermion matrix with m= "
<< par().mass << ", M5= " << par().M5 << ", Ls= " << par().Ls
<< ", scale= " << par().scale
<< " using gauge field '" << par().gauge << "'"
<< std::endl;
LOG(Message) << "Fermion boundary conditions: " << par().boundary
<< std::endl;
env().createGrid(par().Ls);
auto &U = envGet(LatticeGaugeField, par().gauge);
auto &g4 = *env().getGrid();
auto &grb4 = *env().getRbGrid();
auto &g5 = *env().getGrid(par().Ls);
auto &grb5 = *env().getRbGrid(par().Ls);
std::vector<Complex> boundary = strToVec<Complex>(par().boundary);
typename MobiusFermion<FImpl>::ImplParams implParams(boundary);
envCreateDerived(FMat, ScaledShamirFermion<FImpl>, getName(), par().Ls, U, g5,
grb5, g4, grb4, par().mass, par().M5, par().scale,
implParams);
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TScaledDWF<FImpl>::execute(void)
{}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MAction_ScaledDWF_hpp_

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MAction/Wilson.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MAction/Wilson.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MAction;
template class Grid::Hadrons::MAction::TWilson<FIMPL>;

View File

@@ -0,0 +1,128 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MAction/Wilson.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MAction_Wilson_hpp_
#define Hadrons_MAction_Wilson_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* TWilson quark action *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MAction)
class WilsonPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(WilsonPar,
std::string, gauge,
double , mass,
std::string, boundary);
};
template <typename FImpl>
class TWilson: public Module<WilsonPar>
{
public:
FG_TYPE_ALIASES(FImpl,);
public:
// constructor
TWilson(const std::string name);
// destructor
virtual ~TWilson(void) {};
// dependencies/products
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(Wilson, TWilson<FIMPL>, MAction);
/******************************************************************************
* TWilson template implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TWilson<FImpl>::TWilson(const std::string name)
: Module<WilsonPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TWilson<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().gauge};
return in;
}
template <typename FImpl>
std::vector<std::string> TWilson<FImpl>::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TWilson<FImpl>::setup(void)
{
LOG(Message) << "Setting up Wilson fermion matrix with m= " << par().mass
<< " using gauge field '" << par().gauge << "'" << std::endl;
LOG(Message) << "Fermion boundary conditions: " << par().boundary
<< std::endl;
auto &U = envGet(LatticeGaugeField, par().gauge);
auto &grid = *env().getGrid();
auto &gridRb = *env().getRbGrid();
std::vector<Complex> boundary = strToVec<Complex>(par().boundary);
typename WilsonFermion<FImpl>::ImplParams implParams(boundary);
envCreateDerived(FMat, WilsonFermion<FImpl>, getName(), 1, U, grid, gridRb,
par().mass, implParams);
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TWilson<FImpl>::execute()
{}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_Wilson_hpp_

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MAction/WilsonClover.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MAction/WilsonClover.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MAction;
template class Grid::Hadrons::MAction::TWilsonClover<FIMPL>;

View File

@@ -0,0 +1,137 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MAction/WilsonClover.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Guido Cossu <guido.cossu@ed.ac.uk>
Author: pretidav <david.preti@csic.es>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MAction_WilsonClover_hpp_
#define Hadrons_MAction_WilsonClover_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Wilson clover quark action *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MAction)
class WilsonCloverPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(WilsonCloverPar,
std::string, gauge,
double , mass,
double , csw_r,
double , csw_t,
WilsonAnisotropyCoefficients ,clover_anisotropy,
std::string, boundary
);
};
template <typename FImpl>
class TWilsonClover: public Module<WilsonCloverPar>
{
public:
FG_TYPE_ALIASES(FImpl,);
public:
// constructor
TWilsonClover(const std::string name);
// destructor
virtual ~TWilsonClover(void) {};
// dependencies/products
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(WilsonClover, TWilsonClover<FIMPL>, MAction);
/******************************************************************************
* TWilsonClover template implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TWilsonClover<FImpl>::TWilsonClover(const std::string name)
: Module<WilsonCloverPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TWilsonClover<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().gauge};
return in;
}
template <typename FImpl>
std::vector<std::string> TWilsonClover<FImpl>::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TWilsonClover<FImpl>::setup(void)
{
LOG(Message) << "Setting up Wilson clover fermion matrix with m= " << par().mass
<< " using gauge field '" << par().gauge << "'" << std::endl;
LOG(Message) << "Fermion boundary conditions: " << par().boundary
<< std::endl;
LOG(Message) << "Clover term csw_r: " << par().csw_r
<< " csw_t: " << par().csw_t
<< std::endl;
auto &U = envGet(LatticeGaugeField, par().gauge);
auto &grid = *env().getGrid();
auto &gridRb = *env().getRbGrid();
std::vector<Complex> boundary = strToVec<Complex>(par().boundary);
typename WilsonCloverFermion<FImpl>::ImplParams implParams(boundary);
envCreateDerived(FMat, WilsonCloverFermion<FImpl>, getName(), 1, U, grid, gridRb, par().mass,
par().csw_r,
par().csw_t,
par().clover_anisotropy,
implParams);
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TWilsonClover<FImpl>::execute()
{}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_WilsonClover_hpp_

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MAction/ZMobiusDWF.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MAction/ZMobiusDWF.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MAction;
template class Grid::Hadrons::MAction::TZMobiusDWF<ZFIMPL>;

View File

@@ -0,0 +1,143 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MAction/ZMobiusDWF.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MAction_ZMobiusDWF_hpp_
#define Hadrons_MAction_ZMobiusDWF_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* z-Mobius domain-wall fermion action *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MAction)
class ZMobiusDWFPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(ZMobiusDWFPar,
std::string , gauge,
unsigned int , Ls,
double , mass,
double , M5,
double , b,
double , c,
std::vector<std::complex<double>>, omega,
std::string , boundary);
};
template <typename FImpl>
class TZMobiusDWF: public Module<ZMobiusDWFPar>
{
public:
FG_TYPE_ALIASES(FImpl,);
public:
// constructor
TZMobiusDWF(const std::string name);
// destructor
virtual ~TZMobiusDWF(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(ZMobiusDWF, TZMobiusDWF<ZFIMPL>, MAction);
/******************************************************************************
* TZMobiusDWF implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TZMobiusDWF<FImpl>::TZMobiusDWF(const std::string name)
: Module<ZMobiusDWFPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TZMobiusDWF<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().gauge};
return in;
}
template <typename FImpl>
std::vector<std::string> TZMobiusDWF<FImpl>::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TZMobiusDWF<FImpl>::setup(void)
{
LOG(Message) << "Setting up z-Mobius domain wall fermion matrix with m= "
<< par().mass << ", M5= " << par().M5 << ", Ls= " << par().Ls
<< ", b= " << par().b << ", c= " << par().c
<< " using gauge field '" << par().gauge << "'"
<< std::endl;
LOG(Message) << "Omegas: " << std::endl;
for (unsigned int i = 0; i < par().omega.size(); ++i)
{
LOG(Message) << " omega[" << i << "]= " << par().omega[i] << std::endl;
}
LOG(Message) << "Fermion boundary conditions: " << par().boundary
<< std::endl;
env().createGrid(par().Ls);
auto &U = envGet(LatticeGaugeField, par().gauge);
auto &g4 = *env().getGrid();
auto &grb4 = *env().getRbGrid();
auto &g5 = *env().getGrid(par().Ls);
auto &grb5 = *env().getRbGrid(par().Ls);
auto omega = par().omega;
std::vector<Complex> boundary = strToVec<Complex>(par().boundary);
typename ZMobiusFermion<FImpl>::ImplParams implParams(boundary);
envCreateDerived(FMat, ZMobiusFermion<FImpl>, getName(), par().Ls, U, g5,
grb5, g4, grb4, par().mass, par().M5, omega,
par().b, par().c, implParams);
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TZMobiusDWF<FImpl>::execute(void)
{}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MAction_ZMobiusDWF_hpp_

View File

@@ -0,0 +1,8 @@
#include <Grid/Hadrons/Modules/MContraction/A2AMeson.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
template class Grid::Hadrons::MContraction::TA2AMeson<FIMPL>;
template class Grid::Hadrons::MContraction::TA2AMeson<ZFIMPL>;

View File

@@ -0,0 +1,207 @@
#ifndef Hadrons_MContraction_A2AMeson_hpp_
#define Hadrons_MContraction_A2AMeson_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
#include <Grid/Hadrons/AllToAllVectors.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* A2AMeson *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
typedef std::pair<Gamma::Algebra, Gamma::Algebra> GammaPair;
class A2AMesonPar : Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(A2AMesonPar,
int, Nl,
int, N,
std::string, A2A1,
std::string, A2A2,
std::string, gammas,
std::string, output);
};
template <typename FImpl>
class TA2AMeson : public Module<A2AMesonPar>
{
public:
FERM_TYPE_ALIASES(FImpl, );
SOLVER_TYPE_ALIASES(FImpl, );
typedef A2AModesSchurDiagTwo<typename FImpl::FermionField, FMat, Solver> A2ABase;
class Result : Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(Result,
Gamma::Algebra, gamma_snk,
Gamma::Algebra, gamma_src,
std::vector<Complex>, corr);
};
public:
// constructor
TA2AMeson(const std::string name);
// destructor
virtual ~TA2AMeson(void){};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
virtual void parseGammaString(std::vector<GammaPair> &gammaList);
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER(A2AMeson, ARG(TA2AMeson<FIMPL>), MContraction);
MODULE_REGISTER(ZA2AMeson, ARG(TA2AMeson<ZFIMPL>), MContraction);
/******************************************************************************
* TA2AMeson implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TA2AMeson<FImpl>::TA2AMeson(const std::string name)
: Module<A2AMesonPar>(name)
{
}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TA2AMeson<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().A2A1 + "_class", par().A2A2 + "_class"};
in.push_back(par().A2A1 + "_w_high_4d");
in.push_back(par().A2A2 + "_v_high_4d");
return in;
}
template <typename FImpl>
std::vector<std::string> TA2AMeson<FImpl>::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
template <typename FImpl>
void TA2AMeson<FImpl>::parseGammaString(std::vector<GammaPair> &gammaList)
{
gammaList.clear();
// Parse individual contractions from input string.
gammaList = strToVec<GammaPair>(par().gammas);
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TA2AMeson<FImpl>::setup(void)
{
int nt = env().getDim(Tp);
int N = par().N;
int Ls_ = env().getObjectLs(par().A2A1 + "_class");
envTmp(std::vector<FermionField>, "w1", 1, N, FermionField(env().getGrid(1)));
envTmp(std::vector<FermionField>, "v1", 1, N, FermionField(env().getGrid(1)));
envTmpLat(FermionField, "tmpv_5d", Ls_);
envTmpLat(FermionField, "tmpw_5d", Ls_);
envTmp(std::vector<ComplexD>, "MF_x", 1, nt);
envTmp(std::vector<ComplexD>, "MF_y", 1, nt);
envTmp(std::vector<ComplexD>, "tmp", 1, nt);
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TA2AMeson<FImpl>::execute(void)
{
LOG(Message) << "Computing A2A meson contractions" << std::endl;
Result result;
Gamma g5(Gamma::Algebra::Gamma5);
std::vector<GammaPair> gammaList;
int nt = env().getDim(Tp);
parseGammaString(gammaList);
result.gamma_snk = gammaList[0].first;
result.gamma_src = gammaList[0].second;
result.corr.resize(nt);
int Nl = par().Nl;
int N = par().N;
LOG(Message) << "N for A2A cont: " << N << std::endl;
envGetTmp(std::vector<ComplexD>, MF_x);
envGetTmp(std::vector<ComplexD>, MF_y);
envGetTmp(std::vector<ComplexD>, tmp);
for (unsigned int t = 0; t < nt; ++t)
{
tmp[t] = TensorRemove(MF_x[t] * MF_y[t] * 0.0);
}
Gamma gSnk(gammaList[0].first);
Gamma gSrc(gammaList[0].second);
auto &a2a1_fn = envGet(A2ABase, par().A2A1 + "_class");
envGetTmp(std::vector<FermionField>, w1);
envGetTmp(std::vector<FermionField>, v1);
envGetTmp(FermionField, tmpv_5d);
envGetTmp(FermionField, tmpw_5d);
LOG(Message) << "Finding v and w vectors for N = " << N << std::endl;
for (int i = 0; i < N; i++)
{
a2a1_fn.return_v(i, tmpv_5d, v1[i]);
a2a1_fn.return_w(i, tmpw_5d, w1[i]);
}
LOG(Message) << "Found v and w vectors for N = " << N << std::endl;
for (unsigned int i = 0; i < N; i++)
{
v1[i] = gSnk * v1[i];
}
int ty;
for (unsigned int i = 0; i < N; i++)
{
for (unsigned int j = 0; j < N; j++)
{
mySliceInnerProductVector(MF_x, w1[i], v1[j], Tp);
mySliceInnerProductVector(MF_y, w1[j], v1[i], Tp);
for (unsigned int t = 0; t < nt; ++t)
{
for (unsigned int tx = 0; tx < nt; tx++)
{
ty = (tx + t) % nt;
tmp[t] += TensorRemove((MF_x[tx]) * (MF_y[ty]));
}
}
}
if (i % 10 == 0)
{
LOG(Message) << "MF for i = " << i << " of " << N << std::endl;
}
}
double NTinv = 1.0 / static_cast<double>(nt);
for (unsigned int t = 0; t < nt; ++t)
{
result.corr[t] = NTinv * tmp[t];
}
saveResult(par().output, "meson", result);
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_A2AMeson_hpp_

View File

@@ -0,0 +1,8 @@
#include <Grid/Hadrons/Modules/MContraction/A2AMesonField.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
template class Grid::Hadrons::MContraction::TA2AMesonField<FIMPL>;
template class Grid::Hadrons::MContraction::TA2AMesonField<ZFIMPL>;

View File

@@ -0,0 +1,279 @@
#ifndef Hadrons_MContraction_A2AMesonField_hpp_
#define Hadrons_MContraction_A2AMesonField_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
#include <Grid/Hadrons/AllToAllVectors.hpp>
#include <Grid/Hadrons/Modules/MContraction/A2Autils.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* A2AMesonField *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
typedef std::pair<Gamma::Algebra, Gamma::Algebra> GammaPair;
class A2AMesonFieldPar : Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(A2AMesonFieldPar,
int, cacheBlock,
int, schurBlock,
int, Nmom,
int, N,
int, Nl,
std::string, A2A,
std::string, output);
};
template <typename FImpl>
class TA2AMesonField : public Module<A2AMesonFieldPar>
{
public:
FERM_TYPE_ALIASES(FImpl, );
SOLVER_TYPE_ALIASES(FImpl, );
typedef A2AModesSchurDiagTwo<typename FImpl::FermionField, FMat, Solver> A2ABase;
public:
// constructor
TA2AMesonField(const std::string name);
// destructor
virtual ~TA2AMesonField(void){};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER(A2AMesonField, ARG(TA2AMesonField<FIMPL>), MContraction);
MODULE_REGISTER(ZA2AMesonField, ARG(TA2AMesonField<ZFIMPL>), MContraction);
/******************************************************************************
* TA2AMesonField implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TA2AMesonField<FImpl>::TA2AMesonField(const std::string name)
: Module<A2AMesonFieldPar>(name)
{
}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TA2AMesonField<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().A2A + "_class"};
in.push_back(par().A2A + "_w_high_4d");
in.push_back(par().A2A + "_v_high_4d");
return in;
}
template <typename FImpl>
std::vector<std::string> TA2AMesonField<FImpl>::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TA2AMesonField<FImpl>::setup(void)
{
auto &a2a = envGet(A2ABase, par().A2A + "_class");
int nt = env().getDim(Tp);
int Nl = par().Nl;
int N = par().N;
int Ls_ = env().getObjectLs(par().A2A + "_class");
// Four D fields
envTmp(std::vector<FermionField>, "w", 1, par().schurBlock, FermionField(env().getGrid(1)));
envTmp(std::vector<FermionField>, "v", 1, par().schurBlock, FermionField(env().getGrid(1)));
// 5D tmp
envTmpLat(FermionField, "tmp_5d", Ls_);
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TA2AMesonField<FImpl>::execute(void)
{
LOG(Message) << "Computing A2A meson field" << std::endl;
auto &a2a = envGet(A2ABase, par().A2A + "_class");
// 2+6+4+4 = 16 gammas
// Ordering defined here
std::vector<Gamma::Algebra> gammas ( {
Gamma::Algebra::Gamma5,
Gamma::Algebra::Identity,
Gamma::Algebra::GammaX,
Gamma::Algebra::GammaY,
Gamma::Algebra::GammaZ,
Gamma::Algebra::GammaT,
Gamma::Algebra::GammaXGamma5,
Gamma::Algebra::GammaYGamma5,
Gamma::Algebra::GammaZGamma5,
Gamma::Algebra::GammaTGamma5,
Gamma::Algebra::SigmaXY,
Gamma::Algebra::SigmaXZ,
Gamma::Algebra::SigmaXT,
Gamma::Algebra::SigmaYZ,
Gamma::Algebra::SigmaYT,
Gamma::Algebra::SigmaZT
});
///////////////////////////////////////////////
// Square assumption for now Nl = Nr = N
///////////////////////////////////////////////
int nt = env().getDim(Tp);
int nx = env().getDim(Xp);
int ny = env().getDim(Yp);
int nz = env().getDim(Zp);
int N = par().N;
int Nl = par().Nl;
int ngamma = gammas.size();
int schurBlock = par().schurBlock;
int cacheBlock = par().cacheBlock;
int nmom = par().Nmom;
///////////////////////////////////////////////
// Momentum setup
///////////////////////////////////////////////
GridBase *grid = env().getGrid(1);
std::vector<LatticeComplex> phases(nmom,grid);
for(int m=0;m<nmom;m++){
phases[m] = Complex(1.0); // All zero momentum for now
}
Eigen::Tensor<ComplexD,5> mesonField (nmom,ngamma,nt,N,N);
LOG(Message) << "N = Nh+Nl for A2A MesonField is " << N << std::endl;
envGetTmp(std::vector<FermionField>, w);
envGetTmp(std::vector<FermionField>, v);
envGetTmp(FermionField, tmp_5d);
LOG(Message) << "Finding v and w vectors for N = " << N << std::endl;
//////////////////////////////////////////////////////////////////////////
// i,j is first loop over SchurBlock factors reusing 5D matrices
// ii,jj is second loop over cacheBlock factors for high perf contractoin
// iii,jjj are loops within cacheBlock
// Total index is sum of these i+ii+iii etc...
//////////////////////////////////////////////////////////////////////////
double flops = 0.0;
double bytes = 0.0;
double vol = nx*ny*nz*nt;
double t_schur=0;
double t_contr=0;
double t_int_0=0;
double t_int_1=0;
double t_int_2=0;
double t_int_3=0;
double t0 = usecond();
int N_i = N;
int N_j = N;
for(int i=0;i<N_i;i+=schurBlock){ //loop over SchurBlocking to suppress 5D matrix overhead
for(int j=0;j<N_j;j+=schurBlock){
///////////////////////////////////////////////////////////////
// Get the W and V vectors for this schurBlock^2 set of terms
///////////////////////////////////////////////////////////////
int N_ii = MIN(N_i-i,schurBlock);
int N_jj = MIN(N_j-j,schurBlock);
t_schur-=usecond();
for(int ii =0;ii < N_ii;ii++) a2a.return_w(i+ii, tmp_5d, w[ii]);
for(int jj =0;jj < N_jj;jj++) a2a.return_v(j+jj, tmp_5d, v[jj]);
t_schur+=usecond();
LOG(Message) << "Found w vectors " << i <<" .. " << i+N_ii-1 << std::endl;
LOG(Message) << "Found v vectors " << j <<" .. " << j+N_jj-1 << std::endl;
///////////////////////////////////////////////////////////////
// Series of cache blocked chunks of the contractions within this SchurBlock
///////////////////////////////////////////////////////////////
for(int ii=0;ii<N_ii;ii+=cacheBlock){
for(int jj=0;jj<N_jj;jj+=cacheBlock){
int N_iii = MIN(N_ii-ii,cacheBlock);
int N_jjj = MIN(N_jj-jj,cacheBlock);
Eigen::Tensor<ComplexD,5> mesonFieldBlocked(nmom,ngamma,nt,N_iii,N_jjj);
t_contr-=usecond();
A2Autils<FImpl>::MesonField(mesonFieldBlocked,
&w[ii],
&v[jj], gammas, phases,Tp);
t_contr+=usecond();
flops += vol * ( 2 * 8.0 + 6.0 + 8.0*nmom) * N_iii*N_jjj*ngamma;
bytes += vol * (12.0 * sizeof(Complex) ) * N_iii*N_jjj
+ vol * ( 2.0 * sizeof(Complex) *nmom ) * N_iii*N_jjj* ngamma;
///////////////////////////////////////////////////////////////
// Copy back to full meson field tensor
///////////////////////////////////////////////////////////////
parallel_for_nest2(int iii=0;iii< N_iii;iii++) {
for(int jjj=0;jjj< N_jjj;jjj++) {
for(int m =0;m< nmom;m++) {
for(int g =0;g< ngamma;g++) {
for(int t =0;t< nt;t++) {
mesonField(m,g,t,i+ii+iii,j+jj+jjj) = mesonFieldBlocked(m,g,t,iii,jjj);
}}}
}}
}}
}}
double nodes=grid->NodeCount();
double t1 = usecond();
LOG(Message) << " Contraction of MesonFields took "<<(t1-t0)/1.0e6<< " seconds " << std::endl;
LOG(Message) << " Schur "<<(t_schur)/1.0e6<< " seconds " << std::endl;
LOG(Message) << " Contr "<<(t_contr)/1.0e6<< " seconds " << std::endl;
/////////////////////////////////////////////////////////////////////////
// Test: Build the pion correlator (two end)
// < PI_ij(t0) PI_ji (t0+t) >
/////////////////////////////////////////////////////////////////////////
std::vector<ComplexD> corr(nt,ComplexD(0.0));
for(int i=0;i<N;i++){
for(int j=0;j<N;j++){
int m=0; // first momentum
int g=0; // first gamma in above ordering is gamma5 for pion
for(int t0=0;t0<nt;t0++){
for(int t=0;t<nt;t++){
int tt = (t0+t)%nt;
corr[t] += mesonField(m,g,t0,i,j)* mesonField(m,g,tt,j,i);
}}
}}
for(int t=0;t<nt;t++) corr[t] = corr[t]/ (double)nt;
for(int t=0;t<nt;t++) LOG(Message) << " " << t << " " << corr[t]<<std::endl;
// saveResult(par().output, "meson", result);
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_A2AMesonField_hpp_

View File

@@ -0,0 +1,8 @@
#include <Grid/Hadrons/Modules/MContraction/A2APionField.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
template class Grid::Hadrons::MContraction::TA2APionField<FIMPL>;
template class Grid::Hadrons::MContraction::TA2APionField<ZFIMPL>;

View File

@@ -0,0 +1,502 @@
#ifndef Hadrons_MContraction_A2APionField_hpp_
#define Hadrons_MContraction_A2APionField_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
#include <Grid/Hadrons/AllToAllVectors.hpp>
#include <Grid/Hadrons/Modules/MContraction/A2Autils.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* A2APionField *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
typedef std::pair<Gamma::Algebra, Gamma::Algebra> GammaPair;
class A2APionFieldPar : Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(A2APionFieldPar,
int, cacheBlock,
int, schurBlock,
int, Nmom,
std::string, A2A_i,
std::string, A2A_j,
std::string, output);
};
template <typename FImpl>
class TA2APionField : public Module<A2APionFieldPar>
{
public:
FERM_TYPE_ALIASES(FImpl, );
SOLVER_TYPE_ALIASES(FImpl, );
typedef typename FImpl::SiteSpinor vobj;
typedef typename vobj::scalar_object sobj;
typedef typename vobj::scalar_type scalar_type;
typedef typename vobj::vector_type vector_type;
typedef iSpinMatrix<vector_type> SpinMatrix_v;
typedef iSpinMatrix<scalar_type> SpinMatrix_s;
typedef iSinglet<vector_type> Scalar_v;
typedef iSinglet<scalar_type> Scalar_s;
typedef A2AModesSchurDiagTwo<typename FImpl::FermionField, FMat, Solver> A2ABase;
public:
// constructor
TA2APionField(const std::string name);
// destructor
virtual ~TA2APionField(void){};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER(A2APionField, ARG(TA2APionField<FIMPL>), MContraction);
MODULE_REGISTER(ZA2APionField, ARG(TA2APionField<ZFIMPL>), MContraction);
/******************************************************************************
* TA2APionField implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TA2APionField<FImpl>::TA2APionField(const std::string name)
: Module<A2APionFieldPar>(name)
{
}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TA2APionField<FImpl>::getInput(void)
{
std::vector<std::string> in;
in.push_back(par().A2A_i + "_class");
in.push_back(par().A2A_i + "_w_high_4d");
in.push_back(par().A2A_i + "_v_high_4d");
in.push_back(par().A2A_j + "_class");
in.push_back(par().A2A_j + "_w_high_4d");
in.push_back(par().A2A_j + "_v_high_4d");
return in;
}
template <typename FImpl>
std::vector<std::string> TA2APionField<FImpl>::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TA2APionField<FImpl>::setup(void)
{
// Four D fields
envTmp(std::vector<FermionField>, "wi", 1, par().schurBlock, FermionField(env().getGrid(1)));
envTmp(std::vector<FermionField>, "vi", 1, par().schurBlock, FermionField(env().getGrid(1)));
envTmp(std::vector<FermionField>, "wj", 1, par().schurBlock, FermionField(env().getGrid(1)));
envTmp(std::vector<FermionField>, "vj", 1, par().schurBlock, FermionField(env().getGrid(1)));
// 5D tmp
int Ls_i = env().getObjectLs(par().A2A_i + "_class");
envTmpLat(FermionField, "tmp_5d", Ls_i);
int Ls_j= env().getObjectLs(par().A2A_j + "_class");
assert ( Ls_i == Ls_j );
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TA2APionField<FImpl>::execute(void)
{
LOG(Message) << "Computing A2A Pion fields" << std::endl;
auto &a2a_i = envGet(A2ABase, par().A2A_i + "_class");
auto &a2a_j = envGet(A2ABase, par().A2A_j + "_class");
///////////////////////////////////////////////
// Square assumption for now Nl = Nr = N
///////////////////////////////////////////////
int nt = env().getDim(Tp);
int nx = env().getDim(Xp);
int ny = env().getDim(Yp);
int nz = env().getDim(Zp);
// int N_i = a2a_i.par().N;
// int N_j = a2a_j.par().N;
int N_i = a2a_i.getN();
int N_j = a2a_j.getN();
int nmom=par().Nmom;
int schurBlock = par().schurBlock;
int cacheBlock = par().cacheBlock;
///////////////////////////////////////////////
// Momentum setup
///////////////////////////////////////////////
GridBase *grid = env().getGrid(1);
std::vector<LatticeComplex> phases(nmom,grid);
for(int m=0;m<nmom;m++){
phases[m] = Complex(1.0); // All zero momentum for now
}
///////////////////////////////////////////////////////////////////////
// i and j represent different flavours, hits, with different ranks.
// in general non-square case.
///////////////////////////////////////////////////////////////////////
Eigen::Tensor<ComplexD,4> pionFieldWVmom_ij (nmom,nt,N_i,N_j);
Eigen::Tensor<ComplexD,3> pionFieldWV_ij (nt,N_i,N_j);
Eigen::Tensor<ComplexD,4> pionFieldWVmom_ji (nmom,nt,N_j,N_i);
Eigen::Tensor<ComplexD,3> pionFieldWV_ji (nt,N_j,N_i);
LOG(Message) << "Rank for A2A PionField is " << N_i << " x "<<N_j << std::endl;
envGetTmp(std::vector<FermionField>, wi);
envGetTmp(std::vector<FermionField>, vi);
envGetTmp(std::vector<FermionField>, wj);
envGetTmp(std::vector<FermionField>, vj);
envGetTmp(FermionField, tmp_5d);
LOG(Message) << "Finding v and w vectors " << std::endl;
//////////////////////////////////////////////////////////////////////////
// i,j is first loop over SchurBlock factors reusing 5D matrices
// ii,jj is second loop over cacheBlock factors for high perf contractoin
// iii,jjj are loops within cacheBlock
// Total index is sum of these i+ii+iii etc...
//////////////////////////////////////////////////////////////////////////
double flops = 0.0;
double bytes = 0.0;
double vol = nx*ny*nz*nt;
double vol3 = nx*ny*nz;
double t_schur=0;
double t_contr_vwm=0;
double t_contr_vw=0;
double t_contr_ww=0;
double t_contr_vv=0;
double tt0 = usecond();
for(int i=0;i<N_i;i+=schurBlock){ //loop over SchurBlocking to suppress 5D matrix overhead
for(int j=0;j<N_j;j+=schurBlock){
///////////////////////////////////////////////////////////////
// Get the W and V vectors for this schurBlock^2 set of terms
///////////////////////////////////////////////////////////////
int N_ii = MIN(N_i-i,schurBlock);
int N_jj = MIN(N_j-j,schurBlock);
t_schur-=usecond();
for(int ii =0;ii < N_ii;ii++) a2a_i.return_w(i+ii, tmp_5d, wi[ii]);
for(int jj =0;jj < N_jj;jj++) a2a_j.return_w(j+jj, tmp_5d, wj[jj]);
for(int ii =0;ii < N_ii;ii++) a2a_i.return_v(i+ii, tmp_5d, vi[ii]);
for(int jj =0;jj < N_jj;jj++) a2a_j.return_v(j+jj, tmp_5d, vj[jj]);
t_schur+=usecond();
LOG(Message) << "Found i w&v vectors " << i <<" .. " << i+N_ii-1 << std::endl;
LOG(Message) << "Found j w&v vectors " << j <<" .. " << j+N_jj-1 << std::endl;
///////////////////////////////////////////////////////////////
// Series of cache blocked chunks of the contractions within this SchurBlock
///////////////////////////////////////////////////////////////
for(int ii=0;ii<N_ii;ii+=cacheBlock){
for(int jj=0;jj<N_jj;jj+=cacheBlock){
int N_iii = MIN(N_ii-ii,cacheBlock);
int N_jjj = MIN(N_jj-jj,cacheBlock);
Eigen::Tensor<ComplexD,4> pionFieldWVmomB_ij(nmom,nt,N_iii,N_jjj);
Eigen::Tensor<ComplexD,4> pionFieldWVmomB_ji(nmom,nt,N_jjj,N_iii);
Eigen::Tensor<ComplexD,3> pionFieldWVB_ij(nt,N_iii,N_jjj);
Eigen::Tensor<ComplexD,3> pionFieldWVB_ji(nt,N_jjj,N_iii);
t_contr_vwm-=usecond();
A2Autils<FImpl>::PionFieldWVmom(pionFieldWVmomB_ij, &wi[ii], &vj[jj], phases,Tp);
A2Autils<FImpl>::PionFieldWVmom(pionFieldWVmomB_ji, &wj[jj], &vi[ii], phases,Tp);
t_contr_vwm+=usecond();
t_contr_vw-=usecond();
A2Autils<FImpl>::PionFieldWV(pionFieldWVB_ij, &wi[ii], &vj[jj],Tp);
A2Autils<FImpl>::PionFieldWV(pionFieldWVB_ji, &wj[jj], &vi[ii],Tp);
t_contr_vw+=usecond();
flops += vol * ( 2 * 8.0 + 6.0 + 8.0*nmom) * N_iii*N_jjj;
bytes += vol * (12.0 * sizeof(Complex) ) * N_iii*N_jjj
+ vol * ( 2.0 * sizeof(Complex) *nmom ) * N_iii*N_jjj;
///////////////////////////////////////////////////////////////
// Copy back to full meson field tensor
///////////////////////////////////////////////////////////////
parallel_for_nest2(int iii=0;iii< N_iii;iii++) {
for(int jjj=0;jjj< N_jjj;jjj++) {
for(int m =0;m< nmom;m++) {
for(int t =0;t< nt;t++) {
pionFieldWVmom_ij(m,t,i+ii+iii,j+jj+jjj) = pionFieldWVmomB_ij(m,t,iii,jjj);
pionFieldWVmom_ji(m,t,j+jj+jjj,i+ii+iii) = pionFieldWVmomB_ji(m,t,jjj,iii);
}}
for(int t =0;t< nt;t++) {
pionFieldWV_ij(t,i+ii+iii,j+jj+jjj) = pionFieldWVB_ij(t,iii,jjj);
pionFieldWV_ji(t,j+jj+jjj,i+ii+iii) = pionFieldWVB_ji(t,jjj,iii);
}
}}
}}
}}
double nodes=grid->NodeCount();
double tt1 = usecond();
LOG(Message) << " Contraction of PionFields took "<<(tt1-tt0)/1.0e6<< " seconds " << std::endl;
LOG(Message) << " Schur "<<(t_schur)/1.0e6<< " seconds " << std::endl;
LOG(Message) << " Contr WVmom "<<(t_contr_vwm)/1.0e6<< " seconds " << std::endl;
LOG(Message) << " Contr WV "<<(t_contr_vw)/1.0e6<< " seconds " << std::endl;
double t_kernel = t_contr_vwm;
LOG(Message) << " Arith "<<flops/(t_kernel)/1.0e3/nodes<< " Gflop/s / node " << std::endl;
LOG(Message) << " Arith "<<bytes/(t_kernel)/1.0e3/nodes<< " GB/s /node " << std::endl;
/////////////////////////////////////////////////////////////////////////
// Test: Build the pion correlator (two end)
// < PI_ij(t0) PI_ji (t0+t) >
/////////////////////////////////////////////////////////////////////////
std::vector<ComplexD> corrMom(nt,ComplexD(0.0));
for(int i=0;i<N_i;i++){
for(int j=0;j<N_j;j++){
int m=0; // first momentum
for(int t0=0;t0<nt;t0++){
for(int t=0;t<nt;t++){
int tt = (t0+t)%nt;
corrMom[t] += pionFieldWVmom_ij(m,t0,i,j)* pionFieldWVmom_ji(m,tt,j,i);
}}
}}
for(int t=0;t<nt;t++) corrMom[t] = corrMom[t]/ (double)nt;
for(int t=0;t<nt;t++) LOG(Message) << " C_vwm " << t << " " << corrMom[t]<<std::endl;
/////////////////////////////////////////////////////////////////////////
// Test: Build the pion correlator (two end) from zero mom contraction
// < PI_ij(t0) PI_ji (t0+t) >
/////////////////////////////////////////////////////////////////////////
std::vector<ComplexD> corr(nt,ComplexD(0.0));
for(int i=0;i<N_i;i++){
for(int j=0;j<N_j;j++){
for(int t0=0;t0<nt;t0++){
for(int t=0;t<nt;t++){
int tt = (t0+t)%nt;
corr[t] += pionFieldWV_ij(t0,i,j)* pionFieldWV_ji(tt,j,i);
}}
}}
for(int t=0;t<nt;t++) corr[t] = corr[t]/ (double)nt;
for(int t=0;t<nt;t++) LOG(Message) << " C_vw " << t << " " << corr[t]<<std::endl;
/////////////////////////////////////////////////////////////////////////
// Test: Build the pion correlator from zero mom contraction with revers
// charge flow
/////////////////////////////////////////////////////////////////////////
std::vector<ComplexD> corr_wwvv(nt,ComplexD(0.0));
wi.resize(N_i,grid);
vi.resize(N_i,grid);
wj.resize(N_j,grid);
vj.resize(N_j,grid);
for(int i =0;i < N_i;i++) a2a_i.return_v(i, tmp_5d, vi[i]);
for(int i =0;i < N_i;i++) a2a_i.return_w(i, tmp_5d, wi[i]);
for(int j =0;j < N_j;j++) a2a_j.return_v(j, tmp_5d, vj[j]);
for(int j =0;j < N_j;j++) a2a_j.return_w(j, tmp_5d, wj[j]);
Eigen::Tensor<ComplexD,3> pionFieldWW_ij (nt,N_i,N_j);
Eigen::Tensor<ComplexD,3> pionFieldVV_ji (nt,N_j,N_i);
Eigen::Tensor<ComplexD,3> pionFieldWW_ji (nt,N_j,N_i);
Eigen::Tensor<ComplexD,3> pionFieldVV_ij (nt,N_i,N_j);
A2Autils<FImpl>::PionFieldWW(pionFieldWW_ij, &wi[0], &wj[0],Tp);
A2Autils<FImpl>::PionFieldVV(pionFieldVV_ji, &vj[0], &vi[0],Tp);
A2Autils<FImpl>::PionFieldWW(pionFieldWW_ji, &wj[0], &wi[0],Tp);
A2Autils<FImpl>::PionFieldVV(pionFieldVV_ij, &vi[0], &vj[0],Tp);
for(int i=0;i<N_i;i++){
for(int j=0;j<N_j;j++){
for(int t0=0;t0<nt;t0++){
for(int t=0;t<nt;t++){
int tt = (t0+t)%nt;
corr_wwvv[t] += pionFieldWW_ij(t0,i,j)* pionFieldVV_ji(tt,j,i);
corr_wwvv[t] += pionFieldWW_ji(t0,j,i)* pionFieldVV_ij(tt,i,j);
}}
}}
for(int t=0;t<nt;t++) corr_wwvv[t] = corr_wwvv[t] / vol /2.0 ; // (ij+ji noise contribs if i!=j ).
for(int t=0;t<nt;t++) LOG(Message) << " C_wwvv " << t << " " << corr_wwvv[t]<<std::endl;
/////////////////////////////////////////////////////////////////////////
// This is only correct if there are NO low modes
// Use the "ii" case to construct possible Z wall source one end trick
/////////////////////////////////////////////////////////////////////////
std::vector<ComplexD> corr_z2(nt,ComplexD(0.0));
Eigen::Tensor<ComplexD,3> pionFieldWW (nt,N_i,N_i);
Eigen::Tensor<ComplexD,3> pionFieldVV (nt,N_i,N_i);
A2Autils<FImpl>::PionFieldWW(pionFieldWW, &wi[0], &wi[0],Tp);
A2Autils<FImpl>::PionFieldVV(pionFieldVV, &vi[0], &vi[0],Tp);
for(int i=0;i<N_i;i++){
for(int t0=0;t0<nt;t0++){
for(int t=0;t<nt;t++){
int tt = (t0+t)%nt;
corr_z2[t] += pionFieldWW(t0,i,i) * pionFieldVV(tt,i,i) /vol ;
}}
}
LOG(Message) << " C_z2 WARNING only correct if Nl == 0 "<<std::endl;
for(int t=0;t<nt;t++) LOG(Message) << " C_z2 " << t << " " << corr_z2[t]<<std::endl;
/////////////////////////////////////////////////////////////////////////
// Test: Build a bag contraction
/////////////////////////////////////////////////////////////////////////
Eigen::Tensor<ComplexD,2> DeltaF2_fig8 (nt,16);
Eigen::Tensor<ComplexD,2> DeltaF2_trtr (nt,16);
Eigen::Tensor<ComplexD,1> denom0 (nt);
Eigen::Tensor<ComplexD,1> denom1 (nt);
const int dT=16;
A2Autils<FImpl>::DeltaFeq2 (dT,dT,DeltaF2_fig8,DeltaF2_trtr,
denom0,denom1,
pionFieldWW_ij,&vi[0],&vj[0],Tp);
{
int g=0; // O_{VV+AA}
for(int t=0;t<nt;t++)
LOG(Message) << " Bag [" << t << ","<<g<<"] "
<< (DeltaF2_fig8(t,g)+DeltaF2_trtr(t,g))
/ ( 8.0/3.0 * denom0[t]*denom1[t])
<<std::endl;
}
/////////////////////////////////////////////////////////////////////////
// Test: Build a bag contraction the Z2 way
// Build a wall bag comparison assuming no low modes
/////////////////////////////////////////////////////////////////////////
LOG(Message) << " Bag_z2 WARNING only correct if Nl == 0 "<<std::endl;
int t0=0;
int t1=dT;
int Nl=0;
LatticePropagator Qd0(grid);
LatticePropagator Qd1(grid);
LatticePropagator Qs0(grid);
LatticePropagator Qs1(grid);
for(int s=0;s<4;s++){
for(int c=0;c<3;c++){
int idx0 = Nl+t0*12+s*3+c;
int idx1 = Nl+t1*12+s*3+c;
FermToProp<FImpl>(Qd0, vi[idx0], s, c);
FermToProp<FImpl>(Qd1, vi[idx1], s, c);
FermToProp<FImpl>(Qs0, vj[idx0], s, c);
FermToProp<FImpl>(Qs1, vj[idx1], s, c);
}
}
std::vector<Gamma::Algebra> gammas ( {
Gamma::Algebra::GammaX,
Gamma::Algebra::GammaY,
Gamma::Algebra::GammaZ,
Gamma::Algebra::GammaT,
Gamma::Algebra::GammaXGamma5,
Gamma::Algebra::GammaYGamma5,
Gamma::Algebra::GammaZGamma5,
Gamma::Algebra::GammaTGamma5,
Gamma::Algebra::Identity,
Gamma::Algebra::Gamma5,
Gamma::Algebra::SigmaXY,
Gamma::Algebra::SigmaXZ,
Gamma::Algebra::SigmaXT,
Gamma::Algebra::SigmaYZ,
Gamma::Algebra::SigmaYT,
Gamma::Algebra::SigmaZT
});
auto G5 = Gamma::Algebra::Gamma5;
LatticePropagator anti_d0 = adj( Gamma(G5) * Qd0 * Gamma(G5));
LatticePropagator anti_d1 = adj( Gamma(G5) * Qd1 * Gamma(G5));
LatticeComplex TR1(grid);
LatticeComplex TR2(grid);
LatticeComplex Wick1(grid);
LatticeComplex Wick2(grid);
LatticePropagator PR1(grid);
LatticePropagator PR2(grid);
PR1 = Qs0 * Gamma(G5) * anti_d0;
PR2 = Qs1 * Gamma(G5) * anti_d1;
for(int g=0;g<Nd*Nd;g++){
auto g1 = gammas[g];
Gamma G1 (g1);
TR1 = trace( PR1 * G1 );
TR2 = trace( PR2 * G1 );
Wick1 = TR1*TR2;
Wick2 = trace( PR1* G1 * PR2 * G1 );
std::vector<TComplex> C1;
std::vector<TComplex> C2;
std::vector<TComplex> C3;
sliceSum(Wick1,C1, Tp);
sliceSum(Wick2,C2, Tp);
sliceSum(TR1 ,C3, Tp);
/*
if(g<5){
for(int t=0;t<C1.size();t++){
LOG(Message) << " Wick1["<<g<<","<<t<< "] "<< C1[t]<<std::endl;
}
for(int t=0;t<C2.size();t++){
LOG(Message) << " Wick2["<<g<<","<<t<< "] "<< C2[t]<<std::endl;
}
}
if( (g==9) || (g==7) ){ // P and At in above ordering
for(int t=0;t<C3.size();t++){
LOG(Message) << " <G|P>["<<g<<","<<t<< "] "<< C3[t]<<std::endl;
}
}
*/
}
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_A2APionField_hpp_

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/Baryon.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MContraction/Baryon.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
template class Grid::Hadrons::MContraction::TBaryon<FIMPL,FIMPL,FIMPL>;

View File

@@ -0,0 +1,140 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/Baryon.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MContraction_Baryon_hpp_
#define Hadrons_MContraction_Baryon_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Baryon *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
class BaryonPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(BaryonPar,
std::string, q1,
std::string, q2,
std::string, q3,
std::string, output);
};
template <typename FImpl1, typename FImpl2, typename FImpl3>
class TBaryon: public Module<BaryonPar>
{
public:
FERM_TYPE_ALIASES(FImpl1, 1);
FERM_TYPE_ALIASES(FImpl2, 2);
FERM_TYPE_ALIASES(FImpl3, 3);
class Result: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(Result,
std::vector<std::vector<std::vector<Complex>>>, corr);
};
public:
// constructor
TBaryon(const std::string name);
// destructor
virtual ~TBaryon(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(Baryon, ARG(TBaryon<FIMPL, FIMPL, FIMPL>), MContraction);
/******************************************************************************
* TBaryon implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2, typename FImpl3>
TBaryon<FImpl1, FImpl2, FImpl3>::TBaryon(const std::string name)
: Module<BaryonPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2, typename FImpl3>
std::vector<std::string> TBaryon<FImpl1, FImpl2, FImpl3>::getInput(void)
{
std::vector<std::string> input = {par().q1, par().q2, par().q3};
return input;
}
template <typename FImpl1, typename FImpl2, typename FImpl3>
std::vector<std::string> TBaryon<FImpl1, FImpl2, FImpl3>::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2, typename FImpl3>
void TBaryon<FImpl1, FImpl2, FImpl3>::setup(void)
{
envTmpLat(LatticeComplex, "c");
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2, typename FImpl3>
void TBaryon<FImpl1, FImpl2, FImpl3>::execute(void)
{
LOG(Message) << "Computing baryon contractions '" << getName() << "' using"
<< " quarks '" << par().q1 << "', '" << par().q2 << "', and '"
<< par().q3 << "'" << std::endl;
auto &q1 = envGet(PropagatorField1, par().q1);
auto &q2 = envGet(PropagatorField2, par().q2);
auto &q3 = envGet(PropagatorField3, par().q2);
envGetTmp(LatticeComplex, c);
Result result;
// FIXME: do contractions
// saveResult(par().output, "meson", result);
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_Baryon_hpp_

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/DiscLoop.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MContraction/DiscLoop.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
template class Grid::Hadrons::MContraction::TDiscLoop<FIMPL>;

View File

@@ -0,0 +1,143 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/DiscLoop.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MContraction_DiscLoop_hpp_
#define Hadrons_MContraction_DiscLoop_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* DiscLoop *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
class DiscLoopPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(DiscLoopPar,
std::string, q_loop,
Gamma::Algebra, gamma,
std::string, output);
};
template <typename FImpl>
class TDiscLoop: public Module<DiscLoopPar>
{
FERM_TYPE_ALIASES(FImpl,);
class Result: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(Result,
Gamma::Algebra, gamma,
std::vector<Complex>, corr);
};
public:
// constructor
TDiscLoop(const std::string name);
// destructor
virtual ~TDiscLoop(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(DiscLoop, TDiscLoop<FIMPL>, MContraction);
/******************************************************************************
* TDiscLoop implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TDiscLoop<FImpl>::TDiscLoop(const std::string name)
: Module<DiscLoopPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TDiscLoop<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().q_loop};
return in;
}
template <typename FImpl>
std::vector<std::string> TDiscLoop<FImpl>::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TDiscLoop<FImpl>::setup(void)
{
envTmpLat(LatticeComplex, "c");
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TDiscLoop<FImpl>::execute(void)
{
LOG(Message) << "Computing disconnected loop contraction '" << getName()
<< "' using '" << par().q_loop << "' with " << par().gamma
<< " insertion." << std::endl;
auto &q_loop = envGet(PropagatorField, par().q_loop);
Gamma gamma(par().gamma);
std::vector<TComplex> buf;
Result result;
envGetTmp(LatticeComplex, c);
c = trace(gamma*q_loop);
sliceSum(c, buf, Tp);
result.gamma = par().gamma;
result.corr.resize(buf.size());
for (unsigned int t = 0; t < buf.size(); ++t)
{
result.corr[t] = TensorRemove(buf[t]);
}
saveResult(par().output, "disc", result);
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_DiscLoop_hpp_

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/Gamma3pt.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MContraction/Gamma3pt.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
template class Grid::Hadrons::MContraction::TGamma3pt<FIMPL,FIMPL,FIMPL>;

View File

@@ -0,0 +1,184 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/Gamma3pt.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MContraction_Gamma3pt_hpp_
#define Hadrons_MContraction_Gamma3pt_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/*
* 3pt contraction with gamma matrix insertion.
*
* Schematic:
*
* q2 q3
* /----<------*------<----¬
* / gamma \
* / \
* i * * f
* \ /
* \ /
* \----------->----------/
* q1
*
* trace(g5*q1*adj(q2)*g5*gamma*q3)
*
* options:
* - q1: sink smeared propagator, source at i
* - q2: propagator, source at i
* - q3: propagator, source at f
* - gamma: gamma matrix to insert
* - tSnk: sink position for propagator q1.
*
*/
/******************************************************************************
* Gamma3pt *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
class Gamma3ptPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(Gamma3ptPar,
std::string, q1,
std::string, q2,
std::string, q3,
Gamma::Algebra, gamma,
unsigned int, tSnk,
std::string, output);
};
template <typename FImpl1, typename FImpl2, typename FImpl3>
class TGamma3pt: public Module<Gamma3ptPar>
{
FERM_TYPE_ALIASES(FImpl1, 1);
FERM_TYPE_ALIASES(FImpl2, 2);
FERM_TYPE_ALIASES(FImpl3, 3);
class Result: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(Result,
Gamma::Algebra, gamma,
std::vector<Complex>, corr);
};
public:
// constructor
TGamma3pt(const std::string name);
// destructor
virtual ~TGamma3pt(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(Gamma3pt, ARG(TGamma3pt<FIMPL, FIMPL, FIMPL>), MContraction);
/******************************************************************************
* TGamma3pt implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2, typename FImpl3>
TGamma3pt<FImpl1, FImpl2, FImpl3>::TGamma3pt(const std::string name)
: Module<Gamma3ptPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2, typename FImpl3>
std::vector<std::string> TGamma3pt<FImpl1, FImpl2, FImpl3>::getInput(void)
{
std::vector<std::string> in = {par().q1, par().q2, par().q3};
return in;
}
template <typename FImpl1, typename FImpl2, typename FImpl3>
std::vector<std::string> TGamma3pt<FImpl1, FImpl2, FImpl3>::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2, typename FImpl3>
void TGamma3pt<FImpl1, FImpl2, FImpl3>::setup(void)
{
envTmpLat(LatticeComplex, "c");
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2, typename FImpl3>
void TGamma3pt<FImpl1, FImpl2, FImpl3>::execute(void)
{
LOG(Message) << "Computing 3pt contractions '" << getName() << "' using"
<< " quarks '" << par().q1 << "', '" << par().q2 << "' and '"
<< par().q3 << "', with " << par().gamma << " insertion."
<< std::endl;
// Initialise variables. q2 and q3 are normal propagators, q1 may be
// sink smeared.
auto &q1 = envGet(SlicedPropagator1, par().q1);
auto &q2 = envGet(PropagatorField2, par().q2);
auto &q3 = envGet(PropagatorField2, par().q3);
Gamma g5(Gamma::Algebra::Gamma5);
Gamma gamma(par().gamma);
std::vector<TComplex> buf;
Result result;
// Extract relevant timeslice of sinked propagator q1, then contract &
// sum over all spacial positions of gamma insertion.
SitePropagator1 q1Snk = q1[par().tSnk];
envGetTmp(LatticeComplex, c);
c = trace(g5*q1Snk*adj(q2)*(g5*gamma)*q3);
sliceSum(c, buf, Tp);
result.gamma = par().gamma;
result.corr.resize(buf.size());
for (unsigned int t = 0; t < buf.size(); ++t)
{
result.corr[t] = TensorRemove(buf[t]);
}
saveResult(par().output, "gamma3pt", result);
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_Gamma3pt_hpp_

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/Meson.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MContraction/Meson.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
template class Grid::Hadrons::MContraction::TMeson<FIMPL,FIMPL>;

View File

@@ -0,0 +1,248 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/Meson.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MContraction_Meson_hpp_
#define Hadrons_MContraction_Meson_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/*
Meson contractions
-----------------------------
* options:
- q1: input propagator 1 (string)
- q2: input propagator 2 (string)
- gammas: gamma products to insert at sink & source, pairs of gamma matrices
(space-separated strings) in round brackets (i.e. (g_sink g_src)),
in a sequence (e.g. "(Gamma5 Gamma5)(Gamma5 GammaT)").
Special values: "all" - perform all possible contractions.
- sink: module to compute the sink to use in contraction (string).
*/
/******************************************************************************
* TMeson *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
typedef std::pair<Gamma::Algebra, Gamma::Algebra> GammaPair;
class MesonPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(MesonPar,
std::string, q1,
std::string, q2,
std::string, gammas,
std::string, sink,
std::string, output);
};
template <typename FImpl1, typename FImpl2>
class TMeson: public Module<MesonPar>
{
public:
FERM_TYPE_ALIASES(FImpl1, 1);
FERM_TYPE_ALIASES(FImpl2, 2);
FERM_TYPE_ALIASES(ScalarImplCR, Scalar);
SINK_TYPE_ALIASES(Scalar);
class Result: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(Result,
Gamma::Algebra, gamma_snk,
Gamma::Algebra, gamma_src,
std::vector<Complex>, corr);
};
public:
// constructor
TMeson(const std::string name);
// destructor
virtual ~TMeson(void) {};
// dependencies/products
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
virtual void parseGammaString(std::vector<GammaPair> &gammaList);
protected:
// execution
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(Meson, ARG(TMeson<FIMPL, FIMPL>), MContraction);
/******************************************************************************
* TMeson implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2>
TMeson<FImpl1, FImpl2>::TMeson(const std::string name)
: Module<MesonPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2>
std::vector<std::string> TMeson<FImpl1, FImpl2>::getInput(void)
{
std::vector<std::string> input = {par().q1, par().q2, par().sink};
return input;
}
template <typename FImpl1, typename FImpl2>
std::vector<std::string> TMeson<FImpl1, FImpl2>::getOutput(void)
{
std::vector<std::string> output = {};
return output;
}
template <typename FImpl1, typename FImpl2>
void TMeson<FImpl1, FImpl2>::parseGammaString(std::vector<GammaPair> &gammaList)
{
gammaList.clear();
// Determine gamma matrices to insert at source/sink.
if (par().gammas.compare("all") == 0)
{
// Do all contractions.
for (unsigned int i = 1; i < Gamma::nGamma; i += 2)
{
for (unsigned int j = 1; j < Gamma::nGamma; j += 2)
{
gammaList.push_back(std::make_pair((Gamma::Algebra)i,
(Gamma::Algebra)j));
}
}
}
else
{
// Parse individual contractions from input string.
gammaList = strToVec<GammaPair>(par().gammas);
}
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl1, typename FImpl2>
void TMeson<FImpl1, FImpl2>::setup(void)
{
envTmpLat(LatticeComplex, "c");
}
// execution ///////////////////////////////////////////////////////////////////
#define mesonConnected(q1, q2, gSnk, gSrc) \
(g5*(gSnk))*(q1)*(adj(gSrc)*g5)*adj(q2)
template <typename FImpl1, typename FImpl2>
void TMeson<FImpl1, FImpl2>::execute(void)
{
LOG(Message) << "Computing meson contractions '" << getName() << "' using"
<< " quarks '" << par().q1 << "' and '" << par().q2 << "'"
<< std::endl;
std::vector<TComplex> buf;
std::vector<Result> result;
Gamma g5(Gamma::Algebra::Gamma5);
std::vector<GammaPair> gammaList;
int nt = env().getDim(Tp);
parseGammaString(gammaList);
result.resize(gammaList.size());
for (unsigned int i = 0; i < result.size(); ++i)
{
result[i].gamma_snk = gammaList[i].first;
result[i].gamma_src = gammaList[i].second;
result[i].corr.resize(nt);
}
if (envHasType(SlicedPropagator1, par().q1) and
envHasType(SlicedPropagator2, par().q2))
{
auto &q1 = envGet(SlicedPropagator1, par().q1);
auto &q2 = envGet(SlicedPropagator2, par().q2);
LOG(Message) << "(propagator already sinked)" << std::endl;
for (unsigned int i = 0; i < result.size(); ++i)
{
Gamma gSnk(gammaList[i].first);
Gamma gSrc(gammaList[i].second);
for (unsigned int t = 0; t < buf.size(); ++t)
{
result[i].corr[t] = TensorRemove(trace(mesonConnected(q1[t], q2[t], gSnk, gSrc)));
}
}
}
else
{
auto &q1 = envGet(PropagatorField1, par().q1);
auto &q2 = envGet(PropagatorField2, par().q2);
envGetTmp(LatticeComplex, c);
LOG(Message) << "(using sink '" << par().sink << "')" << std::endl;
for (unsigned int i = 0; i < result.size(); ++i)
{
Gamma gSnk(gammaList[i].first);
Gamma gSrc(gammaList[i].second);
std::string ns;
ns = vm().getModuleNamespace(env().getObjectModule(par().sink));
if (ns == "MSource")
{
PropagatorField1 &sink = envGet(PropagatorField1, par().sink);
c = trace(mesonConnected(q1, q2, gSnk, gSrc)*sink);
sliceSum(c, buf, Tp);
}
else if (ns == "MSink")
{
SinkFnScalar &sink = envGet(SinkFnScalar, par().sink);
c = trace(mesonConnected(q1, q2, gSnk, gSrc));
buf = sink(c);
}
for (unsigned int t = 0; t < buf.size(); ++t)
{
result[i].corr[t] = TensorRemove(buf[t]);
}
}
}
saveResult(par().output, "meson", result);
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_Meson_hpp_

View File

@@ -0,0 +1,8 @@
#include <Grid/Hadrons/Modules/MContraction/MesonFieldGamma.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
template class Grid::Hadrons::MContraction::TMesonFieldGamma<FIMPL>;
template class Grid::Hadrons::MContraction::TMesonFieldGamma<ZFIMPL>;

View File

@@ -0,0 +1,269 @@
#ifndef Hadrons_MContraction_MesonFieldGamma_hpp_
#define Hadrons_MContraction_MesonFieldGamma_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
#include <Grid/Hadrons/AllToAllVectors.hpp>
#include <Grid/Hadrons/AllToAllReduction.hpp>
#include <Grid/Grid_Eigen_Dense.h>
#include <fstream>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* MesonFieldGamma *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
class MesonFieldPar : Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(MesonFieldPar,
int, Nl,
int, N,
int, Nblock,
std::string, A2A1,
std::string, A2A2,
std::string, gammas,
std::string, output);
};
template <typename FImpl>
class TMesonFieldGamma : public Module<MesonFieldPar>
{
public:
FERM_TYPE_ALIASES(FImpl, );
SOLVER_TYPE_ALIASES(FImpl, );
typedef A2AModesSchurDiagTwo<typename FImpl::FermionField, FMat, Solver> A2ABase;
class Result : Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(Result,
Gamma::Algebra, gamma,
std::vector<std::vector<std::vector<ComplexD>>>, MesonField);
};
public:
// constructor
TMesonFieldGamma(const std::string name);
// destructor
virtual ~TMesonFieldGamma(void){};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
virtual void parseGammaString(std::vector<Gamma::Algebra> &gammaList);
virtual void vectorOfWs(std::vector<FermionField> &w, int i, int Nblock, FermionField &tmpw_5d, std::vector<FermionField> &vec_w);
virtual void vectorOfVs(std::vector<FermionField> &v, int j, int Nblock, FermionField &tmpv_5d, std::vector<FermionField> &vec_v);
virtual void gammaMult(std::vector<FermionField> &v, Gamma gamma);
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER(MesonFieldGamma, ARG(TMesonFieldGamma<FIMPL>), MContraction);
MODULE_REGISTER(ZMesonFieldGamma, ARG(TMesonFieldGamma<ZFIMPL>), MContraction);
/******************************************************************************
* TMesonFieldGamma implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TMesonFieldGamma<FImpl>::TMesonFieldGamma(const std::string name)
: Module<MesonFieldPar>(name)
{
}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TMesonFieldGamma<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().A2A1 + "_class", par().A2A2 + "_class"};
in.push_back(par().A2A1 + "_w_high_4d");
in.push_back(par().A2A2 + "_v_high_4d");
return in;
}
template <typename FImpl>
std::vector<std::string> TMesonFieldGamma<FImpl>::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
template <typename FImpl>
void TMesonFieldGamma<FImpl>::parseGammaString(std::vector<Gamma::Algebra> &gammaList)
{
gammaList.clear();
// Determine gamma matrices to insert at source/sink.
if (par().gammas.compare("all") == 0)
{
// Do all contractions.
for (unsigned int i = 1; i < Gamma::nGamma; i += 2)
{
gammaList.push_back(((Gamma::Algebra)i));
}
}
else
{
// Parse individual contractions from input string.
gammaList = strToVec<Gamma::Algebra>(par().gammas);
}
}
template <typename FImpl>
void TMesonFieldGamma<FImpl>::vectorOfWs(std::vector<FermionField> &w, int i, int Nblock, FermionField &tmpw_5d, std::vector<FermionField> &vec_w)
{
for (unsigned int ni = 0; ni < Nblock; ni++)
{
vec_w[ni] = w[i + ni];
}
}
template <typename FImpl>
void TMesonFieldGamma<FImpl>::vectorOfVs(std::vector<FermionField> &v, int j, int Nblock, FermionField &tmpv_5d, std::vector<FermionField> &vec_v)
{
for (unsigned int nj = 0; nj < Nblock; nj++)
{
vec_v[nj] = v[j+nj];
}
}
template <typename FImpl>
void TMesonFieldGamma<FImpl>::gammaMult(std::vector<FermionField> &v, Gamma gamma)
{
int Nblock = v.size();
for (unsigned int nj = 0; nj < Nblock; nj++)
{
v[nj] = gamma * v[nj];
}
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TMesonFieldGamma<FImpl>::setup(void)
{
int nt = env().getDim(Tp);
int N = par().N;
int Nblock = par().Nblock;
int Ls_ = env().getObjectLs(par().A2A1 + "_class");
envTmpLat(FermionField, "tmpv_5d", Ls_);
envTmpLat(FermionField, "tmpw_5d", Ls_);
envTmp(std::vector<FermionField>, "w", 1, N, FermionField(env().getGrid(1)));
envTmp(std::vector<FermionField>, "v", 1, N, FermionField(env().getGrid(1)));
envTmp(Eigen::MatrixXcd, "MF", 1, Eigen::MatrixXcd::Zero(nt, N * N));
envTmp(std::vector<FermionField>, "w_block", 1, Nblock, FermionField(env().getGrid(1)));
envTmp(std::vector<FermionField>, "v_block", 1, Nblock, FermionField(env().getGrid(1)));
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TMesonFieldGamma<FImpl>::execute(void)
{
LOG(Message) << "Computing A2A meson field for gamma = " << par().gammas << ", taking w from " << par().A2A1 << " and v from " << par().A2A2 << std::endl;
int N = par().N;
int nt = env().getDim(Tp);
int Nblock = par().Nblock;
std::vector<Result> result;
std::vector<Gamma::Algebra> gammaResultList;
std::vector<Gamma> gammaList;
parseGammaString(gammaResultList);
result.resize(gammaResultList.size());
Gamma g5(Gamma::Algebra::Gamma5);
gammaList.resize(gammaResultList.size(), g5);
for (unsigned int i = 0; i < result.size(); ++i)
{
result[i].gamma = gammaResultList[i];
result[i].MesonField.resize(N, std::vector<std::vector<ComplexD>>(N, std::vector<ComplexD>(nt)));
Gamma gamma(gammaResultList[i]);
gammaList[i] = gamma;
}
auto &a2a1 = envGet(A2ABase, par().A2A1 + "_class");
auto &a2a2 = envGet(A2ABase, par().A2A2 + "_class");
envGetTmp(FermionField, tmpv_5d);
envGetTmp(FermionField, tmpw_5d);
envGetTmp(std::vector<FermionField>, v);
envGetTmp(std::vector<FermionField>, w);
LOG(Message) << "Finding v and w vectors for N = " << N << std::endl;
for (int i = 0; i < N; i++)
{
a2a2.return_v(i, tmpv_5d, v[i]);
a2a1.return_w(i, tmpw_5d, w[i]);
}
LOG(Message) << "Found v and w vectors for N = " << N << std::endl;
std::vector<std::vector<ComplexD>> MesonField_ij;
LOG(Message) << "Before blocked MFs, Nblock = " << Nblock << std::endl;
envGetTmp(std::vector<FermionField>, v_block);
envGetTmp(std::vector<FermionField>, w_block);
MesonField_ij.resize(Nblock * Nblock, std::vector<ComplexD>(nt));
envGetTmp(Eigen::MatrixXcd, MF);
LOG(Message) << "Before blocked MFs, Nblock = " << Nblock << std::endl;
for (unsigned int i = 0; i < N; i += Nblock)
{
vectorOfWs(w, i, Nblock, tmpw_5d, w_block);
for (unsigned int j = 0; j < N; j += Nblock)
{
vectorOfVs(v, j, Nblock, tmpv_5d, v_block);
for (unsigned int k = 0; k < result.size(); k++)
{
gammaMult(v_block, gammaList[k]);
sliceInnerProductMesonField(MesonField_ij, w_block, v_block, Tp);
for (unsigned int nj = 0; nj < Nblock; nj++)
{
for (unsigned int ni = 0; ni < Nblock; ni++)
{
MF.col((i + ni) + (j + nj) * N) = Eigen::VectorXcd::Map(&MesonField_ij[nj * Nblock + ni][0], MesonField_ij[nj * Nblock + ni].size());
}
}
}
}
if (i % 10 == 0)
{
LOG(Message) << "MF for i = " << i << " of " << N << std::endl;
}
}
LOG(Message) << "Before Global sum, Nblock = " << Nblock << std::endl;
v_block[0]._grid->GlobalSumVector(MF.data(), MF.size());
LOG(Message) << "After Global sum, Nblock = " << Nblock << std::endl;
for (unsigned int i = 0; i < N; i++)
{
for (unsigned int j = 0; j < N; j++)
{
for (unsigned int k = 0; k < result.size(); k++)
{
for (unsigned int t = 0; t < nt; t++)
{
result[k].MesonField[i][j][t] = MF.col(i + N * j)[t];
}
}
}
}
saveResult(par().output, "meson", result);
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_MesonFieldGm_hpp_

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/WardIdentity.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MContraction/WardIdentity.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
template class Grid::Hadrons::MContraction::TWardIdentity<FIMPL>;

View File

@@ -0,0 +1,224 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/WardIdentity.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MContraction_WardIdentity_hpp_
#define Hadrons_MContraction_WardIdentity_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/*
Ward Identity contractions
-----------------------------
* options:
- q: propagator, 5D if available (string)
- action: action module used for propagator solution (string)
- mass: mass of quark (double)
- test_axial: whether or not to test PCAC relation.
*/
/******************************************************************************
* WardIdentity *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
class WardIdentityPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(WardIdentityPar,
std::string, q,
std::string, action,
double, mass,
bool, test_axial);
};
template <typename FImpl>
class TWardIdentity: public Module<WardIdentityPar>
{
public:
FERM_TYPE_ALIASES(FImpl,);
public:
// constructor
TWardIdentity(const std::string name);
// destructor
virtual ~TWardIdentity(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
private:
unsigned int Ls_;
};
MODULE_REGISTER_TMP(WardIdentity, TWardIdentity<FIMPL>, MContraction);
/******************************************************************************
* TWardIdentity implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TWardIdentity<FImpl>::TWardIdentity(const std::string name)
: Module<WardIdentityPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TWardIdentity<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().q, par().action};
return in;
}
template <typename FImpl>
std::vector<std::string> TWardIdentity<FImpl>::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TWardIdentity<FImpl>::setup(void)
{
Ls_ = env().getObjectLs(par().q);
if (Ls_ != env().getObjectLs(par().action))
{
HADRONS_ERROR(Size, "Ls mismatch between quark action and propagator");
}
envTmpLat(PropagatorField, "tmp");
envTmpLat(PropagatorField, "vector_WI");
if (par().test_axial)
{
envTmpLat(PropagatorField, "psi");
envTmpLat(LatticeComplex, "PP");
envTmpLat(LatticeComplex, "axial_defect");
envTmpLat(LatticeComplex, "PJ5q");
}
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TWardIdentity<FImpl>::execute(void)
{
LOG(Message) << "Performing Ward Identity checks for quark '" << par().q
<< "'." << std::endl;
auto &q = envGet(PropagatorField, par().q);
auto &act = envGet(FMat, par().action);
Gamma g5(Gamma::Algebra::Gamma5);
// Compute D_mu V_mu, D here is backward derivative.
envGetTmp(PropagatorField, tmp);
envGetTmp(PropagatorField, vector_WI);
vector_WI = zero;
for (unsigned int mu = 0; mu < Nd; ++mu)
{
act.ContractConservedCurrent(q, q, tmp, Current::Vector, mu);
tmp -= Cshift(tmp, mu, -1);
vector_WI += tmp;
}
// Test ward identity D_mu V_mu = 0;
LOG(Message) << "Vector Ward Identity check Delta_mu V_mu = "
<< norm2(vector_WI) << std::endl;
if (par().test_axial)
{
envGetTmp(PropagatorField, psi);
envGetTmp(LatticeComplex, PP);
envGetTmp(LatticeComplex, axial_defect);
envGetTmp(LatticeComplex, PJ5q);
std::vector<TComplex> axial_buf;
// Compute <P|D_mu A_mu>, D is backwards derivative.
axial_defect = zero;
for (unsigned int mu = 0; mu < Nd; ++mu)
{
act.ContractConservedCurrent(q, q, tmp, Current::Axial, mu);
tmp -= Cshift(tmp, mu, -1);
axial_defect += trace(g5*tmp);
}
// Get <P|J5q> for 5D (zero for 4D) and <P|P>.
PJ5q = zero;
if (Ls_ > 1)
{
// <P|P>
ExtractSlice(tmp, q, 0, 0);
psi = 0.5 * (tmp - g5*tmp);
ExtractSlice(tmp, q, Ls_ - 1, 0);
psi += 0.5 * (tmp + g5*tmp);
PP = trace(adj(psi)*psi);
// <P|5Jq>
ExtractSlice(tmp, q, Ls_/2 - 1, 0);
psi = 0.5 * (tmp + g5*tmp);
ExtractSlice(tmp, q, Ls_/2, 0);
psi += 0.5 * (tmp - g5*tmp);
PJ5q = trace(adj(psi)*psi);
}
else
{
PP = trace(adj(q)*q);
}
// Test ward identity <P|D_mu A_mu> = 2m<P|P> + 2<P|J5q>
LOG(Message) << "|D_mu A_mu|^2 = " << norm2(axial_defect) << std::endl;
LOG(Message) << "|PP|^2 = " << norm2(PP) << std::endl;
LOG(Message) << "|PJ5q|^2 = " << norm2(PJ5q) << std::endl;
LOG(Message) << "Axial Ward Identity defect Delta_mu A_mu = "
<< norm2(axial_defect) << std::endl;
// Axial defect by timeslice.
axial_defect -= 2.*(par().mass*PP + PJ5q);
LOG(Message) << "Check Axial defect by timeslice" << std::endl;
sliceSum(axial_defect, axial_buf, Tp);
for (int t = 0; t < axial_buf.size(); ++t)
{
LOG(Message) << "t = " << t << ": "
<< TensorRemove(axial_buf[t]) << std::endl;
}
}
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_WardIdentity_hpp_

View File

@@ -0,0 +1,118 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/WeakHamiltonian.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MContraction_WeakHamiltonian_hpp_
#define Hadrons_MContraction_WeakHamiltonian_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* WeakHamiltonian *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
/*******************************************************************************
* Utilities for contractions involving the Weak Hamiltonian.
******************************************************************************/
//// Sum and store correlator.
#define MAKE_DIAG(exp, buf, res, n)\
sliceSum(exp, buf, Tp);\
res.name = (n);\
res.corr.resize(buf.size());\
for (unsigned int t = 0; t < buf.size(); ++t)\
{\
res.corr[t] = TensorRemove(buf[t]);\
}
//// Contraction of mu index: use 'mu' variable in exp.
#define SUM_MU(buf,exp)\
buf = zero;\
for (unsigned int mu = 0; mu < ndim; ++mu)\
{\
buf += exp;\
}
enum
{
i_V = 0,
i_A = 1,
n_i = 2
};
class WeakHamiltonianPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(WeakHamiltonianPar,
std::string, q1,
std::string, q2,
std::string, q3,
std::string, q4,
unsigned int, tSnk,
std::string, output);
};
#define MAKE_WEAK_MODULE(modname)\
class T##modname: public Module<WeakHamiltonianPar>\
{\
public:\
FERM_TYPE_ALIASES(FIMPL,)\
class Result: Serializable\
{\
public:\
GRID_SERIALIZABLE_CLASS_MEMBERS(Result,\
std::string, name,\
std::vector<Complex>, corr);\
};\
public:\
/* constructor */ \
T##modname(const std::string name);\
/* destructor */ \
virtual ~T##modname(void) {};\
/* dependency relation */ \
virtual std::vector<std::string> getInput(void);\
virtual std::vector<std::string> getOutput(void);\
public:\
std::vector<std::string> VA_label = {"V", "A"};\
protected:\
/* setup */ \
virtual void setup(void);\
/* execution */ \
virtual void execute(void);\
};\
MODULE_REGISTER(modname, T##modname, MContraction);
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_WeakHamiltonian_hpp_

View File

@@ -0,0 +1,151 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/WeakHamiltonianEye.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MContraction/WeakHamiltonianEye.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
/*
* Weak Hamiltonian current-current contractions, Eye-type.
*
* These contractions are generated by the Q1 and Q2 operators in the physical
* basis (see e.g. Fig 3 of arXiv:1507.03094).
*
* Schematics: q4 |
* /-<-¬ |
* / \ | q2 q3
* \ / | /----<------*------<----¬
* q2 \ / q3 | / /-*-¬ \
* /-----<-----* *-----<----¬ | / / \ \
* i * H_W * f | i * \ / q4 * f
* \ / | \ \->-/ /
* \ / | \ /
* \---------->---------/ | \----------->----------/
* q1 | q1
* |
* Saucer (S) | Eye (E)
*
* S: trace(q3*g5*q1*adj(q2)*g5*gL[mu][p_1]*q4*gL[mu][p_2])
* E: trace(q3*g5*q1*adj(q2)*g5*gL[mu][p_1])*trace(q4*gL[mu][p_2])
*
* Note q1 must be sink smeared.
*/
/******************************************************************************
* TWeakHamiltonianEye implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
TWeakHamiltonianEye::TWeakHamiltonianEye(const std::string name)
: Module<WeakHamiltonianPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
std::vector<std::string> TWeakHamiltonianEye::getInput(void)
{
std::vector<std::string> in = {par().q1, par().q2, par().q3, par().q4};
return in;
}
std::vector<std::string> TWeakHamiltonianEye::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
void TWeakHamiltonianEye::setup(void)
{
unsigned int ndim = env().getNd();
envTmpLat(LatticeComplex, "expbuf");
envTmpLat(PropagatorField, "tmp1");
envTmpLat(LatticeComplex, "tmp2");
envTmp(std::vector<PropagatorField>, "S_body", 1, ndim, PropagatorField(env().getGrid()));
envTmp(std::vector<PropagatorField>, "S_loop", 1, ndim, PropagatorField(env().getGrid()));
envTmp(std::vector<LatticeComplex>, "E_body", 1, ndim, LatticeComplex(env().getGrid()));
envTmp(std::vector<LatticeComplex>, "E_loop", 1, ndim, LatticeComplex(env().getGrid()));
}
// execution ///////////////////////////////////////////////////////////////////
void TWeakHamiltonianEye::execute(void)
{
LOG(Message) << "Computing Weak Hamiltonian (Eye type) contractions '"
<< getName() << "' using quarks '" << par().q1 << "', '"
<< par().q2 << ", '" << par().q3 << "' and '" << par().q4
<< "'." << std::endl;
auto &q1 = envGet(SlicedPropagator, par().q1);
auto &q2 = envGet(PropagatorField, par().q2);
auto &q3 = envGet(PropagatorField, par().q3);
auto &q4 = envGet(PropagatorField, par().q4);
Gamma g5 = Gamma(Gamma::Algebra::Gamma5);
std::vector<TComplex> corrbuf;
std::vector<Result> result(n_eye_diag);
unsigned int ndim = env().getNd();
envGetTmp(LatticeComplex, expbuf);
envGetTmp(PropagatorField, tmp1);
envGetTmp(LatticeComplex, tmp2);
envGetTmp(std::vector<PropagatorField>, S_body);
envGetTmp(std::vector<PropagatorField>, S_loop);
envGetTmp(std::vector<LatticeComplex>, E_body);
envGetTmp(std::vector<LatticeComplex>, E_loop);
// Get sink timeslice of q1.
SitePropagator q1Snk = q1[par().tSnk];
// Setup for S-type contractions.
for (int mu = 0; mu < ndim; ++mu)
{
S_body[mu] = MAKE_SE_BODY(q1Snk, q2, q3, GammaL(Gamma::gmu[mu]));
S_loop[mu] = MAKE_SE_LOOP(q4, GammaL(Gamma::gmu[mu]));
}
// Perform S-type contractions.
SUM_MU(expbuf, trace(S_body[mu]*S_loop[mu]))
MAKE_DIAG(expbuf, corrbuf, result[S_diag], "HW_S")
// Recycle sub-expressions for E-type contractions.
for (unsigned int mu = 0; mu < ndim; ++mu)
{
E_body[mu] = trace(S_body[mu]);
E_loop[mu] = trace(S_loop[mu]);
}
// Perform E-type contractions.
SUM_MU(expbuf, E_body[mu]*E_loop[mu])
MAKE_DIAG(expbuf, corrbuf, result[E_diag], "HW_E")
// IO
saveResult(par().output, "HW_Eye", result);
}

View File

@@ -0,0 +1,59 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/WeakHamiltonianEye.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MContraction_WeakHamiltonianEye_hpp_
#define Hadrons_MContraction_WeakHamiltonianEye_hpp_
#include <Grid/Hadrons/Modules/MContraction/WeakHamiltonian.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* WeakHamiltonianEye *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
enum
{
S_diag = 0,
E_diag = 1,
n_eye_diag = 2
};
// Saucer and Eye subdiagram contractions.
#define MAKE_SE_BODY(Q_1, Q_2, Q_3, gamma) (Q_3*g5*Q_1*adj(Q_2)*g5*gamma)
#define MAKE_SE_LOOP(Q_loop, gamma) (Q_loop*gamma)
MAKE_WEAK_MODULE(WeakHamiltonianEye)
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_WeakHamiltonianEye_hpp_

View File

@@ -0,0 +1,148 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/WeakHamiltonianNonEye.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MContraction/WeakHamiltonianNonEye.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
/*
* Weak Hamiltonian current-current contractions, Non-Eye-type.
*
* These contractions are generated by the Q1 and Q2 operators in the physical
* basis (see e.g. Fig 3 of arXiv:1507.03094).
*
* Schematic:
* q2 q3 | q2 q3
* /--<--¬ /--<--¬ | /--<--¬ /--<--¬
* / \ / \ | / \ / \
* / \ / \ | / \ / \
* / \ / \ | / \ / \
* i * * H_W * f | i * * * H_W * f
* \ * | | \ / \ /
* \ / \ / | \ / \ /
* \ / \ / | \ / \ /
* \ / \ / | \-->--/ \-->--/
* \-->--/ \-->--/ | q1 q4
* q1 q4 |
* Connected (C) | Wing (W)
*
* C: trace(q1*adj(q2)*g5*gL[mu]*q3*adj(q4)*g5*gL[mu])
* W: trace(q1*adj(q2)*g5*gL[mu])*trace(q3*adj(q4)*g5*gL[mu])
*
*/
/******************************************************************************
* TWeakHamiltonianNonEye implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
TWeakHamiltonianNonEye::TWeakHamiltonianNonEye(const std::string name)
: Module<WeakHamiltonianPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
std::vector<std::string> TWeakHamiltonianNonEye::getInput(void)
{
std::vector<std::string> in = {par().q1, par().q2, par().q3, par().q4};
return in;
}
std::vector<std::string> TWeakHamiltonianNonEye::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
void TWeakHamiltonianNonEye::setup(void)
{
unsigned int ndim = env().getNd();
envTmpLat(LatticeComplex, "expbuf");
envTmpLat(PropagatorField, "tmp1");
envTmpLat(LatticeComplex, "tmp2");
envTmp(std::vector<PropagatorField>, "C_i_side_loop", 1, ndim, PropagatorField(env().getGrid()));
envTmp(std::vector<PropagatorField>, "C_f_side_loop", 1, ndim, PropagatorField(env().getGrid()));
envTmp(std::vector<LatticeComplex>, "W_i_side_loop", 1, ndim, LatticeComplex(env().getGrid()));
envTmp(std::vector<LatticeComplex>, "W_f_side_loop", 1, ndim, LatticeComplex(env().getGrid()));
}
// execution ///////////////////////////////////////////////////////////////////
void TWeakHamiltonianNonEye::execute(void)
{
LOG(Message) << "Computing Weak Hamiltonian (Non-Eye type) contractions '"
<< getName() << "' using quarks '" << par().q1 << "', '"
<< par().q2 << ", '" << par().q3 << "' and '" << par().q4
<< "'." << std::endl;
auto &q1 = envGet(PropagatorField, par().q1);
auto &q2 = envGet(PropagatorField, par().q2);
auto &q3 = envGet(PropagatorField, par().q3);
auto &q4 = envGet(PropagatorField, par().q4);
Gamma g5 = Gamma(Gamma::Algebra::Gamma5);
std::vector<TComplex> corrbuf;
std::vector<Result> result(n_noneye_diag);
unsigned int ndim = env().getNd();
envGetTmp(LatticeComplex, expbuf);
envGetTmp(PropagatorField, tmp1);
envGetTmp(LatticeComplex, tmp2);
envGetTmp(std::vector<PropagatorField>, C_i_side_loop);
envGetTmp(std::vector<PropagatorField>, C_f_side_loop);
envGetTmp(std::vector<LatticeComplex>, W_i_side_loop);
envGetTmp(std::vector<LatticeComplex>, W_f_side_loop);
// Setup for C-type contractions.
for (int mu = 0; mu < ndim; ++mu)
{
C_i_side_loop[mu] = MAKE_CW_SUBDIAG(q1, q2, GammaL(Gamma::gmu[mu]));
C_f_side_loop[mu] = MAKE_CW_SUBDIAG(q3, q4, GammaL(Gamma::gmu[mu]));
}
// Perform C-type contractions.
SUM_MU(expbuf, trace(C_i_side_loop[mu]*C_f_side_loop[mu]))
MAKE_DIAG(expbuf, corrbuf, result[C_diag], "HW_C")
// Recycle sub-expressions for W-type contractions.
for (unsigned int mu = 0; mu < ndim; ++mu)
{
W_i_side_loop[mu] = trace(C_i_side_loop[mu]);
W_f_side_loop[mu] = trace(C_f_side_loop[mu]);
}
// Perform W-type contractions.
SUM_MU(expbuf, W_i_side_loop[mu]*W_f_side_loop[mu])
MAKE_DIAG(expbuf, corrbuf, result[W_diag], "HW_W")
// IO
saveResult(par().output, "HW_NonEye", result);
}

View File

@@ -0,0 +1,58 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/WeakHamiltonianNonEye.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MContraction_WeakHamiltonianNonEye_hpp_
#define Hadrons_MContraction_WeakHamiltonianNonEye_hpp_
#include <Grid/Hadrons/Modules/MContraction/WeakHamiltonian.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* WeakHamiltonianNonEye *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
enum
{
W_diag = 0,
C_diag = 1,
n_noneye_diag = 2
};
// Wing and Connected subdiagram contractions
#define MAKE_CW_SUBDIAG(Q_1, Q_2, gamma) (Q_1*adj(Q_2)*g5*gamma)
MAKE_WEAK_MODULE(WeakHamiltonianNonEye)
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_WeakHamiltonianNonEye_hpp_

View File

@@ -0,0 +1,142 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/WeakNeutral4ptDisc.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MContraction/WeakNeutral4ptDisc.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MContraction;
/*
* Weak Hamiltonian + current contractions, disconnected topology for neutral
* mesons.
*
* These contractions are generated by operators Q_1,...,10 of the dS=1 Weak
* Hamiltonian in the physical basis and an additional current J (see e.g.
* Fig 11 of arXiv:1507.03094).
*
* Schematic:
*
* q2 q4 q3
* /--<--¬ /---<--¬ /---<--¬
* / \ / \ / \
* i * * H_W | J * * f
* \ / \ / \ /
* \--->---/ \-------/ \------/
* q1
*
* options
* - q1: input propagator 1 (string)
* - q2: input propagator 2 (string)
* - q3: input propagator 3 (string), assumed to be sequential propagator
* - q4: input propagator 4 (string), assumed to be a loop
*
* type 1: trace(q1*adj(q2)*g5*gL[mu])*trace(loop*gL[mu])*trace(q3*g5)
* type 2: trace(q1*adj(q2)*g5*gL[mu]*loop*gL[mu])*trace(q3*g5)
*/
/*******************************************************************************
* TWeakNeutral4ptDisc implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
TWeakNeutral4ptDisc::TWeakNeutral4ptDisc(const std::string name)
: Module<WeakHamiltonianPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
std::vector<std::string> TWeakNeutral4ptDisc::getInput(void)
{
std::vector<std::string> in = {par().q1, par().q2, par().q3, par().q4};
return in;
}
std::vector<std::string> TWeakNeutral4ptDisc::getOutput(void)
{
std::vector<std::string> out = {};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
void TWeakNeutral4ptDisc::setup(void)
{
unsigned int ndim = env().getNd();
envTmpLat(LatticeComplex, "expbuf");
envTmpLat(PropagatorField, "tmp");
envTmpLat(LatticeComplex, "curr");
envTmp(std::vector<PropagatorField>, "meson", 1, ndim, PropagatorField(env().getGrid()));
envTmp(std::vector<PropagatorField>, "loop", 1, ndim, PropagatorField(env().getGrid()));
}
// execution ///////////////////////////////////////////////////////////////////
void TWeakNeutral4ptDisc::execute(void)
{
LOG(Message) << "Computing Weak Hamiltonian neutral disconnected contractions '"
<< getName() << "' using quarks '" << par().q1 << "', '"
<< par().q2 << ", '" << par().q3 << "' and '" << par().q4
<< "'." << std::endl;
auto &q1 = envGet(PropagatorField, par().q1);
auto &q2 = envGet(PropagatorField, par().q2);
auto &q3 = envGet(PropagatorField, par().q3);
auto &q4 = envGet(PropagatorField, par().q4);
Gamma g5 = Gamma(Gamma::Algebra::Gamma5);
std::vector<TComplex> corrbuf;
std::vector<Result> result(n_neut_disc_diag);
unsigned int ndim = env().getNd();
envGetTmp(LatticeComplex, expbuf);
envGetTmp(PropagatorField, tmp);
envGetTmp(LatticeComplex, curr);
envGetTmp(std::vector<PropagatorField>, meson);
envGetTmp(std::vector<PropagatorField>, loop);
// Setup for type 1 contractions.
for (int mu = 0; mu < ndim; ++mu)
{
meson[mu] = MAKE_DISC_MESON(q1, q2, GammaL(Gamma::gmu[mu]));
loop[mu] = MAKE_DISC_LOOP(q4, GammaL(Gamma::gmu[mu]));
}
curr = MAKE_DISC_CURR(q3, GammaL(Gamma::Algebra::Gamma5));
// Perform type 1 contractions.
SUM_MU(expbuf, trace(meson[mu]*loop[mu]))
expbuf *= curr;
MAKE_DIAG(expbuf, corrbuf, result[neut_disc_1_diag], "HW_disc0_1")
// Perform type 2 contractions.
SUM_MU(expbuf, trace(meson[mu])*trace(loop[mu]))
expbuf *= curr;
MAKE_DIAG(expbuf, corrbuf, result[neut_disc_2_diag], "HW_disc0_2")
// IO
saveResult(par().output, "HW_disc0", result);
}

View File

@@ -0,0 +1,60 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MContraction/WeakNeutral4ptDisc.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Lanny91 <andrew.lawson@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MContraction_WeakNeutral4ptDisc_hpp_
#define Hadrons_MContraction_WeakNeutral4ptDisc_hpp_
#include <Grid/Hadrons/Modules/MContraction/WeakHamiltonian.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* WeakNeutral4ptDisc *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MContraction)
enum
{
neut_disc_1_diag = 0,
neut_disc_2_diag = 1,
n_neut_disc_diag = 2
};
// Neutral 4pt disconnected subdiagram contractions.
#define MAKE_DISC_MESON(Q_1, Q_2, gamma) (Q_1*adj(Q_2)*g5*gamma)
#define MAKE_DISC_LOOP(Q_LOOP, gamma) (Q_LOOP*gamma)
#define MAKE_DISC_CURR(Q_c, gamma) (trace(Q_c*gamma))
MAKE_WEAK_MODULE(WeakNeutral4ptDisc)
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MContraction_WeakNeutral4ptDisc_hpp_

View File

@@ -0,0 +1,36 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MFermion/FreeProp.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Vera Guelpers <V.M.Guelpers@soton.ac.uk>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MFermion/FreeProp.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MFermion;
template class Grid::Hadrons::MFermion::TFreeProp<FIMPL>;

View File

@@ -0,0 +1,187 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MFermion/FreeProp.hpp
Copyright (C) 2015-2018
Author: Vera Guelpers <V.M.Guelpers@soton.ac.uk>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MFermion_FreeProp_hpp_
#define Hadrons_MFermion_FreeProp_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* FreeProp *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MFermion)
class FreePropPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(FreePropPar,
std::string, source,
std::string, action,
double, mass,
std::string, twist);
};
template <typename FImpl>
class TFreeProp: public Module<FreePropPar>
{
public:
FG_TYPE_ALIASES(FImpl,);
public:
// constructor
TFreeProp(const std::string name);
// destructor
virtual ~TFreeProp(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
private:
unsigned int Ls_;
};
MODULE_REGISTER_TMP(FreeProp, TFreeProp<FIMPL>, MFermion);
/******************************************************************************
* TFreeProp implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TFreeProp<FImpl>::TFreeProp(const std::string name)
: Module<FreePropPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TFreeProp<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().source, par().action};
return in;
}
template <typename FImpl>
std::vector<std::string> TFreeProp<FImpl>::getOutput(void)
{
std::vector<std::string> out = {getName(), getName() + "_5d"};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TFreeProp<FImpl>::setup(void)
{
Ls_ = env().getObjectLs(par().action);
envCreateLat(PropagatorField, getName());
envTmpLat(FermionField, "source", Ls_);
envTmpLat(FermionField, "sol", Ls_);
envTmpLat(FermionField, "tmp");
if (Ls_ > 1)
{
envCreateLat(PropagatorField, getName() + "_5d", Ls_);
}
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TFreeProp<FImpl>::execute(void)
{
LOG(Message) << "Computing free fermion propagator '" << getName() << "'"
<< std::endl;
std::string propName = (Ls_ == 1) ? getName() : (getName() + "_5d");
auto &prop = envGet(PropagatorField, propName);
auto &fullSrc = envGet(PropagatorField, par().source);
auto &mat = envGet(FMat, par().action);
RealD mass = par().mass;
envGetTmp(FermionField, source);
envGetTmp(FermionField, sol);
envGetTmp(FermionField, tmp);
LOG(Message) << "Calculating a free Propagator with mass " << mass
<< " using the action '" << par().action
<< "' on source '" << par().source << "'" << std::endl;
for (unsigned int s = 0; s < Ns; ++s)
for (unsigned int c = 0; c < FImpl::Dimension; ++c)
{
LOG(Message) << "Calculation for spin= " << s << ", color= " << c
<< std::endl;
// source conversion for 4D sources
if (!env().isObject5d(par().source))
{
if (Ls_ == 1)
{
PropToFerm<FImpl>(source, fullSrc, s, c);
}
else
{
PropToFerm<FImpl>(tmp, fullSrc, s, c);
mat.ImportPhysicalFermionSource(tmp, source);
}
}
// source conversion for 5D sources
else
{
if (Ls_ != env().getObjectLs(par().source))
{
HADRONS_ERROR(Size, "Ls mismatch between quark action and source");
}
else
{
PropToFerm<FImpl>(source, fullSrc, s, c);
}
}
sol = zero;
std::vector<Real> twist = strToVec<Real>(par().twist);
if(twist.size() != Nd) HADRONS_ERROR(Size, "number of twist angles does not match number of dimensions");
mat.FreePropagator(source,sol,mass,twist);
FermToProp<FImpl>(prop, sol, s, c);
// create 4D propagators from 5D one if necessary
if (Ls_ > 1)
{
PropagatorField &p4d = envGet(PropagatorField, getName());
mat.ExportPhysicalFermionSolution(sol, tmp);
FermToProp<FImpl>(p4d, tmp, s, c);
}
}
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MFermion_FreeProp_hpp_

View File

@@ -0,0 +1,35 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MFermion/GaugeProp.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MFermion/GaugeProp.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MFermion;
template class Grid::Hadrons::MFermion::TGaugeProp<FIMPL>;
template class Grid::Hadrons::MFermion::TGaugeProp<ZFIMPL>;

View File

@@ -0,0 +1,191 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MFermion/GaugeProp.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Guido Cossu <guido.cossu@ed.ac.uk>
Author: Lanny91 <andrew.lawson@gmail.com>
Author: pretidav <david.preti@csic.es>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MFermion_GaugeProp_hpp_
#define Hadrons_MFermion_GaugeProp_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
#include <Grid/Hadrons/Solver.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* GaugeProp *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MFermion)
class GaugePropPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(GaugePropPar,
std::string, source,
std::string, solver);
};
template <typename FImpl>
class TGaugeProp: public Module<GaugePropPar>
{
public:
FG_TYPE_ALIASES(FImpl,);
SOLVER_TYPE_ALIASES(FImpl,);
public:
// constructor
TGaugeProp(const std::string name);
// destructor
virtual ~TGaugeProp(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
private:
unsigned int Ls_;
Solver *solver_{nullptr};
};
MODULE_REGISTER_TMP(GaugeProp, TGaugeProp<FIMPL>, MFermion);
MODULE_REGISTER_TMP(ZGaugeProp, TGaugeProp<ZFIMPL>, MFermion);
/******************************************************************************
* TGaugeProp implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename FImpl>
TGaugeProp<FImpl>::TGaugeProp(const std::string name)
: Module<GaugePropPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename FImpl>
std::vector<std::string> TGaugeProp<FImpl>::getInput(void)
{
std::vector<std::string> in = {par().source, par().solver};
return in;
}
template <typename FImpl>
std::vector<std::string> TGaugeProp<FImpl>::getOutput(void)
{
std::vector<std::string> out = {getName(), getName() + "_5d"};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename FImpl>
void TGaugeProp<FImpl>::setup(void)
{
Ls_ = env().getObjectLs(par().solver);
envCreateLat(PropagatorField, getName());
envTmpLat(FermionField, "source", Ls_);
envTmpLat(FermionField, "sol", Ls_);
envTmpLat(FermionField, "tmp");
if (Ls_ > 1)
{
envCreateLat(PropagatorField, getName() + "_5d", Ls_);
}
}
// execution ///////////////////////////////////////////////////////////////////
template <typename FImpl>
void TGaugeProp<FImpl>::execute(void)
{
LOG(Message) << "Computing quark propagator '" << getName() << "'"
<< std::endl;
std::string propName = (Ls_ == 1) ? getName() : (getName() + "_5d");
auto &prop = envGet(PropagatorField, propName);
auto &fullSrc = envGet(PropagatorField, par().source);
auto &solver = envGet(Solver, par().solver);
auto &mat = solver.getFMat();
envGetTmp(FermionField, source);
envGetTmp(FermionField, sol);
envGetTmp(FermionField, tmp);
LOG(Message) << "Inverting using solver '" << par().solver
<< "' on source '" << par().source << "'" << std::endl;
for (unsigned int s = 0; s < Ns; ++s)
for (unsigned int c = 0; c < FImpl::Dimension; ++c)
{
LOG(Message) << "Inversion for spin= " << s << ", color= " << c
<< std::endl;
// source conversion for 4D sources
LOG(Message) << "Import source" << std::endl;
if (!env().isObject5d(par().source))
{
if (Ls_ == 1)
{
PropToFerm<FImpl>(source, fullSrc, s, c);
}
else
{
PropToFerm<FImpl>(tmp, fullSrc, s, c);
mat.ImportPhysicalFermionSource(tmp, source);
}
}
// source conversion for 5D sources
else
{
if (Ls_ != env().getObjectLs(par().source))
{
HADRONS_ERROR(Size, "Ls mismatch between quark action and source");
}
else
{
PropToFerm<FImpl>(source, fullSrc, s, c);
}
}
LOG(Message) << "Solve" << std::endl;
sol = zero;
solver(sol, source);
LOG(Message) << "Export solution" << std::endl;
FermToProp<FImpl>(prop, sol, s, c);
// create 4D propagators from 5D one if necessary
if (Ls_ > 1)
{
PropagatorField &p4d = envGet(PropagatorField, getName());
mat.ExportPhysicalFermionSolution(sol, tmp);
FermToProp<FImpl>(p4d, tmp, s, c);
}
}
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MFermion_GaugeProp_hpp_

View File

@@ -0,0 +1,79 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MGauge/FundtoHirep.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: Guido Cossu <guido.cossu@ed.ac.uk>
Author: pretidav <david.preti@csic.es>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MGauge/FundtoHirep.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MGauge;
// constructor /////////////////////////////////////////////////////////////////
template <class Rep>
TFundtoHirep<Rep>::TFundtoHirep(const std::string name)
: Module<FundtoHirepPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <class Rep>
std::vector<std::string> TFundtoHirep<Rep>::getInput(void)
{
std::vector<std::string> in = {par().gaugeconf};
return in;
}
template <class Rep>
std::vector<std::string> TFundtoHirep<Rep>::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename Rep>
void TFundtoHirep<Rep>::setup(void)
{
envCreateLat(Rep::LatticeField, getName());
}
// execution ///////////////////////////////////////////////////////////////////
template <class Rep>
void TFundtoHirep<Rep>::execute(void)
{
LOG(Message) << "Transforming Representation" << std::endl;
auto &U = envGet(LatticeGaugeField, par().gaugeconf);
auto &URep = envGet(Rep::LatticeField, getName());
Rep TargetRepresentation(U._grid);
TargetRepresentation.update_representation(U);
URep = TargetRepresentation.U;
}

View File

@@ -0,0 +1,76 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MGauge/FundtoHirep.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: pretidav <david.preti@csic.es>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MGauge_FundtoHirep_hpp_
#define Hadrons_MGauge_FundtoHirep_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Load a NERSC configuration *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MGauge)
class FundtoHirepPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(FundtoHirepPar,
std::string, gaugeconf);
};
template <class Rep>
class TFundtoHirep: public Module<FundtoHirepPar>
{
public:
// constructor
TFundtoHirep(const std::string name);
// destructor
virtual ~TFundtoHirep(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
// setup
void setup(void);
// execution
void execute(void);
};
//MODULE_REGISTER_TMP(FundtoAdjoint, TFundtoHirep<AdjointRepresentation>, MGauge);
//MODULE_REGISTER_TMP(FundtoTwoIndexSym, TFundtoHirep<TwoIndexSymmetricRepresentation>, MGauge);
//MODULE_REGISTER_TMP(FundtoTwoIndexAsym, TFundtoHirep<TwoIndexAntiSymmetricRepresentation>, MGauge);
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MGauge_FundtoHirep_hpp_

View File

@@ -0,0 +1,71 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MGauge/Random.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MGauge/Random.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MGauge;
/******************************************************************************
* TRandom implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
TRandom::TRandom(const std::string name)
: Module<NoPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
std::vector<std::string> TRandom::getInput(void)
{
std::vector<std::string> in;
return in;
}
std::vector<std::string> TRandom::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
void TRandom::setup(void)
{
envCreateLat(LatticeGaugeField, getName());
}
// execution ///////////////////////////////////////////////////////////////////
void TRandom::execute(void)
{
LOG(Message) << "Generating random gauge configuration" << std::endl;
auto &U = envGet(LatticeGaugeField, getName());
SU3::HotConfiguration(*env().get4dRng(), U);
}

View File

@@ -0,0 +1,66 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MGauge/Random.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MGauge_Random_hpp_
#define Hadrons_MGauge_Random_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Random gauge *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MGauge)
class TRandom: public Module<NoPar>
{
public:
// constructor
TRandom(const std::string name);
// destructor
virtual ~TRandom(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER(Random, TRandom, MGauge);
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MGauge_Random_hpp_

View File

@@ -0,0 +1,85 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MGauge/StochEm.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: James Harrison <j.harrison@soton.ac.uk>
Author: Vera Guelpers <vmg1n14@soton.ac.uk>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MGauge/StochEm.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MGauge;
/******************************************************************************
* TStochEm implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
TStochEm::TStochEm(const std::string name)
: Module<StochEmPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
std::vector<std::string> TStochEm::getInput(void)
{
std::vector<std::string> in;
return in;
}
std::vector<std::string> TStochEm::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
void TStochEm::setup(void)
{
weightDone_ = env().hasCreatedObject("_" + getName() + "_weight");
envCacheLat(EmComp, "_" + getName() + "_weight");
envCreateLat(EmField, getName());
}
// execution ///////////////////////////////////////////////////////////////////
void TStochEm::execute(void)
{
LOG(Message) << "Generating stochastic EM potential..." << std::endl;
std::vector<Real> improvements = strToVec<Real>(par().improvement);
PhotonR photon(par().gauge, par().zmScheme, improvements, par().G0_qedInf);
auto &a = envGet(EmField, getName());
auto &w = envGet(EmComp, "_" + getName() + "_weight");
if (!weightDone_)
{
LOG(Message) << "Caching stochastic EM potential weight (gauge: "
<< par().gauge << ", zero-mode scheme: "
<< par().zmScheme << ")..." << std::endl;
photon.StochasticWeight(w);
}
photon.StochasticField(a, *env().get4dRng(), w);
}

View File

@@ -0,0 +1,82 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MGauge/StochEm.hpp
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
Author: James Harrison <j.harrison@soton.ac.uk>
Author: Vera Guelpers <vmg1n14@soton.ac.uk>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#ifndef Hadrons_MGauge_StochEm_hpp_
#define Hadrons_MGauge_StochEm_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* StochEm *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MGauge)
class StochEmPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(StochEmPar,
PhotonR::Gauge, gauge,
PhotonR::ZmScheme, zmScheme,
std::string, improvement,
Real, G0_qedInf);
};
class TStochEm: public Module<StochEmPar>
{
public:
typedef PhotonR::GaugeField EmField;
typedef PhotonR::GaugeLinkField EmComp;
public:
// constructor
TStochEm(const std::string name);
// destructor
virtual ~TStochEm(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
protected:
// setup
virtual void setup(void);
// execution
virtual void execute(void);
private:
bool weightDone_;
};
MODULE_REGISTER(StochEm, TStochEm, MGauge);
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MGauge_StochEm_hpp_

View File

@@ -0,0 +1,7 @@
#include <Grid/Hadrons/Modules/MGauge/StoutSmearing.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MGauge;
template class Grid::Hadrons::MGauge::TStoutSmearing<GIMPL>;

View File

@@ -0,0 +1,104 @@
#ifndef Hadrons_MGauge_StoutSmearing_hpp_
#define Hadrons_MGauge_StoutSmearing_hpp_
#include <Grid/Hadrons/Global.hpp>
#include <Grid/Hadrons/Module.hpp>
#include <Grid/Hadrons/ModuleFactory.hpp>
BEGIN_HADRONS_NAMESPACE
/******************************************************************************
* Stout smearing *
******************************************************************************/
BEGIN_MODULE_NAMESPACE(MGauge)
class StoutSmearingPar: Serializable
{
public:
GRID_SERIALIZABLE_CLASS_MEMBERS(StoutSmearingPar,
std::string, gauge,
unsigned int, steps,
double, rho);
};
template <typename GImpl>
class TStoutSmearing: public Module<StoutSmearingPar>
{
public:
typedef typename GImpl::Field GaugeField;
public:
// constructor
TStoutSmearing(const std::string name);
// destructor
virtual ~TStoutSmearing(void) {};
// dependency relation
virtual std::vector<std::string> getInput(void);
virtual std::vector<std::string> getOutput(void);
// setup
virtual void setup(void);
// execution
virtual void execute(void);
};
MODULE_REGISTER_TMP(StoutSmearing, TStoutSmearing<GIMPL>, MGauge);
/******************************************************************************
* TStoutSmearing implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
template <typename GImpl>
TStoutSmearing<GImpl>::TStoutSmearing(const std::string name)
: Module<StoutSmearingPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
template <typename GImpl>
std::vector<std::string> TStoutSmearing<GImpl>::getInput(void)
{
std::vector<std::string> in = {par().gauge};
return in;
}
template <typename GImpl>
std::vector<std::string> TStoutSmearing<GImpl>::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
template <typename GImpl>
void TStoutSmearing<GImpl>::setup(void)
{
envCreateLat(GaugeField, getName());
envTmpLat(GaugeField, "buf");
}
// execution ///////////////////////////////////////////////////////////////////
template <typename GImpl>
void TStoutSmearing<GImpl>::execute(void)
{
LOG(Message) << "Smearing '" << par().gauge << "' with " << par().steps
<< " step" << ((par().steps > 1) ? "s" : "")
<< " of stout smearing and rho= " << par().rho << std::endl;
Smear_Stout<GImpl> smearer(par().rho);
auto &U = envGet(GaugeField, par().gauge);
auto &Usmr = envGet(GaugeField, getName());
envGetTmp(GaugeField, buf);
buf = U;
for (unsigned int n = 0; n < par().steps; ++n)
{
smearer.smear(Usmr, buf);
buf = Usmr;
}
}
END_MODULE_NAMESPACE
END_HADRONS_NAMESPACE
#endif // Hadrons_MGauge_StoutSmearing_hpp_

View File

@@ -0,0 +1,69 @@
/*************************************************************************************
Grid physics library, www.github.com/paboyle/Grid
Source file: extras/Hadrons/Modules/MGauge/Unit.cc
Copyright (C) 2015-2018
Author: Antonin Portelli <antonin.portelli@me.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
See the full license in the file "LICENSE" in the top level distribution directory
*************************************************************************************/
/* END LEGAL */
#include <Grid/Hadrons/Modules/MGauge/Unit.hpp>
using namespace Grid;
using namespace Hadrons;
using namespace MGauge;
/******************************************************************************
* TUnit implementation *
******************************************************************************/
// constructor /////////////////////////////////////////////////////////////////
TUnit::TUnit(const std::string name)
: Module<NoPar>(name)
{}
// dependencies/products ///////////////////////////////////////////////////////
std::vector<std::string> TUnit::getInput(void)
{
return std::vector<std::string>();
}
std::vector<std::string> TUnit::getOutput(void)
{
std::vector<std::string> out = {getName()};
return out;
}
// setup ///////////////////////////////////////////////////////////////////////
void TUnit::setup(void)
{
envCreateLat(LatticeGaugeField, getName());
}
// execution ///////////////////////////////////////////////////////////////////
void TUnit::execute(void)
{
LOG(Message) << "Creating unit gauge configuration" << std::endl;
auto &U = envGet(LatticeGaugeField, getName());
SU3::ColdConfiguration(*env().get4dRng(), U);
}

Some files were not shown because too many files have changed in this diff Show More