1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-16 02:35:36 +00:00
Commit Graph

45 Commits

Author SHA1 Message Date
paboyle
4e7ab3166f Refactoring header layout 2017-02-22 18:09:33 +00:00
paboyle
2c246551d0 Overlap comms and compute options in wilson kernels 2017-02-07 01:37:10 -05:00
Peter Boyle
fb8d4b2357 Lots of debug on performance Mobius 2016-12-08 17:28:28 +00:00
ca21003f01 Merge branch 'feature/fft-opt' into feature/feynman-rules
# Conflicts:
#	lib/FFT.h
#	lib/qcd/action/fermion/WilsonFermion5D.h
#	tests/core/Test_fft.cc
2016-10-26 18:44:47 +01:00
azusayamaguchi
c190221fd3 Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
997fd882ff Merge branch 'develop' into feature/feynman-rules
# Conflicts:
#	lib/Threads.h
#	lib/qcd/action/fermion/WilsonFermion.cc
#	lib/qcd/action/fermion/WilsonFermion.h
#	lib/qcd/utils/SUn.h
#	lib/simd/Grid_avx.h
#	lib/simd/Intel512common.h
2016-10-19 18:35:18 +01:00
azusayamaguchi
81f2aeaece KNL streaming stores, and KNL performance coutners 2016-10-12 11:45:22 +01:00
paboyle
3619167d62 Mass parameter 2016-10-10 23:47:33 +01:00
Guido Cossu
2e453dfbf5 Added some instrumentation to benchmark the force computation 2016-10-06 17:52:45 +01:00
paboyle
4089984431 Timing hooks 2016-10-06 09:25:12 +01:00
paboyle
b6713ecb60 Momentum space rules for Overlap, DWF untested to date 2016-09-26 09:39:09 +01:00
paboyle
3f2c44a5fe Updating the class to 5d selection based on impl type 2016-07-14 23:55:26 +01:00
paboyle
55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
paboyle
8ac021de73 Added a test an fixed it for red black precon Ls innermost vectorised DWF 2016-06-07 13:16:56 -07:00
paboyle
139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
paboyle
9b6ab6db16 simd in 5th dimension support 2016-04-19 15:38:01 -07:00
paboyle
b1192a8908 Benchmark_zmm added 2016-04-06 03:00:07 -07:00
paboyle
e67fc2be18 Adding a trial for openmp overhead minimisation 2016-03-31 16:00:37 +01:00
paboyle
dafc74020c Overlap comms compute improvements in hand op kernels, and better timing from Edison and Cori 2016-01-10 16:54:27 -08:00
paboyle
d19321dfde Overlap comms compute changes 2016-01-10 19:20:16 +00:00
paboyle
c99d748da6 Timing reports in benchmarks now reflect the asynch comms thread statistics 2016-01-04 14:42:16 +00:00
paboyle
02452afd36 Optional overlap of comms with compute 2016-01-04 14:18:40 +00:00
paboyle
331768dcff Added overlap comms compute mode 2016-01-03 01:38:11 +00:00
paboyle
aae8bf31a7 Global edit adding copyright and license info to every source file. 2016-01-02 14:51:32 +00:00
paboyle
899ca41cb8 Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
	lib/qcd/action/fermion/WilsonFermion5D.cc
2015-11-06 03:50:04 -08:00
Peter Boyle
28022755ae Stencil class name global change to StencilImpl typedef 2015-11-06 05:30:17 -06:00
paboyle
1159de165c Asm option for AVX512 2015-11-05 22:04:51 -08:00
Peter Boyle
f0e32f12cf Merge branch 'master' of https://github.com/paboyle/Grid 2015-08-15 23:59:04 +01:00
Peter Boyle
55cfc89459 * Finished the template/policy style introduction of gparity, except the gparity force terms.
So valence sector looks ok.

FermionOperatorImpl.h provides the policy classes.

Expect HMC will introduce a smearing policy and a fermion representation change policy template
param. Will also probably need multi-precision work.

* HMC is running even-odd and non-checkerboarded (checked 4^4 wilson fermion/wilson gauge).

There appears to be a bug in the multi-level integrator -- <e-dH> passes with single level but
not with multi-level.

In any case there looks to be quite a bit to clean up.

This is the "const det" style implementation that is not appropriate  yet for clover since
it assumes that Mee is indept of the gauge fields. Easily fixed in future.
2015-08-15 23:25:49 +01:00
Peter Boyle
aeb7442d8f Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BC's and such like
allowing for example G-parity which is important for K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle
84a66476ab Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BC's and such like
allowing for example G-parity which is important for K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle
cc4ca48d13 Two flavour HMC for Wilson/Wilson is conserving energy.
Still to check plaq and <e(-dH)>, but nevertheless this is
progress
2015-07-29 17:53:39 +09:00
Peter Boyle
4fe110bd07 Two flavour HMC for Wilson/Wilson is conserving energy.
Still to check plaq and <e(-dH)>, but nevertheless this is
progress
2015-07-29 17:53:39 +09:00
Peter Boyle
d7e6b65a76 Elemental force term for Wilson dslash added and tests thereof passing.
Now need to construct pseudofermion two flavour, ratio, one flavour, ratio
action fragments.
2015-07-26 10:54:38 +09:00
Peter Boyle
d9d4c5916a Elemental force term for Wilson dslash added and tests thereof passing.
Now need to construct pseudofermion two flavour, ratio, one flavour, ratio
action fragments.
2015-07-26 10:54:38 +09:00
Peter Boyle
74e397b29c big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). stared getting to
near the bleeding edge I guess
2015-06-30 15:03:11 +01:00
Peter Boyle
98c817df1b big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). stared getting to
near the bleeding edge I guess
2015-06-30 15:03:11 +01:00
Peter Boyle
9e7035f5dc Conjugate residual algorithm; some more unary functions 2015-06-08 12:04:59 +01:00
Peter Boyle
d6f1ddf99c Conjugate residual algorithm; some more unary functions 2015-06-08 12:04:59 +01:00
Peter Boyle
b9e9777912 PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
and hermiticity tests.
2015-06-04 13:28:37 +01:00
Peter Boyle
63a61fcc2a PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
and hermiticity tests.
2015-06-04 13:28:37 +01:00
Peter Boyle
5916386242 Mobius Caley form, Mobius Zolotarev operators. Pass Even Odd vs unprec test and hermiticity checks
in tests/Grid_any_evenodd.cc; will work on inversion tests shortly.
2015-06-03 09:36:26 +01:00
Peter Boyle
1fcacef239 Mobius Caley form, Mobius Zolotarev operators. Pass Even Odd vs unprec test and hermiticity checks
in tests/Grid_any_evenodd.cc; will work on inversion tests shortly.
2015-06-03 09:36:26 +01:00
Peter Boyle
2583570e17 Domain wall fermions now invert ; have the basis set up for
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx        Representation               Kernel.

All are done with space-time taking part in checkerboarding, Ls uncheckerboarded

Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i)  Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitan
vi) Mdag is the adjoint of M between stochastic vectors.

That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
Peter Boyle
3845f267cb Domain wall fermions now invert ; have the basis set up for
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx        Representation               Kernel.

All are done with space-time taking part in checkerboarding, Ls uncheckerboarded

Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i)  Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitan
vi) Mdag is the adjoint of M between stochastic vectors.

That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00