mirror of https://github.com/paboyle/Grid.git synced 2024-09-21 01:25:48 +01:00
Commit Graph

325 Commits

Author SHA1 Message Date
paboyle
165bffc2e7 Avx512 changes for assembler kernels 2016-03-26 22:25:45 -06:00
paboyle
644fd6d32e Build avx512 clean 2016-03-25 09:35:33 -07:00
paboyle
090e7aa930 Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
paboyle
325e745daa Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-02 07:04:03 -08:00
paboyle
61413565d0 Back off the inlined spin proj as not working 2016-03-02 07:03:09 -08:00
Antonin Portelli
497e7e4c53 BG/Q compatibility fix 2016-02-23 15:57:38 +00:00
Peter Boyle
6aeaf6f568 Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then
turned up problems on the BlueWaters Cray.

Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel,
and also to look at aggregating bigger writes for the parallel write.
Not sure what the home filesystem is.
2016-02-21 08:03:21 -06:00
neo
771235017d Adding smearing routines (development) 2016-02-19 15:30:41 +09:00
paboyle
3425751cb8 Missing return value 2016-02-19 01:06:03 +00:00
Peter Boyle
22422a84d9 Small problem in compressor fix 2016-02-17 19:03:09 -06:00
Peter Boyle
c9fadf97a5 Simplify the compressor interface again. 2016-02-17 18:16:45 -06:00
Peter Boyle
81395e85d1 Regressing to not overlap comms and compute because bluewaters, edison, and cori are so rubbish at it. 2016-02-16 13:56:44 -06:00
Peter Boyle
a0fc47c6f9 Cheaper implementation 2016-02-15 16:02:36 -06:00
paboyle
e2f73e3ead Updates for shmem 2016-02-10 16:50:32 -08:00
neo
6371676a75 Correcting some compilation errors for clang-sse 2016-02-10 11:37:03 +09:00
Jung
bd84c23298 definitions reconciled. 2016-01-25 16:30:59 -05:00
Jung
7aa8d5e8af Failing to compile, comparing with master 2016-01-25 16:03:02 -05:00
Jung
6012b0ec23 Checking in changes before changing to chulwoo-dec12-2015 2016-01-25 09:40:58 -05:00
Jung
411ac49dd7 GparityWilsonTM typedef added. Not yet tested
Conflicts:
	configure
	lib/qcd/action/fermion/WilsonKernels.h
2016-01-25 01:36:28 -05:00
Jung
5c57d4f403 Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
Conflicts:
	lib/qcd/action/fermion/WilsonKernels.h
2016-01-11 11:36:45 -05:00
paboyle
fc6ad65751 Pushed the overlap comms tweaks 2016-01-11 06:34:22 -08:00
paboyle
dafc74020c Overlap comms compute improvements in hand op kernels, and better timing from Edison and Cori 2016-01-10 16:54:27 -08:00
paboyle
d19321dfde Overlap comms compute changes 2016-01-10 19:20:16 +00:00
Jung
5924e5a562 Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
Conflicts:
	configure
	lib/qcd/action/Actions.h
	lib/qcd/action/fermion/WilsonKernels.h
2016-01-06 03:44:57 -05:00
paboyle
c99d748da6 Timing reports in benchmarks now reflect the asynch comms thread statistics 2016-01-04 14:42:16 +00:00
paboyle
02452afd36 Optional overlap of comms with compute 2016-01-04 14:18:40 +00:00
paboyle
331768dcff Added overlap comms compute mode 2016-01-03 01:38:11 +00:00
paboyle
aae8bf31a7 Global edit adding copyright and license info to every source file. 2016-01-02 14:51:32 +00:00
paboyle
5a80930dd2 Charge conjugation boundary conditions for gauge fields implemented as a policy
class, changing the nature of covariant Cshifts used in
plaquettes, rectangles and staples.

As a result same code is used for the plaq and rect action independent of the BC type.

Should probably isolate the BC in a separate class that Gimpl takes as a template param.
Do the same with smearing policies.

This would then allow composition of BC with smearing etc....
2016-01-02 13:37:25 +00:00
paboyle
841a37f941 Fix to WilsonCompressor that fixes a bug in comms phase due to the sign change on gamma
matrix in hopping term.
Add logging of time spent in CG.
2015-12-29 23:49:41 +00:00
paboyle
e108e708a3 Wilson TM tests and compiles in 2015-12-17 23:06:33 +00:00
paboyle
67ccb043f1 Added TM fermions for DSDR etc.. 2015-12-17 22:34:28 +00:00
Jung
eb1759d7ea Added Gparity instantiation to no HANDOPT case
deleted configure (as intended?)
2015-12-16 00:04:09 -05:00
paboyle
34a0fde2ad Fixes to fermion force terms after sign of gamma_mu (0...3) change.
Thought I had already committed these.

Believe I have got the Gparity fermion force working.

* tests/Test_gpdwf_force.cc     -- correctly predicts dS for two flavour pseudofermion
                                   based on a small dt update of U field.

* tests/Test_hmc_EODWFRatio_Gparity.cc -- ran 1 trajectory on 8^4 with dH=0.21.

Need to accumulate a full plaquette log to believe fully which will take some hours of run time.
2015-12-15 23:14:12 +00:00
Jung
bc34b7e808 Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
Conflicts:
	lib/qcd/action/fermion/WilsonKernels.h
	tests/Make.inc
2015-12-15 11:11:59 -05:00
Jung
284453c5e9 Added gparity mobius defs, added params to ScaledShamir
checking in before pulling master
2015-12-14 12:15:06 -05:00
paboyle
3ce10aa975 Fix a regression failure on Mobius; chroma regression added 2015-12-10 22:55:00 +00:00
Jung
f2b4edc090 Fixes for Gparity comparison with CPS (Instantiation, Gamma matrix convention) 2015-12-07 02:04:57 -05:00
paboyle
b2c02a6106 Runs fastest on cori 2015-11-28 16:58:16 -08:00
paboyle
e9ff25b06b Small threading change makes a difference on Cori. 2015-11-07 00:07:05 -08:00
paboyle
05a7029600 Stencil change 2015-11-07 00:06:31 -08:00
paboyle
899ca41cb8 Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
	lib/qcd/action/fermion/WilsonFermion5D.cc
2015-11-06 03:50:04 -08:00
paboyle
d29b4c1dee Assembler files 2015-11-06 03:48:48 -08:00
paboyle
a2ff068e29 Asm and threading for many core 2015-11-06 03:47:14 -08:00
paboyle
17af18dcab Changes for AVX512 assembler 2015-11-06 03:45:51 -08:00
Peter Boyle
28022755ae Stencil class name global change to StencilImpl typedef 2015-11-06 05:30:17 -06:00
paboyle
1159de165c Asm option for AVX512 2015-11-05 22:04:51 -08:00
paboyle
16c7993434 Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
	lib/simd/Grid_avx512.h
	lib/simd/Grid_imci.h
2015-11-04 03:32:10 -08:00
paboyle
4e65ad21ac Adding a routine for AVX512 / IMCI with explicit assembly implementations 2015-11-04 03:15:08 -08:00
Peter Boyle
abb23df83f formatting only 2015-11-04 10:00:27 +00:00
paboyle
1878bf97d0 Babbage fix 2015-09-30 16:04:01 -07:00
Peter Boyle
64d64d1ab6 Updating to modify non-inlining permute routines and hopefully get better reg use and
enhance performance.
2015-09-25 08:55:04 -07:00
Peter Boyle
2f38ebc446 Reintroducing the hand unrolled loops 2015-09-08 17:45:30 +01:00
Peter Boyle
a842a6c94d One flavour rational unprec added; untested but does compile.
Moving param structs into a single header for later connection to file I/O using
macromagic.h
2015-08-18 14:40:08 +01:00
Peter Boyle
f0e32f12cf Merge branch 'master' of https://github.com/paboyle/Grid 2015-08-15 23:59:04 +01:00
Peter Boyle
55cfc89459 * Finished the template/policy style introduction of gparity, except the gparity force terms.
So valence sector looks ok.

FermionOperatorImpl.h provides the policy classes.

Expect HMC will introduce a smearing policy and a fermion representation change policy template
param. Will also probably need multi-precision work.

* HMC is running even-odd and non-checkerboarded (checked 4^4 wilson fermion/wilson gauge).

There appears to be a bug in the multi-level integrator -- <e-dH> passes with single level but
not with multi-level.

In any case there looks to be quite a bit to clean up.

This is the "const det" style implementation that is not yet appropriate for clover since
it assumes that Mee is independent of the gauge fields. Easily fixed in future.
2015-08-15 23:25:49 +01:00
Peter Boyle
ba8c09a58e Reorganising the Fermion interface 2015-08-14 14:16:45 +01:00
Peter Boyle
59d66eb17a Gparity works now even if simd distributed in a Gparity twist direction.
Tested by doubling lattice in t-direction.
2015-08-14 12:57:42 +01:00
Peter Boyle
4dc7c36aa8 Gparity works now even if simd distributed in a Gparity twist direction.
Tested by doubling lattice in t-direction.
2015-08-14 12:57:42 +01:00
Peter Boyle
028e2061e0 Gparity valence test now working.
Interface in FermionOperator will change a lot in future
2015-08-14 00:01:04 +01:00
Peter Boyle
7d3512ab21 Gparity valence test now working.
Interface in FermionOperator will change a lot in future
2015-08-14 00:01:04 +01:00
Peter Boyle
8a0be42080 Gparity test added; partial implementation -- this is Chris K's doubled lattice only
and have to regress this with the 2 flavour implementation.
2015-08-12 09:49:33 +01:00
Peter Boyle
9183380946 Gparity test added; partial implementation -- this is Chris K's doubled lattice only
and have to regress this with the 2 flavour implementation.
2015-08-12 09:49:33 +01:00
Peter Boyle
26f5ee0621 Header 2015-08-11 06:23:38 +01:00
Peter Boyle
f165b1a120 Header 2015-08-11 06:23:38 +01:00
Peter Boyle
881acaa065 Gparity modifications in the Gparity compressor variant. 2015-08-11 06:22:20 +01:00
Peter Boyle
0a9ebac514 Gparity modifications in the Gparity compressor variant. 2015-08-11 06:22:20 +01:00
Peter Boyle
aeb7442d8f Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BC's and such like
allowing for example G-parity which is important for K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle
84a66476ab Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BC's and such like
allowing for example G-parity which is important for K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle
2994274267 Changes making force term test for DWF pass. 2015-08-01 22:06:07 +09:00
Peter Boyle
2157a6919a Changes making force term test for DWF pass. 2015-08-01 22:06:07 +09:00
Peter Boyle
1d0be956ae Jackson smoothed chebyshev and (untested) completion of force terms
for Cayley, Partial and Cont fraction dwf and overlap.
have even odd and unprec forces.
2015-08-01 05:58:35 +09:00
Peter Boyle
1d67d29183 Jackson smoothed chebyshev and (untested) completion of force terms
for Cayley, Partial and Cont fraction dwf and overlap.
have even odd and unprec forces.
2015-08-01 05:58:35 +09:00
Peter Boyle
cc4ca48d13 Two flavour HMC for Wilson/Wilson is conserving energy.
Still to check plaq and <e(-dH)>, but nevertheless this is
progress
2015-07-29 17:53:39 +09:00
Peter Boyle
4fe110bd07 Two flavour HMC for Wilson/Wilson is conserving energy.
Still to check plaq and <e(-dH)>, but nevertheless this is
progress
2015-07-29 17:53:39 +09:00
Peter Boyle
bc09d7c3bd Committing incomplete work for parameter file I/O.
MacroMagic.h is central. Guido and I plan to move
over to generating virtual (XML, JSON, YAML, text, binary) encoding
from macro based system.
2015-07-27 18:32:28 +09:00
Peter Boyle
4cc2ef84d3 Committing incomplete work for parameter file I/O.
MacroMagic.h is central. Guido and I plan to move
over to generating virtual (XML, JSON, YAML, text, binary) encoding
from macro based system.
2015-07-27 18:32:28 +09:00
Peter Boyle
d7e6b65a76 Elemental force term for Wilson dslash added and tests thereof passing.
Now need to construct pseudofermion two flavour, ratio, one flavour, ratio
action fragments.
2015-07-26 10:54:38 +09:00
Peter Boyle
d9d4c5916a Elemental force term for Wilson dslash added and tests thereof passing.
Now need to construct pseudofermion two flavour, ratio, one flavour, ratio
action fragments.
2015-07-26 10:54:38 +09:00
Peter Boyle
28bdc90908 Sizable improvement in multigrid for unsquared.
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01

Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
Peter Boyle
d1afebf71e Sizable improvement in multigrid for unsquared.
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01

Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
Peter Boyle
8925845684 Merge 2015-07-21 13:56:22 +09:00
Peter Boyle
4e94ddad46 Merge 2015-07-21 13:56:22 +09:00
Peter Boyle
c7925e5c9b Small pretty layout change 2015-07-21 13:53:23 +09:00
Peter Boyle
8d654a86de Small pretty layout change 2015-07-21 13:53:23 +09:00
Peter Boyle
f41c7dffef Big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
near the bleeding edge I guess
2015-06-30 15:17:27 +01:00
Peter Boyle
03ca506a3d Big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
near the bleeding edge I guess
2015-06-30 15:17:27 +01:00
Peter Boyle
74e397b29c big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
near the bleeding edge I guess
2015-06-30 15:03:11 +01:00
Peter Boyle
98c817df1b big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
near the bleeding edge I guess
2015-06-30 15:03:11 +01:00
Peter Boyle
dec68e5c0e Some small steps towards a multigrid 2015-06-22 12:49:44 +01:00
Peter Boyle
a17684ebe2 Some small steps towards a multigrid 2015-06-22 12:49:44 +01:00
Azusa Yamaguchi
6cebd006d4 Merge branch 'master' of https://github.com/paboyle/Grid 2015-06-20 14:22:29 +01:00
Azusa Yamaguchi
dc7c77e1d5 Merge branch 'master' of https://github.com/paboyle/Grid 2015-06-20 14:22:29 +01:00
Peter Boyle
6abbd35d81 5d OpDir direction interface refers to the 5d dims, not 4d to present a
sensible and consistent external interface.
2015-06-09 22:41:59 +01:00
Peter Boyle
b92060f511 5d OpDir direction interface refers to the 5d dims, not 4d to present a
sensible and consistent external interface.
2015-06-09 22:41:59 +01:00
Peter Boyle
c7152c520a g5 and g5R5 hermitian are now differentiated 2015-06-09 22:40:58 +01:00
Peter Boyle
c133974d67 g5 and g5R5 hermitian are now differentiated 2015-06-09 22:40:58 +01:00
Peter Boyle
506dfd1517 Some unary ops and coarse grid support 2015-06-09 10:26:19 +01:00
Peter Boyle
1e5b015ee3 Some unary ops and coarse grid support 2015-06-09 10:26:19 +01:00
Peter Boyle
9e7035f5dc Conjugate residual algorithm; some more unary functions 2015-06-08 12:04:59 +01:00
Peter Boyle
d6f1ddf99c Conjugate residual algorithm; some more unary functions 2015-06-08 12:04:59 +01:00
Peter Boyle
50e8b2160e Conjugate residual added 2015-06-05 18:16:25 +01:00
Peter Boyle
1a05882d7c Conjugate residual added 2015-06-05 18:16:25 +01:00
Peter Boyle
b9e9777912 PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
and hermiticity tests.
2015-06-04 13:28:37 +01:00
Peter Boyle
63a61fcc2a PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
and hermiticity tests.
2015-06-04 13:28:37 +01:00
neo
b9edadc53e Added Ta functionality to the tensor types
Merge remote-tracking branch 'upstream/master'

Conflicts:
	configure
2015-06-04 18:11:32 +09:00
neo
3055d2cf2c Added Ta functionality to the tensor types
Merge remote-tracking branch 'upstream/master'

Conflicts:
	configure
2015-06-04 18:11:32 +09:00
Peter Boyle
37aa74dfd2 CG Tests work for wilson kernel cont frac zolo and tanh 2015-06-04 06:02:00 +01:00
Peter Boyle
dd1f5dd966 CG Tests work for wilson kernel cont frac zolo and tanh 2015-06-04 06:02:00 +01:00
Peter Boyle
c327019574 Implementing the Hw kernel continued fraction 5d overlap cases 2015-06-04 00:23:16 +01:00
Peter Boyle
a088a65656 Implementing the Hw kernel continued fraction 5d overlap cases 2015-06-04 00:23:16 +01:00
Peter Boyle
50bd293527 First pass at continued fraction; solver and even odd decomposition tests pass.
Have to make ContFrac class virtual and derive end non-abstract actions for the particular
cases.
2015-06-04 00:00:45 +01:00
Peter Boyle
03f4fde468 First pass at continued fraction; solver and even odd decomposition tests pass.
Have to make ContFrac class virtual and derive end non-abstract actions for the particular
cases.
2015-06-04 00:00:45 +01:00
Peter Boyle
4bcc319e11 Reorganisation of file naming 2015-06-03 12:47:05 +01:00
Peter Boyle
1d0df449e8 Reorganisation of file naming 2015-06-03 12:47:05 +01:00
Peter Boyle
8fe3d4f971 Overlap Wilson Cayley tanh & zolo 2015-06-03 11:26:54 +01:00
Peter Boyle
a3b599ae30 Overlap Wilson Cayley tanh & zolo 2015-06-03 11:26:54 +01:00
Peter Boyle
343d039b37 Scaled Shamir and Scaled Shamir Zolotarev aliases for special cases of Mobius. 2015-06-03 09:51:06 +01:00
Peter Boyle
260011670e Scaled Shamir and Scaled Shamir Zolotarev aliases for special cases of Mobius. 2015-06-03 09:51:06 +01:00
Peter Boyle
5916386242 Mobius Cayley form, Mobius Zolotarev operators. Pass Even Odd vs unprec test and hermiticity checks
in tests/Grid_any_evenodd.cc; will work on inversion tests shortly.
2015-06-03 09:36:26 +01:00
Peter Boyle
1fcacef239 Mobius Cayley form, Mobius Zolotarev operators. Pass Even Odd vs unprec test and hermiticity checks
in tests/Grid_any_evenodd.cc; will work on inversion tests shortly.
2015-06-03 09:36:26 +01:00
Peter Boyle
2583570e17 Domain wall fermions now invert ; have the basis set up for
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx        Representation               Kernel.

All are done with space-time taking part in checkerboarding, Ls uncheckerboarded

Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i)  Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitian
vi) Mdag is the adjoint of M between stochastic vectors.

That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
Peter Boyle
3845f267cb Domain wall fermions now invert ; have the basis set up for
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx        Representation               Kernel.

All are done with space-time taking part in checkerboarding, Ls uncheckerboarded

Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i)  Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitian
vi) Mdag is the adjoint of M between stochastic vectors.

That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
Peter Boyle
a75b6f6e78 Large scale change to support 5d fermion formulations.
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
5644ab1e19 Large scale change to support 5d fermion formulations.
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00