Azusa Yamaguchi
6b8bdf0c6b
Peek poke colour/spin/complex and trace transpose support
2015-06-14 01:00:11 +01:00
Azusa Yamaguchi
68b82ddd99
const safety
2015-06-14 00:59:50 +01:00
Peter Boyle
b92060f511
5d OpDir direction interface refers to the 5d dims, not 4d to present a
...
sensible and consistent external interface.
2015-06-09 22:41:59 +01:00
Peter Boyle
c133974d67
g5 and g5R5 hermitian are now differentiated
2015-06-09 22:40:58 +01:00
Peter Boyle
a73a1c1bc1
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/Make.inc
2015-06-09 10:27:10 +01:00
Peter Boyle
1e5b015ee3
Some unary ops and coarse grid support
2015-06-09 10:26:19 +01:00
neo
48bf4878c1
Experimental support for ARM
2015-06-09 15:46:21 +09:00
Peter Boyle
d6f1ddf99c
Conjugate residual algorithm; some more unary functions
2015-06-08 12:04:59 +01:00
Peter Boyle
1a05882d7c
Conjugate residual added
2015-06-05 18:16:25 +01:00
Azusa Yamaguchi
351c2905f5
Compile fix
2015-06-05 10:29:42 +01:00
Azusa Yamaguchi
ee3031c914
Endif terminated
2015-06-05 10:19:42 +01:00
Azusa Yamaguchi
58a4f32298
merge to the head
2015-06-05 10:15:31 +01:00
Azusa Yamaguchi
ac504bea6c
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-05 10:04:46 +01:00
Azusa Yamaguchi
94ea84d83f
Adding some wilson loop support
2015-06-05 10:02:36 +01:00
Peter Boyle
63a61fcc2a
PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
...
and hermiticity tests.
2015-06-04 13:28:37 +01:00
neo
3055d2cf2c
Addedd Ta functionality to the tensor types
...
Merge remote-tracking branch 'upstream/master'
Conflicts:
configure
2015-06-04 18:11:32 +09:00
Peter Boyle
dd1f5dd966
CG Tests work for wilson kernel cont frac zolo and tanh
2015-06-04 06:02:00 +01:00
Peter Boyle
a088a65656
Implementing the Hw kernel continued fraction 5d overlap cases
2015-06-04 00:23:16 +01:00
Peter Boyle
03f4fde468
First pass at continued fraction; solver and even odd decomposition tests pass.
...
Have to make ContFrac class virtual and derive end non-abstract actions for the particular
cases.
2015-06-04 00:00:45 +01:00
Peter Boyle
1d0df449e8
Reorganise of file naming
2015-06-03 12:47:05 +01:00
Peter Boyle
a3b599ae30
Overlap Wilson Cayley tanh & zolo
2015-06-03 11:26:54 +01:00
Peter Boyle
260011670e
Scaled Shamir and Scaled Shamir Zolotarev aliases for special cases of Mobius.
2015-06-03 09:51:06 +01:00
Peter Boyle
1fcacef239
Mobius Caley form, Mobius Zolotarev operators. Pass Even Odd vs unprec test and hermiticity checks
...
in tests/Grid_any_evenodd.cc; will work on inversion tests shortly.
2015-06-03 09:36:26 +01:00
Peter Boyle
3845f267cb
Domain wall fermions now invert ; have the basis set up for
...
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded
Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitan
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
Peter Boyle
5644ab1e19
Large scale change to support 5d fermion formulations.
...
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
bfb1cd36e2
Strip out the dslash kernel implementation
2015-05-26 19:55:18 +01:00
Peter Boyle
840754dd42
Hand unrolled version of dslash in a separate class.
...
Useful to compare; raises Intel compiler from 9GFlop/s to 17.5 Gflops.
on ivybridge core. Raises Clang form 14.5 to 17.5
2015-05-26 19:54:03 +01:00
Peter Boyle
ea3240ad55
Better EO support letting Schur solver work
2015-05-25 13:46:28 +01:00
Peter Boyle
64fcbd0387
Improving even odd sector; lot of work and through required cleaning this
2015-05-23 09:34:16 +01:00
Peter Boyle
8220794c44
Optimisation...
2015-05-19 15:50:47 +01:00
Peter Boyle
4dba8522a1
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
...
not even SU(3) for now) gauge field. Convergence history is correctly indepdendent of decomposition
on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixed and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
2015-05-19 13:57:35 +01:00
neo
b4cd37276b
Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.
2015-05-18 16:48:14 +09:00
Peter Boyle
11cb3e9a01
Getting closer to having a wilson solver... introducing a first and untested
...
cut at Conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
Mike Clark, Tony Kennedy and my BFM package respectively since we know we will
need these. I wanted the structure of
algorithms/approx
algorithms/iterative
etc.. to start taking shape.
2015-05-18 07:47:05 +01:00
Peter Boyle
bf7ab0da7a
Updating preparing for solvers etc..
2015-05-16 23:35:08 +01:00
Peter Boyle
a0d041b522
Forces inlining upon icpc
2015-05-15 11:43:49 +01:00
Peter Boyle
e179828662
OMP dslash working
2015-05-13 10:59:22 +01:00
Peter Boyle
48f425d31c
I have made the Cshift work successfully with open mp threading in
...
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
6103c29ee3
Threading support rework.
...
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
paboyle
379943abf5
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
Peter Boyle
5555a852be
Lots of changes required to compile for MIC under ICPC
2015-05-10 23:29:21 +01:00
Peter Boyle
48b9692845
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
Peter Boyle
02ae26d091
Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.
...
This is a short term hack while I benchmark.
2015-05-10 15:25:23 +01:00
Peter Boyle
55ccb8ccf4
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
Peter Boyle
35d949cc17
Cleaned up for Linux
2015-05-05 22:09:22 +01:00
Peter Boyle
a98c01c86a
Integrated Lebesgue code and been playing with alternate implementations of the wilson dop without
...
any particular success in increasing the performance.
2015-04-30 16:39:06 +01:00
Peter Boyle
c72db6c6f6
Fixed the stencil sector and Wilson now agrees between stencil based implementation
...
and the cshift based implementation. Managed to reduce the volume of code in this
sector a little, but consolidation would be good, perhaps taking common
logic out into simple helper functions
2015-04-29 06:23:56 +01:00
Peter Boyle
25d523c0f4
Shaken out stencil to the point where I think wilson dslash is correct.
...
Need to audit code carefully, consolidate between stencil and cshift,
and then benchmark and optimise.
2015-04-28 08:11:59 +01:00
Peter Boyle
f159495a9d
Reworking CSHIFT and Stencil. Implementing Wilson and discovered rework is required
2015-04-27 13:45:07 +01:00
Peter Boyle
94f728bee4
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00
Peter Boyle
51f0da7b93
Starting the implementation of wilson; incomplete and committing non-functional code which
...
is not yet included from elsewhere or linked to the build system.
2015-04-25 14:33:02 +01:00