Azusa Yamaguchi
ae0873bc77
First cut at SUN support for quenched updates
2015-06-14 01:28:54 +01:00
Azusa Yamaguchi
87cc0d4ca3
Peek poke colour/spin/complex and trace transpose support
2015-06-14 01:00:11 +01:00
Azusa Yamaguchi
66e5718610
const safety
2015-06-14 00:59:50 +01:00
Peter Boyle
6abbd35d81
5d OpDir direction interface refers to the 5d dims, not 4d to present a
sensible and consistent external interface.
2015-06-09 22:41:59 +01:00
Peter Boyle
c7152c520a
g5 and g5R5 hermitian are now differentiated
2015-06-09 22:40:58 +01:00
Peter Boyle
d8ddec86f7
Merge branch 'master' of https://github.com/paboyle/Grid
Conflicts:
lib/Make.inc
2015-06-09 10:27:10 +01:00
Peter Boyle
506dfd1517
Some unary ops and coarse grid support
2015-06-09 10:26:19 +01:00
neo
6b8fe04054
Experimental support for ARM
2015-06-09 15:46:21 +09:00
Peter Boyle
9e7035f5dc
Conjugate residual algorithm; some more unary functions
2015-06-08 12:04:59 +01:00
Peter Boyle
50e8b2160e
Conjugate residual added
2015-06-05 18:16:25 +01:00
Azusa Yamaguchi
ad18df92d0
Compile fix
2015-06-05 10:29:42 +01:00
Azusa Yamaguchi
1d7f9567ee
Endif terminated
2015-06-05 10:19:42 +01:00
Azusa Yamaguchi
a8b86e747b
merge to the head
2015-06-05 10:15:31 +01:00
Azusa Yamaguchi
c05fe2706c
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-05 10:04:46 +01:00
Azusa Yamaguchi
58cdcbb5e4
Adding some wilson loop support
2015-06-05 10:02:36 +01:00
Peter Boyle
b9e9777912
PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
and hermiticity tests.
2015-06-04 13:28:37 +01:00
neo
b9edadc53e
Added Ta functionality to the tensor types
Merge remote-tracking branch 'upstream/master'
Conflicts:
configure
2015-06-04 18:11:32 +09:00
Peter Boyle
37aa74dfd2
CG Tests work for wilson kernel cont frac zolo and tanh
2015-06-04 06:02:00 +01:00
Peter Boyle
c327019574
Implementing the Hw kernel continued fraction 5d overlap cases
2015-06-04 00:23:16 +01:00
Peter Boyle
50bd293527
First pass at continued fraction; solver and even odd decomposition tests pass.
Have to make ContFrac class virtual and derive end non-abstract actions for the particular
cases.
2015-06-04 00:00:45 +01:00
Peter Boyle
4bcc319e11
Reorganisation of file naming
2015-06-03 12:47:05 +01:00
Peter Boyle
8fe3d4f971
Overlap Wilson Cayley tanh & zolo
2015-06-03 11:26:54 +01:00
Peter Boyle
343d039b37
Scaled Shamir and Scaled Shamir Zolotarev aliases for special cases of Mobius.
2015-06-03 09:51:06 +01:00
Peter Boyle
5916386242
Mobius Cayley form, Mobius Zolotarev operators. Pass Even Odd vs unprec test and hermiticity checks
in tests/Grid_any_evenodd.cc; will work on inversion tests shortly.
2015-06-03 09:36:26 +01:00
Peter Boyle
2583570e17
Domain wall fermions now invert; have the basis set up for
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded
Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitian
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
Peter Boyle
a75b6f6e78
Large scale change to support 5d fermion formulations.
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
5e72e4c0d9
Strip out the dslash kernel implementation
2015-05-26 19:55:18 +01:00
Peter Boyle
a32ac287bb
Hand unrolled version of dslash in a separate class.
Useful to compare; raises Intel compiler from 9 GFlop/s to 17.5 GFlop/s
on an ivybridge core. Raises Clang from 14.5 to 17.5.
2015-05-26 19:54:03 +01:00
Peter Boyle
1a9841a0f1
Better EO support letting Schur solver work
2015-05-25 13:46:28 +01:00
Peter Boyle
65f2e6b269
Improving even odd sector; a lot of work and thought was required to clean this up
2015-05-23 09:34:16 +01:00
Peter Boyle
46ab8edf30
Optimisation...
2015-05-19 15:50:47 +01:00
Peter Boyle
ffc00caea3
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixing, and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
2015-05-19 13:57:35 +01:00
neo
cee363e28c
Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.
2015-05-18 16:48:14 +09:00
Peter Boyle
d0e4673a3f
Getting closer to having a wilson solver... introducing a first and untested
cut at Conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
Mike Clark, Tony Kennedy and my BFM package respectively since we know we will
need these. I wanted the structure of
algorithms/approx
algorithms/iterative
etc.. to start taking shape.
2015-05-18 07:47:05 +01:00
Peter Boyle
dc6b6bdc96
Updating; preparing for solvers etc.
2015-05-16 23:35:08 +01:00
Peter Boyle
0e7945fe54
Forces inlining upon icpc
2015-05-15 11:43:49 +01:00
Peter Boyle
0097b81778
OMP dslash working
2015-05-13 10:59:22 +01:00
Peter Boyle
541d52ab97
I have made the Cshift work successfully with OpenMP threading in
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
c6baa3e657
Threading support rework.
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
paboyle
b42453d1fd
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
Peter Boyle
2203c6e597
Lots of changes required to compile for MIC under ICPC
2015-05-10 23:29:21 +01:00
Peter Boyle
4da2c2ea00
Merge branch 'master' of https://github.com/paboyle/Grid
Conflicts:
lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
Peter Boyle
dc7132af71
Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.
This is a short term hack while I benchmark.
2015-05-10 15:25:23 +01:00
Peter Boyle
52403d587c
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
Peter Boyle
cdd5cdeda2
Cleaned up for Linux
2015-05-05 22:09:22 +01:00
Peter Boyle
c0ead94791
Integrated Lebesgue code and have been playing with alternate implementations of the wilson dop without
any particular success in increasing the performance.
2015-04-30 16:39:06 +01:00
Peter Boyle
dcc23faa4a
Fixed the stencil sector and Wilson now agrees between stencil based implementation
and the cshift based implementation. Managed to reduce the volume of code in this
sector a little, but consolidation would be good, perhaps taking common
logic out into simple helper functions
2015-04-29 06:23:56 +01:00
Peter Boyle
b0485894b3
Shaken out stencil to the point where I think wilson dslash is correct.
Need to audit code carefully, consolidate between stencil and cshift,
and then benchmark and optimise.
2015-04-28 08:11:59 +01:00
Peter Boyle
0b7d389258
Reworking CSHIFT and Stencil. Implementing Wilson, and discovered that rework is required
2015-04-27 13:45:07 +01:00
Peter Boyle
35cfef2129
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00