Peter Boyle
5e370db6c5
Sizable improvement in multigrid for unsquared.
...
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01
Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
Peter Boyle
28bdc90908
Sizable improvement in multigrid for unsquared.
...
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01
Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
Peter Boyle
d1afebf71e
Sizable improvement in multigrid for unsquared.
...
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01
Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
neo
479912a5ed
Merge remote-tracking branch 'upstream/master'
2015-07-21 17:17:50 +09:00
neo
d01310383f
Merge remote-tracking branch 'upstream/master'
2015-07-21 17:17:50 +09:00
neo
5fc6af1c77
Merge remote-tracking branch 'upstream/master'
2015-07-21 17:17:50 +09:00
Peter Boyle
987801c86d
Merge
2015-07-21 13:56:22 +09:00
Peter Boyle
8925845684
Merge
2015-07-21 13:56:22 +09:00
Peter Boyle
4e94ddad46
Merge
2015-07-21 13:56:22 +09:00
Peter Boyle
9651ab661b
Small pretty layout change
2015-07-21 13:53:23 +09:00
Peter Boyle
c7925e5c9b
Small pretty layout change
2015-07-21 13:53:23 +09:00
Peter Boyle
8d654a86de
Small pretty layout change
2015-07-21 13:53:23 +09:00
Peter Boyle
fb65953d82
This was needed to compile on gcc
2015-07-21 13:52:59 +09:00
Peter Boyle
9d18773fbc
This was needed to compile on gcc
2015-07-21 13:52:59 +09:00
Peter Boyle
df2aac01f4
This was needed to compile on gcc
2015-07-21 13:52:59 +09:00
neo
ab916d80fd
More NEON functionalities
2015-07-21 11:52:15 +09:00
neo
7343a95772
More NEON functionalities
2015-07-21 11:52:15 +09:00
neo
9adaeb061a
More NEON functionalities
2015-07-21 11:52:15 +09:00
neo
c431816393
Cleaning up files for HMC
2015-07-07 14:59:37 +09:00
neo
48ae886c32
Cleaning up files for HMC
2015-07-07 14:59:37 +09:00
neo
97afe4125f
Cleaning up files for HMC
2015-07-07 14:59:37 +09:00
neo
1e9317e5cf
Simplifying HMC syntax for the final user
2015-07-06 18:32:20 +09:00
neo
19a1ffedcc
Simplifying HMC syntax for the final user
2015-07-06 18:32:20 +09:00
neo
0f21c38ff8
Simplifying HMC syntax for the final user
2015-07-06 18:32:20 +09:00
neo
510f55ba30
Rearranging files in hmc
2015-07-06 16:46:43 +09:00
neo
32e6887d5f
Rearranging files in hmc
2015-07-06 16:46:43 +09:00
neo
fa42b652e5
Rearranging files in hmc
2015-07-06 16:46:43 +09:00
neo
f95db88d19
Added minimum norm integrator
...
Little rearrangement of HMC and integrator classes
2015-07-06 16:17:32 +09:00
neo
1991852025
Added minimum norm integrator
...
Little rearrangement of HMC and integrator classes
2015-07-06 16:17:32 +09:00
neo
68fe0769a1
Added minimum norm integrator
...
Little rearrangement of HMC and integrator classes
2015-07-06 16:17:32 +09:00
neo
12e1682a87
HMC for Wilson Gauge action works
...
Fixed bug in momenta generation
2015-07-06 12:58:49 +09:00
neo
2718038977
HMC for Wilson Gauge action works
...
Fixed bug in momenta generation
2015-07-06 12:58:49 +09:00
neo
808f5820fa
HMC for Wilson Gauge action works
...
Fixed bug in momenta generation
2015-07-06 12:58:49 +09:00
neo
6261770f59
Debugged vector version of ProjectOnGroup
2015-07-06 02:24:58 +09:00
neo
62d8952c0a
Debugged vector version of ProjectOnGroup
2015-07-06 02:24:58 +09:00
neo
0ffcdf6204
Debugged vector version of ProjectOnGroup
2015-07-06 02:24:58 +09:00
neo
7a4ed7a867
HMC ready but untested
2015-07-04 17:47:50 +09:00
neo
b1f94fa292
HMC ready but untested
2015-07-04 17:47:50 +09:00
neo
e6087e1820
HMC ready but untested
2015-07-04 17:47:50 +09:00
neo
250965c6ca
More progress in the HMC construction
2015-07-04 02:43:14 +09:00
neo
30c9dc473d
More progress in the HMC construction
2015-07-04 02:43:14 +09:00
neo
59be55c0ab
More progress in the HMC construction
2015-07-04 02:43:14 +09:00
neo
55f05a778f
Skeleton of HMC/Integrators
2015-07-03 16:51:41 +09:00
neo
9655d43017
Skeleton of HMC/Integrators
2015-07-03 16:51:41 +09:00
neo
ab3ad78ece
Skeleton of HMC/Integrators
2015-07-03 16:51:41 +09:00
Peter Boyle
e164ed6f12
Big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
...
near the bleeding edge I guess
2015-06-30 15:17:27 +01:00
Peter Boyle
f41c7dffef
Big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
...
near the bleeding edge I guess
2015-06-30 15:17:27 +01:00
Peter Boyle
03ca506a3d
Big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
...
near the bleeding edge I guess
2015-06-30 15:17:27 +01:00
Peter Boyle
95ecf81d42
big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
...
near the bleeding edge I guess
2015-06-30 15:03:11 +01:00
Peter Boyle
74e397b29c
big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
...
near the bleeding edge I guess
2015-06-30 15:03:11 +01:00
Peter Boyle
98c817df1b
big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
...
near the bleeding edge I guess
2015-06-30 15:03:11 +01:00
Peter Boyle
a4369e1db6
big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
...
near the bleeding edge I guess
2015-06-30 15:00:19 +01:00
Peter Boyle
7de5ccb879
big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
...
near the bleeding edge I guess
2015-06-30 15:00:19 +01:00
Peter Boyle
c20fdd45a5
big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting too
...
near the bleeding edge I guess
2015-06-30 15:00:19 +01:00
Peter Boyle
5f8f0bc792
Some small steps towards a multigrid
2015-06-22 12:49:44 +01:00
Peter Boyle
dec68e5c0e
Some small steps towards a multigrid
2015-06-22 12:49:44 +01:00
Peter Boyle
a17684ebe2
Some small steps towards a multigrid
2015-06-22 12:49:44 +01:00
Azusa Yamaguchi
a265765319
Variable preconditioned GCR with restarting.
...
Orthogonalisation depth and restart frequency is controllable via constructor
2015-06-21 10:58:46 +01:00
Azusa Yamaguchi
945bb93e48
Variable preconditioned GCR with restarting.
...
Orthogonalisation depth and restart frequency is controllable via constructor
2015-06-21 10:58:46 +01:00
Azusa Yamaguchi
3b4118f33e
Variable preconditioned GCR with restarting.
...
Orthogonalisation depth and restart frequency is controllable via constructor
2015-06-21 10:58:46 +01:00
Peter Boyle
eace9051e8
Merge
...
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-20 22:25:31 +01:00
Peter Boyle
bcf1d5160f
Merge
...
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-20 22:25:31 +01:00
Peter Boyle
c7d77dfa0f
Merge
...
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-20 22:25:31 +01:00
Peter Boyle
aba5c8595a
Patches for beginnings of an overlap multigrid
2015-06-20 22:22:56 +01:00
Peter Boyle
6ad96f7383
Patches for beginnings of an overlap multigrid
2015-06-20 22:22:56 +01:00
Peter Boyle
b4a6dbfa65
Patches for beginnings of an overlap multigrid
2015-06-20 22:22:56 +01:00
Azusa Yamaguchi
cb92390825
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-20 14:22:29 +01:00
Azusa Yamaguchi
6cebd006d4
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-20 14:22:29 +01:00
Azusa Yamaguchi
dc7c77e1d5
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-20 14:22:29 +01:00
Azusa Yamaguchi
74845cb3dc
Quenched works for wilson gauge
2015-06-16 14:17:11 +01:00
Azusa Yamaguchi
79a9f8b9c9
Quenched works for wilson gauge
2015-06-16 14:17:11 +01:00
Azusa Yamaguchi
18d0437f8d
Quenched works for wilson gauge
2015-06-16 14:17:11 +01:00
Azusa Yamaguchi
c945041067
uninitialised bug fix
2015-06-16 14:07:05 +01:00
Azusa Yamaguchi
173b31ce05
uninitialised bug fix
2015-06-16 14:07:05 +01:00
Azusa Yamaguchi
4e7300b68d
uninitialised bug fix
2015-06-16 14:07:05 +01:00
Azusa Yamaguchi
54964dd4bb
First cut at SUN support for quenched updates
2015-06-14 01:28:54 +01:00
Azusa Yamaguchi
ae0873bc77
First cut at SUN support for quenched updates
2015-06-14 01:28:54 +01:00
Azusa Yamaguchi
55d7483608
First cut at SUN support for quenched updates
2015-06-14 01:28:54 +01:00
Azusa Yamaguchi
e529dd9696
Peek poke colour/spin/complex and trace transpose support
2015-06-14 01:00:11 +01:00
Azusa Yamaguchi
87cc0d4ca3
Peek poke colour/spin/complex and trace transpose support
2015-06-14 01:00:11 +01:00
Azusa Yamaguchi
6b8bdf0c6b
Peek poke colour/spin/complex and trace transpose support
2015-06-14 01:00:11 +01:00
Azusa Yamaguchi
610450bc0e
const safety
2015-06-14 00:59:50 +01:00
Azusa Yamaguchi
66e5718610
const safety
2015-06-14 00:59:50 +01:00
Azusa Yamaguchi
68b82ddd99
const safety
2015-06-14 00:59:50 +01:00
Peter Boyle
4963f7356a
5d OpDir direction interface refers to the 5d dims, not 4d to present a
...
sensible and consistent external interface.
2015-06-09 22:41:59 +01:00
Peter Boyle
6abbd35d81
5d OpDir direction interface refers to the 5d dims, not 4d to present a
...
sensible and consistent external interface.
2015-06-09 22:41:59 +01:00
Peter Boyle
b92060f511
5d OpDir direction interface refers to the 5d dims, not 4d to present a
...
sensible and consistent external interface.
2015-06-09 22:41:59 +01:00
Peter Boyle
708d4f7533
g5 and g5R5 hermitian are now differentiated
2015-06-09 22:40:58 +01:00
Peter Boyle
c7152c520a
g5 and g5R5 hermitian are now differentiated
2015-06-09 22:40:58 +01:00
Peter Boyle
c133974d67
g5 and g5R5 hermitian are now differentiated
2015-06-09 22:40:58 +01:00
Peter Boyle
e8b43944e7
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/Make.inc
2015-06-09 10:27:10 +01:00
Peter Boyle
d8ddec86f7
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/Make.inc
2015-06-09 10:27:10 +01:00
Peter Boyle
a73a1c1bc1
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/Make.inc
2015-06-09 10:27:10 +01:00
Peter Boyle
1048304f30
Some unary ops and coarse grid support
2015-06-09 10:26:19 +01:00
Peter Boyle
506dfd1517
Some unary ops and coarse grid support
2015-06-09 10:26:19 +01:00
Peter Boyle
1e5b015ee3
Some unary ops and coarse grid support
2015-06-09 10:26:19 +01:00
neo
744ac33e8b
Experimental support for ARM
2015-06-09 15:46:21 +09:00
neo
6b8fe04054
Experimental support for ARM
2015-06-09 15:46:21 +09:00
neo
48bf4878c1
Experimental support for ARM
2015-06-09 15:46:21 +09:00
Peter Boyle
b0873e7ed2
Conjugate residual algorithm; some more unary functions
2015-06-08 12:04:59 +01:00
Peter Boyle
9e7035f5dc
Conjugate residual algorithm; some more unary functions
2015-06-08 12:04:59 +01:00
Peter Boyle
d6f1ddf99c
Conjugate residual algorithm; some more unary functions
2015-06-08 12:04:59 +01:00
Peter Boyle
a263e78f8d
Conjugate residual added
2015-06-05 18:16:25 +01:00
Peter Boyle
50e8b2160e
Conjugate residual added
2015-06-05 18:16:25 +01:00
Peter Boyle
1a05882d7c
Conjugate residual added
2015-06-05 18:16:25 +01:00
Azusa Yamaguchi
5f33cc3a95
Compile fix
2015-06-05 10:29:42 +01:00
Azusa Yamaguchi
ad18df92d0
Compile fix
2015-06-05 10:29:42 +01:00
Azusa Yamaguchi
351c2905f5
Compile fix
2015-06-05 10:29:42 +01:00
Azusa Yamaguchi
cc5f518b21
Endif terminated
2015-06-05 10:19:42 +01:00
Azusa Yamaguchi
1d7f9567ee
Endif terminated
2015-06-05 10:19:42 +01:00
Azusa Yamaguchi
ee3031c914
Endif terminated
2015-06-05 10:19:42 +01:00
Azusa Yamaguchi
8f9627520b
merge to the head
2015-06-05 10:15:31 +01:00
Azusa Yamaguchi
a8b86e747b
merge to the head
2015-06-05 10:15:31 +01:00
Azusa Yamaguchi
58a4f32298
merge to the head
2015-06-05 10:15:31 +01:00
Azusa Yamaguchi
db84b19443
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-05 10:04:46 +01:00
Azusa Yamaguchi
c05fe2706c
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-05 10:04:46 +01:00
Azusa Yamaguchi
ac504bea6c
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-05 10:04:46 +01:00
Azusa Yamaguchi
7d984b9547
Adding some wilson loop support
2015-06-05 10:02:36 +01:00
Azusa Yamaguchi
58cdcbb5e4
Adding some wilson loop support
2015-06-05 10:02:36 +01:00
Azusa Yamaguchi
94ea84d83f
Adding some wilson loop support
2015-06-05 10:02:36 +01:00
Peter Boyle
7678fbd30d
PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
...
and hermiticity tests.
2015-06-04 13:28:37 +01:00
Peter Boyle
b9e9777912
PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
...
and hermiticity tests.
2015-06-04 13:28:37 +01:00
Peter Boyle
63a61fcc2a
PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
...
and hermiticity tests.
2015-06-04 13:28:37 +01:00
neo
bb73569fd6
Added Ta functionality to the tensor types
...
Merge remote-tracking branch 'upstream/master'
Conflicts:
configure
2015-06-04 18:11:32 +09:00
neo
b9edadc53e
Added Ta functionality to the tensor types
...
Merge remote-tracking branch 'upstream/master'
Conflicts:
configure
2015-06-04 18:11:32 +09:00
neo
3055d2cf2c
Added Ta functionality to the tensor types
...
Merge remote-tracking branch 'upstream/master'
Conflicts:
configure
2015-06-04 18:11:32 +09:00
Peter Boyle
9c1ab656d4
CG Tests work for wilson kernel cont frac zolo and tanh
2015-06-04 06:02:00 +01:00
Peter Boyle
37aa74dfd2
CG Tests work for wilson kernel cont frac zolo and tanh
2015-06-04 06:02:00 +01:00
Peter Boyle
dd1f5dd966
CG Tests work for wilson kernel cont frac zolo and tanh
2015-06-04 06:02:00 +01:00
Peter Boyle
1ad689e4d5
Implementing the Hw kernel continued fraction 5d overlap cases
2015-06-04 00:23:16 +01:00
Peter Boyle
c327019574
Implementing the Hw kernel continued fraction 5d overlap cases
2015-06-04 00:23:16 +01:00
Peter Boyle
a088a65656
Implementing the Hw kernel continued fraction 5d overlap cases
2015-06-04 00:23:16 +01:00
Peter Boyle
802e94e9ca
First pass at continued fraction; solver and even odd decomposition tests pass.
...
Have to make ContFrac class virtual and derive end non-abstract actions for the particular
cases.
2015-06-04 00:00:45 +01:00
Peter Boyle
50bd293527
First pass at continued fraction; solver and even odd decomposition tests pass.
...
Have to make ContFrac class virtual and derive end non-abstract actions for the particular
cases.
2015-06-04 00:00:45 +01:00
Peter Boyle
03f4fde468
First pass at continued fraction; solver and even odd decomposition tests pass.
...
Have to make ContFrac class virtual and derive end non-abstract actions for the particular
cases.
2015-06-04 00:00:45 +01:00
Peter Boyle
f9b070d64d
Reorganisation of file naming
2015-06-03 12:47:05 +01:00
Peter Boyle
4bcc319e11
Reorganisation of file naming
2015-06-03 12:47:05 +01:00
Peter Boyle
1d0df449e8
Reorganisation of file naming
2015-06-03 12:47:05 +01:00
Peter Boyle
6cb38dc5dc
Overlap Wilson Cayley tanh & zolo
2015-06-03 11:26:54 +01:00
Peter Boyle
8fe3d4f971
Overlap Wilson Cayley tanh & zolo
2015-06-03 11:26:54 +01:00
Peter Boyle
a3b599ae30
Overlap Wilson Cayley tanh & zolo
2015-06-03 11:26:54 +01:00
Peter Boyle
c659c76053
Scaled Shamir and Scaled Shamir Zolotarev aliases for special cases of Mobius.
2015-06-03 09:51:06 +01:00
Peter Boyle
343d039b37
Scaled Shamir and Scaled Shamir Zolotarev aliases for special cases of Mobius.
2015-06-03 09:51:06 +01:00
Peter Boyle
260011670e
Scaled Shamir and Scaled Shamir Zolotarev aliases for special cases of Mobius.
2015-06-03 09:51:06 +01:00
Peter Boyle
68e26140ee
Mobius Cayley form, Mobius Zolotarev operators. Pass Even Odd vs unprec test and hermiticity checks
...
in tests/Grid_any_evenodd.cc; will work on inversion tests shortly.
2015-06-03 09:36:26 +01:00
Peter Boyle
5916386242
Mobius Cayley form, Mobius Zolotarev operators. Pass Even Odd vs unprec test and hermiticity checks
...
in tests/Grid_any_evenodd.cc; will work on inversion tests shortly.
2015-06-03 09:36:26 +01:00
Peter Boyle
1fcacef239
Mobius Cayley form, Mobius Zolotarev operators. Pass Even Odd vs unprec test and hermiticity checks
...
in tests/Grid_any_evenodd.cc; will work on inversion tests shortly.
2015-06-03 09:36:26 +01:00
Peter Boyle
0bc004de7c
Domain wall fermions now invert; have the basis set up for
...
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded
Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitian
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
Peter Boyle
2583570e17
Domain wall fermions now invert; have the basis set up for
...
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded
Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitian
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
Peter Boyle
3845f267cb
Domain wall fermions now invert; have the basis set up for
...
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded
Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitian
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
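Checks (iv), (v) and (vi) in the commit message above can be illustrated on a toy problem. The following is only a sketch under loud assumptions: it uses a small dense random complex matrix in place of the domain wall operator, treats the index parity as the checkerboard, and uses plain Mdag M as a stand-in for the Schur-preconditioned MpcDagMpc. All function names are hypothetical and are not Grid's API.

```python
import random


def inner(a, b):
    """Inner product <a, b> = sum_i conj(a_i) * b_i."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def matvec(m, v):
    """Dense matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def dagger(m):
    """Conjugate transpose of a square dense matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


def block_matvec(m, v, row_parity, col_parity):
    """Apply only the block mapping col_parity sites to row_parity sites."""
    n = len(v)
    out = [0j] * n
    for i in range(n):
        if i % 2 == row_parity:
            out[i] = sum(m[i][j] * v[j] for j in range(n) if j % 2 == col_parity)
    return out


def random_complex(rng):
    return complex(rng.gauss(0, 1), rng.gauss(0, 1))


def run_checks(n=8, seed=1):
    rng = random.Random(seed)
    m = [[random_complex(rng) for _ in range(n)] for _ in range(n)]
    u = [random_complex(rng) for _ in range(n)]
    v = [random_complex(rng) for _ in range(n)]
    mdag = dagger(m)

    # (iv) Mee + Meo + Moe + Moo applied to v reproduces the full M v.
    full = matvec(m, v)
    parts = [block_matvec(m, v, r, c) for r in (0, 1) for c in (0, 1)]
    summed = [sum(p[i] for p in parts) for i in range(n)]
    eo_err = max(abs(a - b) for a, b in zip(full, summed))

    # (vi) Mdag is the adjoint of M: <u, M v> == <Mdag u, v>.
    adj_err = abs(inner(u, matvec(m, v)) - inner(matvec(mdag, u), v))

    # (v) A = Mdag M is hermitian: <u, A v> == conj(<v, A u>).
    a_v = matvec(mdag, matvec(m, v))
    a_u = matvec(mdag, matvec(m, u))
    herm_err = abs(inner(u, a_v) - inner(v, a_u).conjugate())

    return eo_err, adj_err, herm_err
```

Calling `run_checks()` returns the three discrepancies, which should sit at floating-point rounding level. Using random ("stochastic") vectors, as the commit does, means the adjoint and hermiticity checks probe the whole operator rather than a few hand-picked entries.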
Peter Boyle
66d997e031
Large scale change to support 5d fermion formulations.
...
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
a75b6f6e78
Large scale change to support 5d fermion formulations.
...
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
5644ab1e19
Large scale change to support 5d fermion formulations.
...
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
6ef0096dc9
Strip out the dslash kernel implementation
2015-05-26 19:55:18 +01:00
Peter Boyle
5e72e4c0d9
Strip out the dslash kernel implementation
2015-05-26 19:55:18 +01:00
Peter Boyle
bfb1cd36e2
Strip out the dslash kernel implementation
2015-05-26 19:55:18 +01:00
Peter Boyle
20100d0a40
Hand unrolled version of dslash in a separate class.
...
Useful to compare; raises Intel compiler from 9 GFlop/s to 17.5 GFlop/s
on an Ivy Bridge core. Raises Clang from 14.5 to 17.5 GFlop/s.
2015-05-26 19:54:03 +01:00
Peter Boyle
a32ac287bb
Hand unrolled version of dslash in a separate class.
...
Useful to compare; raises Intel compiler from 9 GFlop/s to 17.5 GFlop/s
on an Ivy Bridge core. Raises Clang from 14.5 to 17.5 GFlop/s.
2015-05-26 19:54:03 +01:00
Peter Boyle
840754dd42
Hand unrolled version of dslash in a separate class.
...
Useful to compare; raises Intel compiler from 9 GFlop/s to 17.5 GFlop/s
on an Ivy Bridge core. Raises Clang from 14.5 to 17.5 GFlop/s.
2015-05-26 19:54:03 +01:00
Peter Boyle
201a110c51
Better EO support letting Schur solver work
2015-05-25 13:46:28 +01:00
Peter Boyle
1a9841a0f1
Better EO support letting Schur solver work
2015-05-25 13:46:28 +01:00
Peter Boyle
ea3240ad55
Better EO support letting Schur solver work
2015-05-25 13:46:28 +01:00
Peter Boyle
2d30e82dcb
Improving even odd sector; lot of work and thought required cleaning this
2015-05-23 09:34:16 +01:00
Peter Boyle
65f2e6b269
Improving even odd sector; lot of work and thought required cleaning this
2015-05-23 09:34:16 +01:00
Peter Boyle
64fcbd0387
Improving even odd sector; lot of work and thought required cleaning this
2015-05-23 09:34:16 +01:00
Peter Boyle
2d8b5a8191
Optimisation...
2015-05-19 15:50:47 +01:00
Peter Boyle
46ab8edf30
Optimisation...
2015-05-19 15:50:47 +01:00
Peter Boyle
8220794c44
Optimisation...
2015-05-19 15:50:47 +01:00
Peter Boyle
a6e1ea216d
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
...
not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixing and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
2015-05-19 13:57:35 +01:00
Peter Boyle
ffc00caea3
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
...
not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixing and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
2015-05-19 13:57:35 +01:00
Peter Boyle
4dba8522a1
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
...
not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixing and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
2015-05-19 13:57:35 +01:00
neo
6d2accba7b
Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.
2015-05-18 16:48:14 +09:00
neo
cee363e28c
Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.
2015-05-18 16:48:14 +09:00
neo
b4cd37276b
Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.
2015-05-18 16:48:14 +09:00
Peter Boyle
1887c77498
Getting closer to having a wilson solver... introducing a first and untested
...
cut at Conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
Mike Clark, Tony Kennedy and my BFM package respectively since we know we will
need these. I wanted the structure of
algorithms/approx
algorithms/iterative
etc.. to start taking shape.
2015-05-18 07:47:05 +01:00
Peter Boyle
d0e4673a3f
Getting closer to having a wilson solver... introducing a first and untested
...
cut at Conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
Mike Clark, Tony Kennedy and my BFM package respectively since we know we will
need these. I wanted the structure of
algorithms/approx
algorithms/iterative
etc.. to start taking shape.
2015-05-18 07:47:05 +01:00
Peter Boyle
11cb3e9a01
Getting closer to having a wilson solver... introducing a first and untested
...
cut at Conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
Mike Clark, Tony Kennedy and my BFM package respectively since we know we will
need these. I wanted the structure of
algorithms/approx
algorithms/iterative
etc.. to start taking shape.
2015-05-18 07:47:05 +01:00
Peter Boyle
e841395dfd
Updating preparing for solvers etc..
2015-05-16 23:35:08 +01:00
Peter Boyle
dc6b6bdc96
Updating preparing for solvers etc..
2015-05-16 23:35:08 +01:00
Peter Boyle
bf7ab0da7a
Updating preparing for solvers etc..
2015-05-16 23:35:08 +01:00
Peter Boyle
e3b61bdfce
Forces inlining upon icpc
2015-05-15 11:43:49 +01:00
Peter Boyle
0e7945fe54
Forces inlining upon icpc
2015-05-15 11:43:49 +01:00
Peter Boyle
a0d041b522
Forces inlining upon icpc
2015-05-15 11:43:49 +01:00
Peter Boyle
7f3ae64a31
OMP dslash working
2015-05-13 10:59:22 +01:00
Peter Boyle
0097b81778
OMP dslash working
2015-05-13 10:59:22 +01:00
Peter Boyle
e179828662
OMP dslash working
2015-05-13 10:59:22 +01:00
Peter Boyle
b4a570477c
I have made the Cshift work successfully with open mp threading in
...
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
541d52ab97
I have made the Cshift work successfully with open mp threading in
...
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
48f425d31c
I have made the Cshift work successfully with open mp threading in
...
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
65c91eae64
Threading support rework.
...
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
Peter Boyle
c6baa3e657
Threading support rework.
...
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
Peter Boyle
6103c29ee3
Threading support rework.
...
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
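The "deterministic thread reduction" mentioned in the commit above can be sketched as follows. This is an illustrative assumption about the general technique, not the BFM or Grid implementation: each thread sums a fixed contiguous slice into its own slot, and the slots are then combined in thread-index order, so the floating-point result does not depend on which thread finishes first. The name `deterministic_sum` is hypothetical.

```python
import threading


def deterministic_sum(values, nthreads=4):
    """Sum values with threads so the result is bitwise repeatable."""
    n = len(values)
    partial = [0.0] * nthreads  # one slot per thread, indexed by thread id

    def work(tid):
        # Fixed slice boundaries: they depend only on tid, n and nthreads.
        lo = (n * tid) // nthreads
        hi = (n * (tid + 1)) // nthreads
        acc = 0.0
        for x in values[lo:hi]:
            acc += x
        partial[tid] = acc

    threads = [threading.Thread(target=work, args=(t,)) for t in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Combine the partial sums in a fixed order, never in completion order.
    total = 0.0
    for p in partial:
        total += p
    return total
```

For a fixed thread count, repeated calls give identical bits. Changing `nthreads` changes the slice boundaries and may change the last few bits of the result, which is the usual trade-off of this scheme.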
paboyle
fa5779537c
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
paboyle
b42453d1fd
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
paboyle
379943abf5
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
Peter Boyle
242e447bc5
Lots of changes required to compile for MIC under ICPC
2015-05-10 23:29:21 +01:00
Peter Boyle
2203c6e597
Lots of changes required to compile for MIC under ICPC
2015-05-10 23:29:21 +01:00
Peter Boyle
5555a852be
Lots of changes required to compile for MIC under ICPC
2015-05-10 23:29:21 +01:00
Peter Boyle
352bccf6ca
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
Peter Boyle
4da2c2ea00
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
Peter Boyle
48b9692845
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
Peter Boyle
133493dc79
Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.
...
This is a short term hack while I benchmark.
2015-05-10 15:25:23 +01:00
Peter Boyle
dc7132af71
Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.
...
This is a short term hack while I benchmark.
2015-05-10 15:25:23 +01:00
Peter Boyle
02ae26d091
Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.
...
This is a short term hack while I benchmark.
2015-05-10 15:25:23 +01:00
Peter Boyle
5415180676
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
Peter Boyle
52403d587c
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
Peter Boyle
55ccb8ccf4
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
Peter Boyle
7b0dd6c5d6
Cleaned up for Linux
2015-05-05 22:09:22 +01:00
Peter Boyle
cdd5cdeda2
Cleaned up for Linux
2015-05-05 22:09:22 +01:00
Peter Boyle
35d949cc17
Cleaned up for Linux
2015-05-05 22:09:22 +01:00
Peter Boyle
c0ead94791
Integrated Lebesgue code and been playing with alternate implementations of the wilson dop without
...
any particular success in increasing the performance.
2015-04-30 16:39:06 +01:00
Peter Boyle
a98c01c86a
Integrated Lebesgue code and been playing with alternate implementations of the wilson dop without
...
any particular success in increasing the performance.
2015-04-30 16:39:06 +01:00
Peter Boyle
dcc23faa4a
Fixed the stencil sector and Wilson now agrees between stencil based implementation
...
and the cshift based implementation. Managed to reduce the volume of code in this
sector a little, but consolidation would be good, perhaps taking common
logic out into simple helper functions
2015-04-29 06:23:56 +01:00
Peter Boyle
c72db6c6f6
Fixed the stencil sector and Wilson now agrees between stencil based implementation
...
and the cshift based implementation. Managed to reduce the volume of code in this
sector a little, but consolidation would be good, perhaps taking common
logic out into simple helper functions
2015-04-29 06:23:56 +01:00
Peter Boyle
b0485894b3
Shaken out stencil to the point where I think wilson dslash is correct.
...
Need to audit code carefully, consolidate between stencil and cshift,
and then benchmark and optimise.
2015-04-28 08:11:59 +01:00
Peter Boyle
25d523c0f4
Shaken out stencil to the point where I think wilson dslash is correct.
...
Need to audit code carefully, consolidate between stencil and cshift,
and then benchmark and optimise.
2015-04-28 08:11:59 +01:00
Peter Boyle
0b7d389258
Reworking CSHIFT and Stencil. Implementing Wilson and discovered rework is required
2015-04-27 13:45:07 +01:00
Peter Boyle
f159495a9d
Reworking CSHIFT and Stencil. Implementing Wilson and discovered rework is required
2015-04-27 13:45:07 +01:00
Peter Boyle
35cfef2129
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00
Peter Boyle
94f728bee4
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00
Peter Boyle
c678f2d255
Starting the implementation of wilson; incomplete and committing non-functional code which
...
is not yet included from elsewhere or linked to the build system.
2015-04-25 14:33:02 +01:00
Peter Boyle
51f0da7b93
Starting the implementation of wilson; incomplete and committing non-functional code which
...
is not yet included from elsewhere or linked to the build system.
2015-04-25 14:33:02 +01:00
Peter Boyle
2d8cf9e456
Added two spinor functionality required to support the Wilson hopping term.
2015-04-25 12:54:06 +01:00
Peter Boyle
c5fa18eb20
Added two spinor functionality required to support the Wilson hopping term.
2015-04-25 12:54:06 +01:00
Peter Boyle
fc32450360
Improved the gamma quite a bit.
...
Serial rng's which are set on node zero and broadcast
2015-04-24 20:21:40 +01:00
Peter Boyle
9ec3529864
Improved the gamma quite a bit.
...
Serial rng's which are set on node zero and broadcast
2015-04-24 20:21:40 +01:00
Peter Boyle
2a67214f9d
static names and enum list
2015-04-24 19:12:14 +01:00
Peter Boyle
42eac283e2
static names and enum list
2015-04-24 19:12:14 +01:00
Peter Boyle
71d5927a66
Vectors now too, and right multiply of matrix with gamma
2015-04-24 19:08:29 +01:00
Peter Boyle
38598190c3
Vectors now too, and right multiply of matrix with gamma
2015-04-24 19:08:29 +01:00
Peter Boyle
74432432b6
Moved code from summation into transfer and reduction
2015-04-24 18:40:44 +01:00
Peter Boyle
128ad0999f
Moved code from summation into transfer and reduction
2015-04-24 18:40:44 +01:00
Peter Boyle
b8eef54fa7
First implementation of Dirac matrices as a Gamma class.
2015-04-24 18:20:03 +01:00
Peter Boyle
d707c4e0a3
First implementation of Dirac matrices as a Gamma class.
2015-04-24 18:20:03 +01:00
Peter Boyle
afe6c4f64f
move
2015-04-23 20:41:22 +01:00
Peter Boyle
898f64cdd7
move
2015-04-23 20:41:22 +01:00
Peter Boyle
1851327d19
Got the NERSC IO working and fixed a bug in cshift.
2015-04-22 22:46:48 +01:00
Peter Boyle
b32c14b433
Got the NERSC IO working and fixed a bug in cshift.
2015-04-22 22:46:48 +01:00
Peter Boyle
aee6669d0b
Build reorg with which I am a bit happier
2015-04-18 21:22:50 +01:00
Peter Boyle
e5a25dfcb1
Build reorg with which I am a bit happier
2015-04-18 21:22:50 +01:00