84b5c7217d
CG test written and passes, i.e. converges with a small true residual,
in the RedBlack MpcDagMpc, Unprec MdagM and Schur red-black solvers for
each of:
DomainWallFermion
MobiusFermion
MobiusZolotarevFermion
ScaledShamirFermion
ScaledShamirZolotarevFermion
2015-06-03 10:54:03 +01:00
69f4d58381
Reorg; moving prec/unprec/schur CG for Wilson and DWF into tests as these are really tests and not benchmarks
(no performance reports, only convergence test).
2015-06-02 17:25:26 +01:00
3845f267cb
Domain wall fermions now invert; have the basis set up for
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded.
So far have only tested the domain wall limit of Mobius, and even then only checked
that it:
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is Hermitian
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
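
The Hermiticity and adjoint checks in (v) and (vi) can be illustrated with a minimal sketch. This is not the Grid test code: it uses small dense matrices in plain Python, and the helper names (`inner`, `matvec`, `dagger`, `adjoint_defect`, `hermiticity_defect`) are hypothetical. The idea is that for random ("stochastic") vectors u and v, <u, M v> should equal <Mdag u, v>, and <u, MdagM v> should equal the conjugate of <v, MdagM u>.

```python
import random

def inner(x, y):
    """Inner product <x, y> = sum_i conj(x_i) * y_i."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dagger(A):
    """Conjugate transpose of a dense matrix."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def rand_vec(n, rng):
    """Random complex vector with Gaussian entries."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

def adjoint_defect(M, Mdag, u, v):
    """|<u, M v> - <Mdag u, v>|; near zero if Mdag is the adjoint of M."""
    return abs(inner(u, matvec(M, v)) - inner(matvec(Mdag, u), v))

def hermiticity_defect(M, Mdag, u, v):
    """|<u, H v> - conj(<v, H u>)| for H = Mdag M; near zero if H is Hermitian."""
    def H(x):
        return matvec(Mdag, matvec(M, x))
    return abs(inner(u, H(v)) - inner(v, H(u)).conjugate())

if __name__ == "__main__":
    rng = random.Random(2015)
    n = 6
    M = [rand_vec(n, rng) for _ in range(n)]
    u, v = rand_vec(n, rng), rand_vec(n, rng)
    print("adjoint defect:    ", adjoint_defect(M, dagger(M), u, v))
    print("hermiticity defect:", hermiticity_defect(M, dagger(M), u, v))
```

Both defects should sit at rounding-error level when the adjoint is implemented correctly; a missing complex conjugation shows up as an order-one defect.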
2015-06-02 16:57:12 +01:00
5644ab1e19
Large scale change to support 5d fermion formulations.
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
59db857ad1
Integer wrap problem fixed.
2015-05-29 14:11:34 +01:00
67fa5691e5
Weak scale the benchmarks automatically.
2015-05-28 13:47:01 +01:00
840754dd42
Hand unrolled version of dslash in a separate class.
Useful to compare; raises the Intel compiler from 9 GFlop/s to 17.5 GFlop/s
on an Ivy Bridge core. Raises Clang from 14.5 to 17.5 GFlop/s.
2015-05-26 19:54:03 +01:00
37721572e7
Makefile update
2015-05-25 14:43:08 +01:00
489b1b9633
Schur complement based red-black inversion working
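
As an illustration of what a Schur complement based red-black inversion does, here is a minimal sketch in plain Python on small dense blocks. It is not the Grid implementation. It assumes the even-even and odd-odd blocks are multiples of the identity (`mee`, `moo`), as for a Wilson-type mass term, and it solves the odd-site Schur system by direct Gaussian elimination rather than an iterative solver. All function names are hypothetical.

```python
import math

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(A)
    aug = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def schur_solve(mee, Meo, Moe, moo, b_e, b_o):
    """Solve [[mee*1, Meo], [Moe, moo*1]] (x_e, x_o) = (b_e, b_o)."""
    n_e, n_o = len(b_e), len(b_o)
    # Schur complement on odd sites: S = Moo - Moe Mee^-1 Meo
    S = [[(moo if i == j else 0.0)
          - sum(Moe[i][k] * Meo[k][j] for k in range(n_e)) / mee
          for j in range(n_o)] for i in range(n_o)]
    # Reduced source: b_o - Moe Mee^-1 b_e
    rhs = [b_o[i] - sum(Moe[i][k] * b_e[k] for k in range(n_e)) / mee
           for i in range(n_o)]
    x_o = solve_dense(S, rhs)
    # Reconstruct even sites: x_e = Mee^-1 (b_e - Meo x_o)
    x_e = [(b_e[k] - sum(Meo[k][j] * x_o[j] for j in range(n_o))) / mee
           for k in range(n_e)]
    return x_e, x_o

def true_residual(mee, Meo, Moe, moo, x_e, x_o, b_e, b_o):
    """|b - M x| evaluated with the full (unpreconditioned) matrix."""
    r_e = [b_e[k] - mee * x_e[k] - sum(Meo[k][j] * x_o[j] for j in range(len(x_o)))
           for k in range(len(x_e))]
    r_o = [b_o[i] - moo * x_o[i] - sum(Moe[i][k] * x_e[k] for k in range(len(x_e)))
           for i in range(len(x_o))]
    return math.sqrt(sum(abs(r) ** 2 for r in r_e + r_o))
```

Checking the residual against the full matrix, rather than the reduced system, is what confirms that the even-site reconstruction is correct as well as the odd-site solve.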
2015-05-25 13:47:12 +01:00
613a73b1b6
Added
2015-05-23 09:36:01 +01:00
22bfbd0f8d
adding two routines containing only a single operation so I can easily see the assembly dump
2015-05-21 06:37:46 +01:00
3a441c3e94
Minor change
2015-05-21 06:37:20 +01:00
d4ca8647dc
useful to dump assembler
2015-05-21 06:36:47 +01:00
341096dce8
better comms benchmarking
2015-05-21 06:35:46 +01:00
d3931111fb
Build a simple kernel to compare intel compiler and clang in simple environment
2015-05-19 21:29:40 +01:00
a21036e69a
Reworking to keep intel compiler happy
2015-05-19 21:29:07 +01:00
2d2da8364f
Merge branch 'master' of https://github.com/paboyle/Grid
2015-05-19 14:55:26 +01:00
91f29d4a68
Add messages to get the number of threads for openmp
2015-05-19 14:54:42 +01:00
4dba8522a1
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
on 1, 2, 4, 8, 16 MPI tasks.
Found a couple of SIMD bugs which required fixing, and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe, Mooee Schur-type stuff in the Wilson dop.
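
For orientation, an unpreconditioned conjugate gradient on the Hermitian operator MdagM can be sketched as below. This is an illustrative sketch in plain Python on a small dense matrix, not the Grid solver. The function name `cg_mdagm`, its parameters and its return convention are assumptions made for the example. It solves MdagM x = b and reports the true residual |b - MdagM x| / |b|, recomputed from the final solution rather than taken from the iterated residual.

```python
import math

def inner(x, y):
    """Inner product <x, y> = sum_i conj(x_i) * y_i."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dagger(A):
    """Conjugate transpose of a dense matrix."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def cg_mdagm(M, b, tol=1e-10, max_iter=200):
    """Conjugate gradient for (Mdag M) x = b, starting from x = 0.

    Returns (x, iterations, relative true residual). Assumes b is nonzero.
    """
    Mdag = dagger(M)

    def apply_A(v):
        return matvec(Mdag, matvec(M, v))

    b2 = inner(b, b).real
    x = [0j] * len(b)
    r = list(b)          # residual for the zero initial guess
    p = list(b)          # first search direction
    rr = b2
    iters = 0
    for iters in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rr / inner(p, Ap).real
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = inner(r, r).real
        if rr_new <= tol * tol * b2:
            break
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    # True residual: recompute b - A x instead of trusting the iterated r
    true_r = [bi - ai for bi, ai in zip(b, apply_A(x))]
    return x, iters, math.sqrt(inner(true_r, true_r).real / b2)
```

Because the iterated residual can drift from the true one through rounding, comparing the two at the end is a cheap consistency check on the solver.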
2015-05-19 13:57:35 +01:00
bf7ab0da7a
Updating preparing for solvers etc..
2015-05-16 23:35:08 +01:00
aff5254208
more digits
2015-05-16 04:33:40 +01:00
b4b70702fd
Added su3 matrix benchmark.
2015-05-15 14:41:19 +01:00
331f832c34
Out of source compile now working
2015-05-15 12:21:40 +01:00
ed8e3b676f
Remove debug masking
2015-05-15 11:51:15 +01:00
e179828662
OMP dslash working
2015-05-13 10:59:22 +01:00
48f425d31c
I have made the Cshift work successfully with OpenMP threading in
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
6cec662ac5
Enhanced SIMD interfacing
2015-05-12 20:41:44 +01:00
22d384b07d
Adding a better controlled threading class, preparing to
force in deterministic reduction.
2015-05-11 18:59:03 +01:00
f5dcca7b1b
Got command line args working
2015-05-11 14:36:48 +01:00
379943abf5
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
48b9692845
Merge branch 'master' of https://github.com/paboyle/Grid
Conflicts:
lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
b2e0f72a7e
ET ready benchmark with bytes counted assuming loop interchange
2015-05-10 15:18:04 +01:00
55ccb8ccf4
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
35d949cc17
Cleaned up for Linux
2015-05-05 22:09:22 +01:00
bf60764e4b
Updated bandwidth test
2015-05-05 18:08:53 +01:00
890b13dd5b
Added a makefile
2015-05-05 17:56:42 +01:00
193860dbc8
Comms and memory benchmarks added
2015-05-03 09:44:47 +01:00
99a1ff423d
Added a comms benchmark
2015-05-02 23:51:43 +01:00
f663be2a6c
Added a comms benchmark
2015-05-02 23:42:30 +01:00
4a1d4f1b3c
Starting a benchmarking sub dir
2015-05-02 17:52:36 +01:00