1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-10-24 17:54:47 +01:00
Commit Graph

205 Commits

Author SHA1 Message Date
Peter Boyle
b3d70a3bb2 Ncall change 2015-11-04 09:55:21 +00:00
Peter Boyle
c26220e9ab EO benchmark as well as non-eo 2015-11-04 09:54:48 +00:00
Peter Boyle
3726fe7481 Bigger vec length 2015-10-09 00:42:54 +02:00
paboyle
af89c40462 Better timing tweaks to give sensible results on 24 threads on Edison dual ivybridge nodes. 2015-09-28 16:09:04 -07:00
Peter Boyle
9f4f65cb46 Added a decoupled memory system benchmark to remove thread synch overhead 2015-09-26 18:23:57 -07:00
Peter Boyle
9183380946 Gparity test added; partial implementation -- this is Chris K's doubled lattice only
and have to regress this with the 2 flavour implementation.
2015-08-12 09:49:33 +01:00
Peter Boyle
84a66476ab Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BC's and such like
allowing for example G-parity which is important for K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle
d1afebf71e Sizable improvement in multigrid for unsquared.
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01

Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
Peter Boyle
31a0c8d783 Merge branch 'master' of https://github.com/paboyle/Grid 2015-07-01 22:51:04 +01:00
paboyle
39271b02dd Modified memory bw test to display word size 2015-07-01 22:46:53 +01:00
Peter Boyle
638d2cda11 Change the SIMD command correctly with precision = double vs. single and
connect the "Real" default precisoin to a configure flag.
Have RealF, RealD and Real types, where Real is compile target dependent single/double,
RealF is single and RealD is double etc..
2015-07-01 22:45:15 +01:00
Peter Boyle
9143f071d7 Merge branch 'master' of https://github.com/paboyle/Grid 2015-06-30 15:17:46 +01:00
Peter Boyle
8ad81bed32 big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). stared getting to
near the bleeding edge I guess
2015-06-30 15:01:44 +01:00
Peter Boyle
93916f400d Update Benchmark_comms.cc 2015-06-25 10:59:53 +01:00
Peter Boyle
f07a17ba2c Assist for generating file lists contained in Make.inc files for convenience when things are added 2015-06-03 13:07:00 +01:00
Peter Boyle
84b5c7217d CG test written and passes i.e. converges with small true residual
in RedBlack MpcDagMpc, Unprec MdagM and Schur red black solver for
each of.

DomainWallFermion
MobiusFermion
MobiusZolotarevFermion
ScaledShamirFermion
ScaledShamirZolotarevFermion
2015-06-03 10:54:03 +01:00
Peter Boyle
69f4d58381 Reorg; moving prec/unprec/schur CG for Wilson and DWF into tests as these are really tests and not benchmarks
(no performance reports, only convergence test).
2015-06-02 17:25:26 +01:00
Peter Boyle
3845f267cb Domain wall fermions now invert ; have the basis set up for
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx        Representation               Kernel.

All are done with space-time taking part in checkerboarding, Ls uncheckerboarded

Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i)  Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitan
vi) Mdag is the adjoint of M between stochastic vectors.

That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
Peter Boyle
5644ab1e19 Large scale change to support 5d fermion formulations.
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
59db857ad1 Integer wrap problem fixed. 2015-05-29 14:11:34 +01:00
Peter Boyle
67fa5691e5 Weak scale the benchmarks automatically. 2015-05-28 13:47:01 +01:00
Peter Boyle
840754dd42 Hand unrolled version of dslash in a separate class.
Useful to compare; raises Intel compiler from 9GFlop/s to 17.5 Gflops.
                   on ivybridge core. Raises Clang form 14.5 to 17.5
2015-05-26 19:54:03 +01:00
Peter Boyle
37721572e7 Makefile update 2015-05-25 14:43:08 +01:00
Peter Boyle
489b1b9633 Schur complement based red-black inversion working 2015-05-25 13:47:12 +01:00
Peter Boyle
613a73b1b6 Added 2015-05-23 09:36:01 +01:00
Peter Boyle
22bfbd0f8d adding two routines containing only a single operation so I can easily see the assembly dump 2015-05-21 06:37:46 +01:00
Peter Boyle
3a441c3e94 Minor change 2015-05-21 06:37:20 +01:00
Peter Boyle
d4ca8647dc useful to dump assembler 2015-05-21 06:36:47 +01:00
Peter Boyle
341096dce8 better comms benchmarking 2015-05-21 06:35:46 +01:00
Peter Boyle
d3931111fb Build a simple kernel to compare intel compiler and clang in simple environment 2015-05-19 21:29:40 +01:00
Peter Boyle
a21036e69a Reworking to keep intel compiler happy 2015-05-19 21:29:07 +01:00
azusayamaguchi
2d2da8364f Merge branch 'master' of https://github.com/paboyle/Grid 2015-05-19 14:55:26 +01:00
azusayamaguchi
91f29d4a68 Add messages to get the number of threads for openmp 2015-05-19 14:54:42 +01:00
Peter Boyle
4dba8522a1 Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
not even SU(3) for now) gauge field. Convergence history is correctly indepdendent of decomposition
on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixed and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
2015-05-19 13:57:35 +01:00
Peter Boyle
bf7ab0da7a Updating preparing for solvers etc.. 2015-05-16 23:35:08 +01:00
Peter Boyle
aff5254208 more digits 2015-05-16 04:33:40 +01:00
Peter Boyle
b4b70702fd Added su3 matrix benchmark. 2015-05-15 14:41:19 +01:00
Peter Boyle
331f832c34 Out of source compile now working 2015-05-15 12:21:40 +01:00
Peter Boyle
ed8e3b676f Remove debug masking 2015-05-15 11:51:15 +01:00
Peter Boyle
e179828662 OMP dslash working 2015-05-13 10:59:22 +01:00
Peter Boyle
48f425d31c I have made the Cshift work successfully with open mp threading in
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
6cec662ac5 Enhanced SIMD interfacing 2015-05-12 20:41:44 +01:00
Peter Boyle
22d384b07d Adding a better controlled threading class, preparing to
force in deterministic reduction.
2015-05-11 18:59:03 +01:00
Peter Boyle
f5dcca7b1b Got command line args working 2015-05-11 14:36:48 +01:00
paboyle
379943abf5 Command line args and a general clean up 2015-05-11 12:43:10 +01:00
Peter Boyle
48b9692845 Merge branch 'master' of https://github.com/paboyle/Grid
Conflicts:
	lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
Peter Boyle
b2e0f72a7e ET ready benchmark with bytes counted assuming loop interchange 2015-05-10 15:18:04 +01:00
Peter Boyle
55ccb8ccf4 Wilson perf improvements with Gauge prefetching 2015-05-06 06:37:21 +01:00
Peter Boyle
35d949cc17 Cleaned up for Linux 2015-05-05 22:09:22 +01:00
Peter Boyle
bf60764e4b Updated bandwidth test 2015-05-05 18:08:53 +01:00