Peter Boyle
e618c609fe
Merge branch 'master' of https://github.com/paboyle/Grid
2015-06-30 15:17:46 +01:00
Peter Boyle
8581c05ab3
big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). Started getting too
near the bleeding edge I guess
2015-06-30 15:01:44 +01:00
Peter Boyle
d6c79bbadb
Update Benchmark_comms.cc
2015-06-25 10:59:53 +01:00
Peter Boyle
e68d087010
Assist for generating file lists contained in Make.inc files for convenience when things are added
2015-06-03 13:07:00 +01:00
Peter Boyle
2b083ca987
CG test written and passes i.e. converges with small true residual
in RedBlack MpcDagMpc, Unprec MdagM and Schur red black solver for
each of:
DomainWallFermion
MobiusFermion
MobiusZolotarevFermion
ScaledShamirFermion
ScaledShamirZolotarevFermion
2015-06-03 10:54:03 +01:00
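To illustrate what "converges with small true residual" means in the entry above, here is a minimal C++ sketch. It is not code from the Grid repository. It runs plain conjugate gradient on a small dense symmetric positive-definite matrix, then recomputes |b - A x| / |b| from scratch instead of trusting the recursively updated residual. All names (`Vec`, `Mat`, `conjugateGradient`, `trueResidual`) are hypothetical, and the dense real matrix is an assumption standing in for the fermion operators listed in the commit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense, row-major storage (illustrative assumption)

// Euclidean inner product of two equally sized vectors.
double dot(const Vec &a, const Vec &b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
  return sum;
}

// y = A x for a dense matrix.
Vec matVec(const Mat &A, const Vec &x) {
  Vec y(A.size(), 0.0);
  for (std::size_t i = 0; i < A.size(); ++i)
    for (std::size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
  return y;
}

// Relative true residual |b - A x| / |b|, recomputed from x itself.
// Assumes b is not the zero vector.
double trueResidual(const Mat &A, const Vec &x, const Vec &b) {
  const Vec Ax = matVec(A, x);
  double num = 0.0;
  for (std::size_t i = 0; i < b.size(); ++i) {
    const double d = b[i] - Ax[i];
    num += d * d;
  }
  return std::sqrt(num / dot(b, b));
}

// Conjugate gradient for symmetric positive-definite A, starting from x = 0.
// Stops when the iterated residual satisfies |r| <= tol * |b|.
// Returns the number of iterations performed (at most maxIter).
int conjugateGradient(const Mat &A, const Vec &b, Vec &x, double tol, int maxIter) {
  x.assign(b.size(), 0.0);
  Vec r = b;  // residual for x = 0
  Vec p = b;  // first search direction
  double rr = dot(r, r);
  const double stop = tol * tol * dot(b, b);
  for (int k = 1; k <= maxIter; ++k) {
    const Vec Ap = matVec(A, p);
    const double alpha = rr / dot(p, Ap);
    for (std::size_t i = 0; i < x.size(); ++i) {
      x[i] += alpha * p[i];
      r[i] -= alpha * Ap[i];
    }
    const double rrNew = dot(r, r);
    if (rrNew <= stop) return k;
    const double beta = rrNew / rr;
    for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
    rr = rrNew;
  }
  return maxIter;
}
```

A caller would run `conjugateGradient` and then check `trueResidual` separately. The iterated residual can drift away from the true one in finite precision, which is presumably why the test reports the true residual.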
Peter Boyle
494d2b8b61
Reorg; moving prec/unprec/schur CG for Wilson and DWF into tests as these are really tests and not benchmarks
(no performance reports, only convergence test).
2015-06-02 17:25:26 +01:00
Peter Boyle
0bc004de7c
Domain wall fermions now invert ; have the basis set up for
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded
Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is Hermitian
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
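Checks (v) and (vi) in the entry above can be sketched in C++ as follows. This is an illustration under stated assumptions, not the test code from the repository. A small dense random complex matrix stands in for the operator M, Gaussian random vectors play the role of the stochastic vectors, and every name (`innerProduct`, `apply`, `applyDag`, `adjointDefect`, `hermiticityDefect`) is hypothetical. Check (vi) compares <u, M v> with conj(<v, Mdag u>). Check (v) compares <u, MdagM v> with conj(<v, MdagM u>).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <random>
#include <vector>

using Complex = std::complex<double>;
using Vector = std::vector<Complex>;
using Matrix = std::vector<Vector>;  // dense, row-major stand-in for an operator

// Inner product <a, b> = sum_i conj(a_i) * b_i.
Complex innerProduct(const Vector &a, const Vector &b) {
  assert(a.size() == b.size());
  Complex sum(0.0, 0.0);
  for (std::size_t i = 0; i < a.size(); ++i) sum += std::conj(a[i]) * b[i];
  return sum;
}

// y = M x
Vector apply(const Matrix &M, const Vector &x) {
  Vector y(M.size(), Complex(0.0, 0.0));
  for (std::size_t i = 0; i < M.size(); ++i)
    for (std::size_t j = 0; j < x.size(); ++j) y[i] += M[i][j] * x[j];
  return y;
}

// y = M^dagger x, written independently of apply() so the check is meaningful.
Vector applyDag(const Matrix &M, const Vector &x) {
  assert(M.size() == x.size());
  const std::size_t ncol = M.empty() ? 0 : M[0].size();
  Vector y(ncol, Complex(0.0, 0.0));
  for (std::size_t i = 0; i < M.size(); ++i)
    for (std::size_t j = 0; j < ncol; ++j) y[j] += std::conj(M[i][j]) * x[i];
  return y;
}

// Vector of independent complex Gaussian entries.
Vector randomVector(std::size_t n, std::mt19937 &rng) {
  std::normal_distribution<double> gauss(0.0, 1.0);
  Vector v(n);
  for (auto &z : v) {
    const double re = gauss(rng);
    const double im = gauss(rng);
    z = Complex(re, im);
  }
  return v;
}

// n x n matrix of independent complex Gaussian entries.
Matrix randomMatrix(std::size_t n, std::mt19937 &rng) {
  Matrix M(n);
  for (auto &row : M) row = randomVector(n, rng);
  return M;
}

// Check (vi): | <u, M v> - conj(<v, Mdag u>) |, zero up to rounding.
double adjointDefect(const Matrix &M, const Vector &u, const Vector &v) {
  const Complex lhs = innerProduct(u, apply(M, v));
  const Complex rhs = std::conj(innerProduct(v, applyDag(M, u)));
  return std::abs(lhs - rhs);
}

// Check (v): | <u, MdagM v> - conj(<v, MdagM u>) |, zero up to rounding.
double hermiticityDefect(const Matrix &M, const Vector &u, const Vector &v) {
  const Complex lhs = innerProduct(u, applyDag(M, apply(M, v)));
  const Complex rhs = std::conj(innerProduct(v, applyDag(M, apply(M, u))));
  return std::abs(lhs - rhs);
}
```

Both defects should sit at rounding level for any u and v. A value of order one would indicate that the dagger routine is not the adjoint of the forward routine.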
Peter Boyle
66d997e031
Large scale change to support 5d fermion formulations.
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
8c357dca8b
Integer wrap problem fixed.
2015-05-29 14:11:34 +01:00
Peter Boyle
62dccb3247
Weak scale the benchmarks automatically.
2015-05-28 13:47:01 +01:00
Peter Boyle
20100d0a40
Hand unrolled version of dslash in a separate class.
Useful to compare; raises Intel compiler from 9 GFlop/s to 17.5 GFlop/s
on an Ivy Bridge core. Raises Clang from 14.5 to 17.5
2015-05-26 19:54:03 +01:00
Peter Boyle
c2ffb1a098
Makefile update
2015-05-25 14:43:08 +01:00
Peter Boyle
d7f5172860
Schur complement based red-black inversion working
2015-05-25 13:47:12 +01:00
Peter Boyle
31a40fa37f
Added
2015-05-23 09:36:01 +01:00
Peter Boyle
ac0941be9a
adding two routines containing only a single operation so I can easily see the assembly dump
2015-05-21 06:37:46 +01:00
Peter Boyle
fb159e1cff
Minor change
2015-05-21 06:37:20 +01:00
Peter Boyle
8bc0033326
useful to dump assembler
2015-05-21 06:36:47 +01:00
Peter Boyle
046485a7bb
better comms benchmarking
2015-05-21 06:35:46 +01:00
Peter Boyle
91ed085ca4
Build a simple kernel to compare intel compiler and clang in simple environment
2015-05-19 21:29:40 +01:00
Peter Boyle
efc0d1e0b9
Reworking to keep intel compiler happy
2015-05-19 21:29:07 +01:00
azusayamaguchi
ee8cf77071
Merge branch 'master' of https://github.com/paboyle/Grid
2015-05-19 14:55:26 +01:00
azusayamaguchi
c8c74e591f
Add messages to get the number of threads for openmp
2015-05-19 14:54:42 +01:00
Peter Boyle
a6e1ea216d
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixing, and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
2015-05-19 13:57:35 +01:00
Peter Boyle
e841395dfd
Updating, preparing for solvers etc.
2015-05-16 23:35:08 +01:00
Peter Boyle
25bfa7e830
more digits
2015-05-16 04:33:40 +01:00
Peter Boyle
87bc17831d
Added su3 matrix benchmark.
2015-05-15 14:41:19 +01:00
Peter Boyle
c99922b591
Out of source compile now working
2015-05-15 12:21:40 +01:00
Peter Boyle
bc3889ffa1
Remove debug masking
2015-05-15 11:51:15 +01:00
Peter Boyle
7f3ae64a31
OMP dslash working
2015-05-13 10:59:22 +01:00
Peter Boyle
b4a570477c
I have made the Cshift work successfully with open mp threading in
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
52174da232
Enhanced SIMD interfacing
2015-05-12 20:41:44 +01:00
Peter Boyle
a411b48a91
Adding a better controlled threading class, preparing to
force in deterministic reduction.
2015-05-11 18:59:03 +01:00
Peter Boyle
ebcb87abe1
Got command line args working
2015-05-11 14:36:48 +01:00
paboyle
fa5779537c
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
Peter Boyle
352bccf6ca
Merge branch 'master' of https://github.com/paboyle/Grid
Conflicts:
lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
Peter Boyle
a115f3b086
ET ready benchmark with bytes counted assuming loop interchange
2015-05-10 15:18:04 +01:00
Peter Boyle
5415180676
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
Peter Boyle
7b0dd6c5d6
Cleaned up for Linux
2015-05-05 22:09:22 +01:00
Peter Boyle
b720222d98
Updated bandwidth test
2015-05-05 18:08:53 +01:00
Peter Boyle
0e8415de1b
Added a makefile
2015-05-05 17:56:42 +01:00
Peter Boyle
9d93d1e6d4
Comms and memory benchmarks added
2015-05-03 09:44:47 +01:00
Peter Boyle
253362f978
Added a comms benchmark
2015-05-02 23:51:43 +01:00
Peter Boyle
ea52562527
Added a comms benchmark
2015-05-02 23:42:30 +01:00
Peter Boyle
6a39089a43
Starting a benchmarking sub dir
2015-05-02 17:52:36 +01:00