Conclusion: c++11 distributions not thread safe and must us distinct dist as well as distinct engine
per site. Makes sense when you think of box muller. Also added a reset of dist on fill to ensure
repro across checkpoints.
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01
Substantial effort on timing and logging infrastructure
but non-identity matrix
l1 0 0 0 ....
0 l2 0 0 ....
0 0 l3 0 ...
. . .
. . .
. . .
And apply the multishift CG to it. Sum the poles and residues.
Insist that this be the same as the exactly taken square root
where l1,l2,l3 >= 0.
not even SU(3) for now) gauge field. Convergence history is correctly indepdendent of decomposition
on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixed and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
cut at Conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
Mike Clark, Tony Kennedy and my BFM package respectively since we know we will
need these. I wanted the structure of
algorithms/approx
algorithms/iterative
etc.. to start taking shape.