Peter Boyle
a32ac287bb
Hand unrolled version of dslash in a separate class.
Useful for comparison; raises the Intel compiler from 9 GFlop/s to 17.5 GFlop/s
on an Ivy Bridge core. Raises Clang from 14.5 to 17.5 GFlop/s.
2015-05-26 19:54:03 +01:00
Peter Boyle
3358a77c7a
Better checkerboard tracking.
2015-05-25 13:45:08 +01:00
Peter Boyle
d8061afe24
Streaming store option ifdef
2015-05-21 06:47:05 +01:00
Peter Boyle
ffc00caea3
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
on 1, 2, 4, 8, 16 MPI tasks.
Found a couple of SIMD bugs which required fixing, and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe, Mooee Schur-type stuff in the Wilson dop.
2015-05-19 13:57:35 +01:00
Peter Boyle
9f0e990b40
Optimisation and syntax pretty
2015-05-16 04:36:22 +01:00
Peter Boyle
8d77d758c3
Parallel for replace
2015-05-15 11:48:04 +01:00
Peter Boyle
add4495a4a
cout IO for all types
2015-05-13 09:24:10 +01:00
Peter Boyle
556befaaaa
Enhanced SIMD interfacing
2015-05-12 20:41:44 +01:00
Peter Boyle
c6baa3e657
Threading support rework.
Placed parallel pragmas as macros; implemented deterministic thread reduction in the style of
BFM.
2015-05-12 07:51:41 +01:00
Peter Boyle
6e6843ac69
Moving some things around for pretty
2015-05-11 19:09:49 +01:00