Azusa Yamaguchi
8f87950dc1
FIx miistake
2015-06-01 12:26:20 +01:00
Peter Boyle
20100d0a40
Hand unrolled version of dslash in a separate class.
...
Useful to compare; raises Intel compiler from 9GFlop/s to 17.5 Gflops.
on ivybridge core. Raises Clang form 14.5 to 17.5
2015-05-26 19:54:03 +01:00
Peter Boyle
f6cade41b4
Better checkerboard tracking.
2015-05-25 13:45:08 +01:00
Peter Boyle
e0cc5ba920
Streaming store option ifdef
2015-05-21 06:47:05 +01:00
Peter Boyle
a6e1ea216d
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
...
not even SU(3) for now) gauge field. Convergence history is correctly indepdendent of decomposition
on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixed and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
2015-05-19 13:57:35 +01:00
Peter Boyle
5f8b82b90c
Optimisation and syntax pretty
2015-05-16 04:36:22 +01:00
Peter Boyle
e8efa6320e
Parallel for replace
2015-05-15 11:48:04 +01:00
Peter Boyle
d388b831b4
cout IO for all types
2015-05-13 09:24:10 +01:00
Peter Boyle
52174da232
Enhanced SIMD interfacing
2015-05-12 20:41:44 +01:00
Peter Boyle
65c91eae64
Threading support rework.
...
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
Peter Boyle
8b765be2b1
Moving some things around for pretty
2015-05-11 19:09:49 +01:00