Peter Boyle
840754dd42
Hand-unrolled version of dslash in a separate class.
Useful to compare; raises Intel compiler from 9 GFlop/s to 17.5 GFlop/s
on an Ivy Bridge core. Raises Clang from 14.5 to 17.5 GFlop/s.
2015-05-26 19:54:03 +01:00
Peter Boyle
489b1b9633
Schur complement based red-black inversion working
2015-05-25 13:47:12 +01:00
azusayamaguchi
2d2da8364f
Merge branch 'master' of https://github.com/paboyle/Grid
2015-05-19 14:55:26 +01:00
azusayamaguchi
91f29d4a68
Add messages to get the number of threads for OpenMP
2015-05-19 14:54:42 +01:00
Peter Boyle
4dba8522a1
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
on 1, 2, 4, 8, 16 MPI tasks.
Found a couple of SIMD bugs which required fixing, and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe, Mooee Schur-type operators in the Wilson dop.
2015-05-19 13:57:35 +01:00
Peter Boyle
ed8e3b676f
Remove debug masking
2015-05-15 11:51:15 +01:00
Peter Boyle
e179828662
OMP dslash working
2015-05-13 10:59:22 +01:00
Peter Boyle
48f425d31c
I have made the Cshift work successfully with OpenMP threading in
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
6cec662ac5
Enhanced SIMD interfacing
2015-05-12 20:41:44 +01:00
Peter Boyle
22d384b07d
Adding a better controlled threading class, preparing to
force in deterministic reduction.
2015-05-11 18:59:03 +01:00
Peter Boyle
f5dcca7b1b
Got command line args working
2015-05-11 14:36:48 +01:00
paboyle
379943abf5
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
Peter Boyle
55ccb8ccf4
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
Peter Boyle
f663be2a6c
Added a comms benchmark
2015-05-02 23:42:30 +01:00
Peter Boyle
4a1d4f1b3c
Starting a benchmarking sub dir
2015-05-02 17:52:36 +01:00