5803933aea
First implementation of HDF5 serial IO writer, reader is still empty
2017-01-17 16:21:18 -08:00
paboyle
791cb050c8
Comms improvements
2016-11-01 11:35:43 +00:00
azusayamaguchi
b6a65059a2
Update to use shared memory to contain the stencil comms buffers
...
Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions
2016-10-24 17:30:43 +01:00
paboyle
cbcfea466f
MPI3
2016-10-20 16:57:14 +01:00
2e74520821
removed libtool use (BG/Q compatibility)
2016-09-16 15:25:49 +01:00
9e5b934d21
improved LAPACK configuration
2016-08-02 17:26:54 +01:00
e9f30cab2c
first working version for the new build system
2016-07-30 17:53:18 +01:00
paboyle
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
Peter Boyle
7f927a541c
Shmem related fixes for shmem compile
2016-02-11 07:37:39 -06:00
Peter Boyle
1d0df449e8
Reorganise of file naming
2015-06-03 12:47:05 +01:00
Peter Boyle
3845f267cb
Domain wall fermions now invert ; have the basis set up for
...
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded
Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitan
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
azusayamaguchi
f2c70804ca
Bug in Makefile.am fixed
2015-05-31 18:50:08 +01:00
Peter Boyle
5644ab1e19
Large scale change to support 5d fermion formulations.
...
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
62a7ca462f
Works now with Clang-avx, Clang-sse and ICPC-avx, ICPC-sse
2015-05-28 11:35:43 +01:00
Peter Boyle
48bb3ab4e7
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/Grid_simd.h
2015-05-26 20:04:08 +01:00
Peter Boyle
840754dd42
Hand unrolled version of dslash in a separate class.
...
Useful to compare; raises Intel compiler from 9GFlop/s to 17.5 Gflops.
on ivybridge core. Raises Clang form 14.5 to 17.5
2015-05-26 19:54:03 +01:00
neo
9e29ac6549
Completed implementation of new Grid_simd classes
...
Tested performance for SSE4, Ok.
AVX1/2, AVX512 yet untested
2015-05-22 17:33:15 +09:00
neo
74e91cd925
Partial implementation of the vector types SIMD
...
Implementing SSE4 now
A systematic series of tests must be written.
2015-05-19 17:21:17 +09:00
neo
baa382f055
Added check of mpfr and gmp at configure time
...
It generates automatically the linker flags or complains if not found.
2015-05-19 13:54:55 +09:00
Peter Boyle
11cb3e9a01
Getting closer to having a wilson solver... introducing a first and untested
...
cut at Conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
Mike Clark, Tony Kennedy and my BFM package respectively since we know we will
need these. I wanted the structure of
algorithms/approx
algorithms/iterative
etc.. to start taking shape.
2015-05-18 07:47:05 +01:00
Peter Boyle
6103c29ee3
Threading support rework.
...
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
paboyle
379943abf5
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
Peter Boyle
94f728bee4
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00
Peter Boyle
38598190c3
Vectors now too and right multiple of matrix with gamma
2015-04-24 19:08:29 +01:00
Peter Boyle
6bd11d920a
Finishing the reorg
2015-04-18 21:24:10 +01:00
Peter Boyle
25a8266638
More files, shorter each.
2015-04-18 20:45:00 +01:00
Peter Boyle
cffad66894
Reorganise to keep files smaller
2015-04-18 18:36:48 +01:00
Peter Boyle
c656164015
Reorg of build structure
2015-04-18 14:55:00 +01:00