54e94360ad
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
2017-06-24 23:10:24 +01:00
6eea9e4da7
HDF5 types static initialisation is mysteriously buggy on BG/Q, changing strategy
2017-01-19 18:02:53 -08:00
5803933aea
First implementation of HDF5 serial IO writer, reader is still empty
2017-01-17 16:21:18 -08:00
791cb050c8
Comms improvements
2016-11-01 11:35:43 +00:00
b6a65059a2
Update to use shared memory to contain the stencil comms buffers
...
Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions
2016-10-24 17:30:43 +01:00
cbcfea466f
MPI3
2016-10-20 16:57:14 +01:00
2e74520821
removed libtool use (BG/Q compatibility)
2016-09-16 15:25:49 +01:00
9e5b934d21
improved LAPACK configuration
2016-08-02 17:26:54 +01:00
e9f30cab2c
first working version for the new build system
2016-07-30 17:53:18 +01:00
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
7f927a541c
Shmem related fixes for shmem compile
2016-02-11 07:37:39 -06:00
1d0df449e8
Reorganise of file naming
2015-06-03 12:47:05 +01:00
3845f267cb
Domain wall fermions now invert ; have the basis set up for
...
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded
Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitan
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
f2c70804ca
Bug in Makefile.am fixed
2015-05-31 18:50:08 +01:00
5644ab1e19
Large scale change to support 5d fermion formulations.
...
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
62a7ca462f
Works now with Clang-avx, Clang-sse and ICPC-avx, ICPC-sse
2015-05-28 11:35:43 +01:00
48bb3ab4e7
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/Grid_simd.h
2015-05-26 20:04:08 +01:00
840754dd42
Hand unrolled version of dslash in a separate class.
...
Useful to compare; raises Intel compiler from 9GFlop/s to 17.5 Gflops.
on ivybridge core. Raises Clang form 14.5 to 17.5
2015-05-26 19:54:03 +01:00
9e29ac6549
Completed implementation of new Grid_simd classes
...
Tested performance for SSE4, Ok.
AVX1/2, AVX512 yet untested
2015-05-22 17:33:15 +09:00
74e91cd925
Partial implementation of the vector types SIMD
...
Implementing SSE4 now
A systematic series of tests must be written.
2015-05-19 17:21:17 +09:00
baa382f055
Added check of mpfr and gmp at configure time
...
It generates automatically the linker flags or complains if not found.
2015-05-19 13:54:55 +09:00
11cb3e9a01
Getting closer to having a wilson solver... introducing a first and untested
...
cut at Conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
Mike Clark, Tony Kennedy and my BFM package respectively since we know we will
need these. I wanted the structure of
algorithms/approx
algorithms/iterative
etc.. to start taking shape.
2015-05-18 07:47:05 +01:00
6103c29ee3
Threading support rework.
...
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
379943abf5
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
94f728bee4
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00
38598190c3
Vectors now too and right multiple of matrix with gamma
2015-04-24 19:08:29 +01:00
6bd11d920a
Finishing the reorg
2015-04-18 21:24:10 +01:00
25a8266638
More files, shorter each.
2015-04-18 20:45:00 +01:00
cffad66894
Reorganise to keep files smaller
2015-04-18 18:36:48 +01:00
c656164015
Reorg of build structure
2015-04-18 14:55:00 +01:00