Peter Boyle
b5a6e4f1fd
Best option for Xeon cache blocking set
2017-06-30 10:53:22 +01:00
Peter Boyle
38325ebbc6
Interleave code path; not enabled
2017-06-30 10:23:51 +01:00
35fa3d1dfd
Merge branch 'master' into feature/scalar_adjointFT
2017-05-12 10:41:39 +01:00
paboyle
2439999ec8
Warning elimination; drop to -O2 on G++ bad versions
2017-05-06 14:44:49 +01:00
Guido Cossu
741bc836f6
Exposing support for Ncolours and Ndimensions and JSON input file for the ScalarAction
2017-05-05 17:36:43 +01:00
Peter Boyle
99220f6531
Fixes and better timing
2017-04-26 17:24:11 -04:00
paboyle
ab66bac4e6
Think I'm getting on top of the reduced cost exterior precomputed list of links
2017-04-25 08:50:26 +01:00
paboyle
56277a11c8
Build a list of whats on the surface
2017-04-24 17:06:15 +01:00
paboyle
3703b718aa
Mark up a table if a given site only receives from itself; including MPI3 splitting info.
2017-04-22 19:28:37 +01:00
paboyle
736bf3c866
Major rework of stencil. Half precision and MPI3 now working.
2017-04-22 11:33:50 +01:00
paboyle
e1a2319d01
Simple compressor moved out of cshift into stencil
2017-04-20 13:18:15 +01:00
paboyle
d2312e9874
Drop compressor entirely from Cshift to only Stencil.
2017-04-20 13:16:55 +01:00
paboyle
fc4ab9ccd5
Working half precision comms
2017-04-20 11:20:26 +01:00
paboyle
4a340aa5ca
Massive compressor rework to support reduced precision comms
2017-04-20 09:28:27 +01:00
8ef4300412
spurious .dirstamp files removed
2017-04-10 17:00:22 +01:00
paboyle
8c8473998d
Average over whole cluster the comm time.
2017-03-21 22:29:51 -04:00
paboyle
4e7ab3166f
Refactoring header layout
2017-02-22 18:09:33 +00:00
paboyle
aca7a3ef0a
Optimisation control improvements
2017-02-10 18:22:31 -05:00
c56707e003
useless debug message removed
2016-12-07 08:59:20 +09:00
paboyle
bb94ddd0eb
Tidy up of mpi3; also some cleaning of the dslash controls.
2016-11-02 08:07:09 +00:00
paboyle
680645f849
Merge branch 'release/v0.5.0'
2016-06-30 15:15:03 -07:00
paboyle
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendly.
...
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
Guido Cossu
5e02392f9c
Fixed compilation error for benchmark_dwf
...
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
Peter Boyle
f78d89bcbe
Update Lebesgue.cc
...
kill verbose
2016-06-03 13:33:42 +01:00
paboyle
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
paboyle
090e7aa930
Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
...
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
Peter Boyle
7f927a541c
Shmem related fixes for shmem compile
2016-02-11 07:37:39 -06:00
Jung
5c57d4f403
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
lib/qcd/action/fermion/WilsonKernels.h
2016-01-11 11:36:45 -05:00
Jung
5924e5a562
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
configure
lib/qcd/action/Actions.h
lib/qcd/action/fermion/WilsonKernels.h
2016-01-06 03:44:57 -05:00
paboyle
aae8bf31a7
Global edit adding copyright and license info to every source file.
2016-01-02 14:51:32 +00:00
Peter Boyle
f9b2fce93b
Changing whole stencil class to be template and not just single functions
2015-11-06 05:25:10 -06:00
paboyle
16c7993434
Merge branch 'master' of github.com:paboyle/Grid
...
Conflicts:
lib/simd/Grid_avx512.h
lib/simd/Grid_imci.h
2015-11-04 03:32:10 -08:00
paboyle
ac7d1f26ad
Either blocking or lebesgue curve
2015-11-04 03:19:16 -08:00
paboyle
1a8bf938b3
Use either sub-blocking or lebesgue
2015-11-04 03:18:51 -08:00
Peter Boyle
24044dbc56
Debugged a problem with checkerboarded cshift in the checker dimension which arose
...
only when mpi spread out in the checker dimension. Added a test that trapped and helped debug this
2015-11-04 10:00:55 +00:00
Peter Boyle
0f59356e86
Problem in comms fixed
2015-11-02 00:00:15 +00:00
Peter Boyle
7d3512ab21
Gparity valence test now working.
...
Interface in FermionOperator will change a lot in future
2015-08-14 00:01:04 +01:00
neo
48bf4878c1
Experimental support for ARM
2015-06-09 15:46:21 +09:00
Azusa Yamaguchi
58a4f32298
merge to the head
2015-06-05 10:15:31 +01:00
Peter Boyle
1d0df449e8
Reorganise of file naming
2015-06-03 12:47:05 +01:00
Peter Boyle
3845f267cb
Domain wall fermions now invert ; have the basis set up for
...
Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson)
Approx Representation Kernel.
All are done with space-time taking part in checkerboarding, Ls uncheckerboarded
Have only so far tested the Domain Wall limit of mobius, and at that only checked
that it
i) Inverts
ii) 5dim DW == Ls copies of 4dim D2
iii) MeeInv Mee == 1
iv) Meo+Mee+Moe+Moo == M unprec.
v) MpcDagMpc is hermitan
vi) Mdag is the adjoint of M between stochastic vectors.
That said, the RB schur solve, RB MpcDagMpc solve, Unprec solve
all converge and the true residual becomes small; so pretty good tests.
2015-06-02 16:57:12 +01:00
Peter Boyle
5644ab1e19
Large scale change to support 5d fermion formulations.
...
Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
neo
74e91cd925
Partial implementation of the vector types SIMD
...
Implementing SSE4 now
A systematic series of tests must be written.
2015-05-19 17:21:17 +09:00
neo
baa382f055
Added check of mpfr and gmp at configure time
...
It generates automatically the linker flags or complains if not found.
2015-05-19 13:54:55 +09:00
neo
b4cd37276b
Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.
2015-05-18 16:48:14 +09:00
Peter Boyle
a98c01c86a
Integrated Lebesgue code and been playing with alternate implementations of the wilson dop without
...
any particular success in increasing the performance.
2015-04-30 16:39:06 +01:00
Peter Boyle
c72db6c6f6
Fixed the stencil sector and Wilson now agrees between stencil based implementation
...
and the cshift based implementation. Managed to reduce the volume of code in this
sector a little, but consolidation would be good, perhaps taking common
logic out into simple helper functions
2015-04-29 06:23:56 +01:00
Peter Boyle
25d523c0f4
Shaken out stencil to the point where I think wilson dslash is correct.
...
Need to audit code carefully, consolidate between stencil and cshift,
and then benchmark and optimise.
2015-04-28 08:11:59 +01:00
Peter Boyle
94f728bee4
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00
Peter Boyle
e5a25dfcb1
Build reorg with which I am a bit happier
2015-04-18 21:22:50 +01:00