1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-10 07:55:35 +00:00
Commit Graph

1724 Commits

Author SHA1 Message Date
paboyle
bdaa5b1767 Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 14:35:02 -07:00
paboyle
8fcefc021a Improved the prefetching when using cache blocking codes 2016-06-30 14:35:02 -07:00
paboyle
1445189361 COntrol the prefetch strategy 2016-06-30 14:35:02 -07:00
paboyle
05c884a62a Prefetch change 2016-06-30 14:35:01 -07:00
paboyle
a25bec87d9 Prefetch during save 2016-06-30 14:35:01 -07:00
paboyle
2d8bb4c594 Tweaks 2016-06-30 14:35:01 -07:00
paboyle
51cb2d4328 update file lists 2016-06-30 14:35:01 -07:00
paboyle
6d58cb2a68 Enable reordering of the loops in the assembler for cache friendly.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
d6737e4bd8 Travis fix for Linux clang builds 2016-06-14 19:15:08 +01:00
d539888e57 Merge pull request #37 from rprollins/fix/mpi_communicator
Removed write to stdout in constructor for MPI CartesianCommunicator
2016-06-14 17:25:40 +01:00
Richard Rollins
86187d7cca Removed write to stdout in constructor for MPI CartesianCommunicator 2016-06-14 15:34:20 +01:00
paboyle
87418e7df1 Slightly faster prefetching perf. 2016-06-13 02:32:52 -07:00
paboyle
55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
Azusa Yamaguchi
d9408893b3 Prefetching in the normal kernel implementation. 2016-06-08 05:43:48 -07:00
paboyle
05acc22920 placeholder for non temporal loads optimisation 2016-06-07 13:18:21 -07:00
paboyle
8ac021de73 Added a test an fixed it for red black precon Ls innermost vectorised DWF 2016-06-07 13:16:56 -07:00
paboyle
e503ef5590 Cleaned up 2016-06-07 00:11:36 +01:00
paboyle
a7682b0060 Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS 2016-06-06 23:48:21 +01:00
paboyle
d4c9d71fc8 Merge branch 'master' of https://github.com/paboyle/Grid 2016-06-06 07:06:54 -07:00
paboyle
786ca52c43 Problems remain in the red black preconditioning of the Ls vectorisation 2016-06-06 07:05:51 -07:00
Peter Boyle
048ac04abc Update Benchmark_dwf.cc 2016-06-03 13:44:41 +01:00
Peter Boyle
f78d89bcbe Update Lebesgue.cc
kill verbose
2016-06-03 13:33:42 +01:00
paboyle
53d06046b0 Compiling updates for KNL 2016-06-03 03:47:54 -07:00
paboyle
5d3a1a025d timers flag 2016-06-03 03:25:38 -07:00
paboyle
139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
1c0e922585 Merge pull request #35 from aportelli/master
empty SIMD fix
2016-05-27 16:49:13 +01:00
9d5f693cbe empty SIMD fix 2016-05-24 10:56:27 +01:00
Peter Boyle
5c90c3b457 Merge pull request #34 from aportelli/master
Polymorphic lattices & various small updates
2016-05-24 10:50:04 +01:00
91e04056f9 fix of the empty SIMD 2016-05-12 19:24:10 +01:00
3789e3f31c additional fixed in slice functions 2016-05-12 18:35:38 +01:00
0c66719210 const fix in slice functions 2016-05-12 13:01:35 +01:00
paboyle
3a5b5c8bec Save an old tar of tree 2016-05-12 03:20:17 -07:00
paboyle
fdbe071213 space added 2016-05-12 02:59:51 -07:00
4bc21ec7cb thread CL argument fix 2016-05-11 15:21:29 +01:00
e3083b6dfc Merge commit 'ab894186589224d570e0ecef8eea06443194a8ab' 2016-05-11 15:20:41 +01:00
paboyle
ab89418658 Precision change going in; useful for mixed precision algorithms for example. 2016-05-11 15:18:47 +01:00
paboyle
28cd99882c Subslicing 2016-05-11 15:06:54 +01:00
paboyle
aceaee774c ExtractSlice / InsertSlice for lower dimensional lattices where the lattice is not
distributed in the orthogonal direction.
Useful for fermion 4d/5d etc..
2016-05-11 14:12:02 +01:00
Peter Boyle
f8f9fd6f22 Merge pull request #33 from aportelli/master
Travis for clang 3.8 + various updates/fixes
2016-05-05 22:57:13 +01:00
101aa769eb LatticeBase contain the grid pointer and a virtual destructor to allow polymorphic lattice pointers 2016-05-04 12:15:31 -07:00
0bf99bfde5 log polish 2016-05-04 12:14:49 -07:00
64bf6fe54e macro to dump NERSC header to a stream 2016-05-04 12:14:38 -07:00
1161d566b9 minor code cleaning 2016-05-02 19:32:11 -07:00
c698b16d75 function to generate Chroma-style gamma matrix products 2016-05-01 18:30:35 -07:00
c4c89336fe SliceSum: shutting down warning about non-threaded code for now 2016-05-01 18:29:57 -07:00
fa59789580 ConjugateGradient: cleaner output 2016-05-01 18:29:20 -07:00
92c2c7d3b5 SchurRedBlackDiagMooeeSolve: fix: guess was not initialised from input 2016-05-01 16:07:55 -07:00
e99ce0875f directly exit when using '--help' option 2016-05-01 16:05:16 -07:00
cc1d9eb05b Merge commit '999b3a2e26bdd8300d389699dd299e7e5d951af6' 2016-05-01 15:55:22 -07:00
57c027fea2 Travis update 2016-05-01 15:04:52 -07:00