mirror of https://github.com/paboyle/Grid.git synced 2024-11-16 18:55:37 +00:00
Commit Graph

5670 Commits

Author SHA1 Message Date
paboyle
6d58cb2a68 Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
paboyle
c8b35d960c Merge branch 'develop' of https://github.com/paboyle/Grid into feature/knl-cache-opt 2016-06-30 14:30:49 -07:00
paboyle
532f41dd61 Asm only for avx512 2016-06-30 14:00:34 -07:00
paboyle
661b0ab45d Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 13:07:42 -07:00
Guido Cossu
565e9329ba Changed the colouring classes 2016-06-30 16:51:03 +01:00
paboyle
4bc08ed995 Improved the prefetching when using cache blocking codes 2016-06-26 12:54:14 -07:00
paboyle
b2933a0557 Control the prefetch strategy 2016-06-25 12:55:25 -07:00
paboyle
db057cc276 Prefetch change 2016-06-25 12:54:50 -07:00
paboyle
22e88eaf54 Prefetch during save 2016-06-25 12:54:14 -07:00
paboyle
09fe3caebd Tweaks 2016-06-25 11:08:05 -07:00
Guido Cossu
5e02392f9c Fixed compilation error for benchmark_dwf
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
paboyle
17a8f51a9b update file lists 2016-06-19 11:59:10 -07:00
paboyle
1b7f88dd00 Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-19 11:45:58 -07:00
d6737e4bd8 Travis fix for Linux clang builds 2016-06-14 19:15:08 +01:00
75fc295f6e Merge branch 'hadrons' into feature/hadrons 2016-06-14 17:51:15 +01:00
d539888e57 Merge pull request #37 from rprollins/fix/mpi_communicator
Removed write to stdout in constructor for MPI CartesianCommunicator
2016-06-14 17:25:40 +01:00
Richard Rollins
86187d7cca Removed write to stdout in constructor for MPI CartesianCommunicator 2016-06-14 15:34:20 +01:00
paboyle
87418e7df1 Slightly faster prefetching perf. 2016-06-13 02:32:52 -07:00
paboyle
55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
Azusa Yamaguchi
d9408893b3 Prefetching in the normal kernel implementation. 2016-06-08 05:43:48 -07:00
paboyle
05acc22920 placeholder for non temporal loads optimisation 2016-06-07 13:18:21 -07:00
paboyle
8ac021de73 Added a test and fixed it for red black precon Ls innermost vectorised DWF 2016-06-07 13:16:56 -07:00
paboyle
e503ef5590 Cleaned up 2016-06-07 00:11:36 +01:00
paboyle
a7682b0060 Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS 2016-06-06 23:48:21 +01:00
0b731b5d80 Hadrons: genetic scheduler parameter fix 2016-06-06 17:46:53 +01:00
8e2078be71 Hadrons: environment with fully generic object store 2016-06-06 17:45:37 +01:00
paboyle
d4c9d71fc8 Merge branch 'master' of https://github.com/paboyle/Grid 2016-06-06 07:06:54 -07:00
paboyle
786ca52c43 Problems remain in the red black preconditioning of the Ls vectorisation 2016-06-06 07:05:51 -07:00
Peter Boyle
048ac04abc Update Benchmark_dwf.cc 2016-06-03 13:44:41 +01:00
Peter Boyle
f78d89bcbe Update Lebesgue.cc
kill verbose
2016-06-03 13:33:42 +01:00
paboyle
53d06046b0 Compiling updates for KNL 2016-06-03 03:47:54 -07:00
paboyle
5d3a1a025d timers flag 2016-06-03 03:25:38 -07:00
paboyle
139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
1826ed06a3 Merge branch 'master' into hadrons 2016-05-27 16:50:31 +01:00
1c0e922585 Merge pull request #35 from aportelli/master
empty SIMD fix
2016-05-27 16:49:13 +01:00
9d5f693cbe empty SIMD fix 2016-05-24 10:56:27 +01:00
Peter Boyle
5c90c3b457 Merge pull request #34 from aportelli/master
Polymorphic lattices & various small updates
2016-05-24 10:50:04 +01:00
3ff96c502b Merge branch 'master' into hadrons 2016-05-12 19:24:18 +01:00
91e04056f9 fix of the empty SIMD 2016-05-12 19:24:10 +01:00
15a0908bfc Merge branch 'master' into hadrons 2016-05-12 18:35:46 +01:00
3789e3f31c additional fixes in slice functions 2016-05-12 18:35:38 +01:00
bb2125962b Hadrons: finished implementation of 5D quarks 2016-05-12 18:34:42 +01:00
232fda5fe1 Hadrons: DWF action 2016-05-12 18:34:10 +01:00
2b31bf61ff Hadrons: message fix 2016-05-12 18:33:49 +01:00
afe5a94745 Hadrons: getModule with upcast 2016-05-12 18:33:36 +01:00
7ae667c767 Hadrons: module template update 2016-05-12 18:33:08 +01:00
07f0b69784 Merge branch 'master' into hadrons 2016-05-12 13:02:18 +01:00
0c66719210 const fix in slice functions 2016-05-12 13:01:35 +01:00
5c06e89d69 Hadrons: code cleaning 2016-05-12 12:49:49 +01:00
3d75e0f0d1 Hadrons: MQuark fix 2016-05-12 12:02:15 +01:00