paboyle
51cb2d4328
update file lists
2016-06-30 14:35:01 -07:00
paboyle
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching, however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
paboyle
c8b35d960c
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/knl-cache-opt
2016-06-30 14:30:49 -07:00
paboyle
532f41dd61
Asm only for avx512
2016-06-30 14:00:34 -07:00
paboyle
661b0ab45d
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 13:07:42 -07:00
Guido Cossu
565e9329ba
Changed the colouring classes
2016-06-30 16:51:03 +01:00
paboyle
4bc08ed995
Improved the prefetching when using cache blocking codes
2016-06-26 12:54:14 -07:00
paboyle
b2933a0557
Control the prefetch strategy
2016-06-25 12:55:25 -07:00
paboyle
db057cc276
Prefetch change
2016-06-25 12:54:50 -07:00
paboyle
22e88eaf54
Prefetch during save
2016-06-25 12:54:14 -07:00
paboyle
09fe3caebd
Tweaks
2016-06-25 11:08:05 -07:00
Guido Cossu
5e02392f9c
Fixed compilation error for benchmark_dwf
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
paboyle
17a8f51a9b
update file lists
2016-06-19 11:59:10 -07:00
paboyle
1b7f88dd00
Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching, however. Do next next link in stencil
prefetching.
2016-06-19 11:45:58 -07:00
d6737e4bd8
Travis fix for Linux clang builds
2016-06-14 19:15:08 +01:00
75fc295f6e
Merge branch 'hadrons' into feature/hadrons
2016-06-14 17:51:15 +01:00
d539888e57
Merge pull request #37 from rprollins/fix/mpi_communicator
Removed write to stdout in constructor for MPI CartesianCommunicator
2016-06-14 17:25:40 +01:00
Richard Rollins
86187d7cca
Removed write to stdout in constructor for MPI CartesianCommunicator
2016-06-14 15:34:20 +01:00
paboyle
87418e7df1
Slightly faster prefetching perf.
2016-06-13 02:32:52 -07:00
paboyle
55f65b81b5
Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee the L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
Azusa Yamaguchi
d9408893b3
Prefetching in the normal kernel implementation.
2016-06-08 05:43:48 -07:00
paboyle
05acc22920
placeholder for non-temporal loads optimisation
2016-06-07 13:18:21 -07:00
paboyle
8ac021de73
Added a test and fixed it for red black precon Ls innermost vectorised DWF
2016-06-07 13:16:56 -07:00
paboyle
e503ef5590
Cleaned up
2016-06-07 00:11:36 +01:00
paboyle
a7682b0060
Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS
2016-06-06 23:48:21 +01:00
0b731b5d80
Hadrons: genetic scheduler parameter fix
2016-06-06 17:46:53 +01:00
8e2078be71
Hadrons: environment with fully generic object store
2016-06-06 17:45:37 +01:00
paboyle
d4c9d71fc8
Merge branch 'master' of https://github.com/paboyle/Grid
2016-06-06 07:06:54 -07:00
paboyle
786ca52c43
Problems remain in the red black preconditioning of the Ls vectorisation
2016-06-06 07:05:51 -07:00
Peter Boyle
048ac04abc
Update Benchmark_dwf.cc
2016-06-03 13:44:41 +01:00
Peter Boyle
f78d89bcbe
Update Lebesgue.cc
kill verbose
2016-06-03 13:33:42 +01:00
paboyle
53d06046b0
Compiling updates for KNL
2016-06-03 03:47:54 -07:00
paboyle
5d3a1a025d
timers flag
2016-06-03 03:25:38 -07:00
paboyle
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
1826ed06a3
Merge branch 'master' into hadrons
2016-05-27 16:50:31 +01:00
1c0e922585
Merge pull request #35 from aportelli/master
empty SIMD fix
2016-05-27 16:49:13 +01:00
9d5f693cbe
empty SIMD fix
2016-05-24 10:56:27 +01:00
Peter Boyle
5c90c3b457
Merge pull request #34 from aportelli/master
Polymorphic lattices & various small updates
2016-05-24 10:50:04 +01:00
3ff96c502b
Merge branch 'master' into hadrons
2016-05-12 19:24:18 +01:00
91e04056f9
fix of the empty SIMD
2016-05-12 19:24:10 +01:00
15a0908bfc
Merge branch 'master' into hadrons
2016-05-12 18:35:46 +01:00
3789e3f31c
additional fixes in slice functions
2016-05-12 18:35:38 +01:00
bb2125962b
Hadrons: finished implementation of 5D quarks
2016-05-12 18:34:42 +01:00
232fda5fe1
Hadrons: DWF action
2016-05-12 18:34:10 +01:00
2b31bf61ff
Hadrons: message fix
2016-05-12 18:33:49 +01:00
afe5a94745
Hadrons: getModule with upcast
2016-05-12 18:33:36 +01:00
7ae667c767
Hadrons: module template update
2016-05-12 18:33:08 +01:00
07f0b69784
Merge branch 'master' into hadrons
2016-05-12 13:02:18 +01:00
0c66719210
const fix in slice functions
2016-05-12 13:01:35 +01:00
5c06e89d69
Hadrons: code cleaning
2016-05-12 12:49:49 +01:00