paboyle
b2933a0557
COntrol the prefetch strategy
2016-06-25 12:55:25 -07:00
paboyle
db057cc276
Prefetch change
2016-06-25 12:54:50 -07:00
paboyle
22e88eaf54
Prefetch during save
2016-06-25 12:54:14 -07:00
paboyle
09fe3caebd
Tweaks
2016-06-25 11:08:05 -07:00
Guido Cossu
5e02392f9c
Fixed compilation error for benchmark_dwf
...
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
paboyle
17a8f51a9b
update file lists
2016-06-19 11:59:10 -07:00
paboyle
1b7f88dd00
Enable reordering of the loops in the assembler for cache friendly.
...
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-19 11:45:58 -07:00
d6737e4bd8
Travis fix for Linux clang builds
2016-06-14 19:15:08 +01:00
d539888e57
Merge pull request #37 from rprollins/fix/mpi_communicator
...
Removed write to stdout in constructor for MPI CartesianCommunicator
2016-06-14 17:25:40 +01:00
Richard Rollins
86187d7cca
Removed write to stdout in constructor for MPI CartesianCommunicator
2016-06-14 15:34:20 +01:00
paboyle
87418e7df1
Slightly faster prefetching perf.
2016-06-13 02:32:52 -07:00
paboyle
55f65b81b5
Improvements to the assembler interface that let us move chunks of the
...
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
Azusa Yamaguchi
d9408893b3
Prefetching in the normal kernel implementation.
2016-06-08 05:43:48 -07:00
paboyle
05acc22920
placeholder for non temporal loads optimisation
2016-06-07 13:18:21 -07:00
paboyle
8ac021de73
Added a test an fixed it for red black precon Ls innermost vectorised DWF
2016-06-07 13:16:56 -07:00
paboyle
e503ef5590
Cleaned up
2016-06-07 00:11:36 +01:00
paboyle
a7682b0060
Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS
2016-06-06 23:48:21 +01:00
paboyle
d4c9d71fc8
Merge branch 'master' of https://github.com/paboyle/Grid
2016-06-06 07:06:54 -07:00
paboyle
786ca52c43
Problems remain in the red black preconditioning of the Ls vectorisation
2016-06-06 07:05:51 -07:00
Peter Boyle
048ac04abc
Update Benchmark_dwf.cc
2016-06-03 13:44:41 +01:00
Peter Boyle
f78d89bcbe
Update Lebesgue.cc
...
kill verbose
2016-06-03 13:33:42 +01:00
paboyle
53d06046b0
Compiling updates for KNL
2016-06-03 03:47:54 -07:00
paboyle
5d3a1a025d
timers flag
2016-06-03 03:25:38 -07:00
paboyle
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
1c0e922585
Merge pull request #35 from aportelli/master
...
empty SIMD fix
2016-05-27 16:49:13 +01:00
9d5f693cbe
empty SIMD fix
2016-05-24 10:56:27 +01:00
Peter Boyle
5c90c3b457
Merge pull request #34 from aportelli/master
...
Polymorphic lattices & various small updates
2016-05-24 10:50:04 +01:00
91e04056f9
fix of the empty SIMD
2016-05-12 19:24:10 +01:00
3789e3f31c
additional fixed in slice functions
2016-05-12 18:35:38 +01:00
0c66719210
const fix in slice functions
2016-05-12 13:01:35 +01:00
paboyle
3a5b5c8bec
Save an old tar of tree
2016-05-12 03:20:17 -07:00
paboyle
fdbe071213
space added
2016-05-12 02:59:51 -07:00
4bc21ec7cb
thread CL argument fix
2016-05-11 15:21:29 +01:00
e3083b6dfc
Merge commit 'ab894186589224d570e0ecef8eea06443194a8ab'
2016-05-11 15:20:41 +01:00
paboyle
ab89418658
Precision change going in; useful for mixed precision algorithms for example.
2016-05-11 15:18:47 +01:00
paboyle
28cd99882c
Subslicing
2016-05-11 15:06:54 +01:00
paboyle
aceaee774c
ExtractSlice / InsertSlice for lower dimensional lattices where the lattice is not
...
distributed in the orthogonal direction.
Useful for fermion 4d/5d etc..
2016-05-11 14:12:02 +01:00
Peter Boyle
f8f9fd6f22
Merge pull request #33 from aportelli/master
...
Travis for clang 3.8 + various updates/fixes
2016-05-05 22:57:13 +01:00
101aa769eb
LatticeBase contain the grid pointer and a virtual destructor to allow polymorphic lattice pointers
2016-05-04 12:15:31 -07:00
0bf99bfde5
log polish
2016-05-04 12:14:49 -07:00
64bf6fe54e
macro to dump NERSC header to a stream
2016-05-04 12:14:38 -07:00
1161d566b9
minor code cleaning
2016-05-02 19:32:11 -07:00
c698b16d75
function to generate Chroma-style gamma matrix products
2016-05-01 18:30:35 -07:00
c4c89336fe
SliceSum: shutting down warning about non-threaded code for now
2016-05-01 18:29:57 -07:00
fa59789580
ConjugateGradient: cleaner output
2016-05-01 18:29:20 -07:00
92c2c7d3b5
SchurRedBlackDiagMooeeSolve: fix: guess was not initialised from input
2016-05-01 16:07:55 -07:00
e99ce0875f
directly exit when using '--help' option
2016-05-01 16:05:16 -07:00
cc1d9eb05b
Merge commit '999b3a2e26bdd8300d389699dd299e7e5d951af6'
2016-05-01 15:55:22 -07:00
57c027fea2
Travis update
2016-05-01 15:04:52 -07:00
207dc439a7
Travis debug
2016-05-01 15:00:35 -07:00