1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 01:35:36 +00:00
Commit Graph

1193 Commits

Author SHA1 Message Date
Christopher Kelly
85ed8175cb Implemented mixed precision CG. Fixed filelist to exclude lib/Old directory and include Config.h. 2016-07-06 15:57:04 -04:00
Christopher Kelly
df5c788ef2 Merge branch 'develop' into feature/multi_prec 2016-07-06 14:52:28 -04:00
Christopher Kelly
15f22425c8 Added option to prevent CG from exiting when it fails to converge 2016-07-06 14:50:01 -04:00
paboyle
fc4a043663 Colors and banner clean up 2016-07-02 16:15:38 +01:00
paboyle
680645f849 Merge branch 'release/v0.5.0' 2016-06-30 15:15:03 -07:00
paboyle
712b9a3489 Asm only for avx512 2016-06-30 14:35:02 -07:00
paboyle
bdaa5b1767 Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 14:35:02 -07:00
paboyle
8fcefc021a Improved the prefetching when using cache blocking codes 2016-06-30 14:35:02 -07:00
paboyle
1445189361 COntrol the prefetch strategy 2016-06-30 14:35:02 -07:00
paboyle
05c884a62a Prefetch change 2016-06-30 14:35:01 -07:00
paboyle
a25bec87d9 Prefetch during save 2016-06-30 14:35:01 -07:00
paboyle
2d8bb4c594 Tweaks 2016-06-30 14:35:01 -07:00
paboyle
51cb2d4328 update file lists 2016-06-30 14:35:01 -07:00
paboyle
6d58cb2a68 Enable reordering of the loops in the assembler for cache friendly.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
Guido Cossu
5e02392f9c Fixed compilation error for benchmark_dwf
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
Richard Rollins
86187d7cca Removed write to stdout in constructor for MPI CartesianCommunicator 2016-06-14 15:34:20 +01:00
paboyle
87418e7df1 Slightly faster prefetching perf. 2016-06-13 02:32:52 -07:00
paboyle
55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
Azusa Yamaguchi
d9408893b3 Prefetching in the normal kernel implementation. 2016-06-08 05:43:48 -07:00
paboyle
8ac021de73 Added a test an fixed it for red black precon Ls innermost vectorised DWF 2016-06-07 13:16:56 -07:00
paboyle
e503ef5590 Cleaned up 2016-06-07 00:11:36 +01:00
paboyle
a7682b0060 Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS 2016-06-06 23:48:21 +01:00
paboyle
d4c9d71fc8 Merge branch 'master' of https://github.com/paboyle/Grid 2016-06-06 07:06:54 -07:00
paboyle
786ca52c43 Problems remain in the red black preconditioning of the Ls vectorisation 2016-06-06 07:05:51 -07:00
Peter Boyle
f78d89bcbe Update Lebesgue.cc
kill verbose
2016-06-03 13:33:42 +01:00
paboyle
53d06046b0 Compiling updates for KNL 2016-06-03 03:47:54 -07:00
paboyle
139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
1c0e922585 Merge pull request #35 from aportelli/master
empty SIMD fix
2016-05-27 16:49:13 +01:00
9d5f693cbe empty SIMD fix 2016-05-24 10:56:27 +01:00
Peter Boyle
5c90c3b457 Merge pull request #34 from aportelli/master
Polymorphic lattices & various small updates
2016-05-24 10:50:04 +01:00
91e04056f9 fix of the empty SIMD 2016-05-12 19:24:10 +01:00
3789e3f31c additional fixed in slice functions 2016-05-12 18:35:38 +01:00
0c66719210 const fix in slice functions 2016-05-12 13:01:35 +01:00
paboyle
3a5b5c8bec Save an old tar of tree 2016-05-12 03:20:17 -07:00
4bc21ec7cb thread CL argument fix 2016-05-11 15:21:29 +01:00
e3083b6dfc Merge commit 'ab894186589224d570e0ecef8eea06443194a8ab' 2016-05-11 15:20:41 +01:00
paboyle
ab89418658 Precision change going in; useful for mixed precision algorithms for example. 2016-05-11 15:18:47 +01:00
paboyle
28cd99882c Subslicing 2016-05-11 15:06:54 +01:00
paboyle
aceaee774c ExtractSlice / InsertSlice for lower dimensional lattices where the lattice is not
distributed in the orthogonal direction.
Useful for fermion 4d/5d etc..
2016-05-11 14:12:02 +01:00
101aa769eb LatticeBase contain the grid pointer and a virtual destructor to allow polymorphic lattice pointers 2016-05-04 12:15:31 -07:00
0bf99bfde5 log polish 2016-05-04 12:14:49 -07:00
64bf6fe54e macro to dump NERSC header to a stream 2016-05-04 12:14:38 -07:00
1161d566b9 minor code cleaning 2016-05-02 19:32:11 -07:00
c698b16d75 function to generate Chroma-style gamma matrix products 2016-05-01 18:30:35 -07:00
c4c89336fe SliceSum: shutting down warning about non-threaded code for now 2016-05-01 18:29:57 -07:00
fa59789580 ConjugateGradient: cleaner output 2016-05-01 18:29:20 -07:00
92c2c7d3b5 SchurRedBlackDiagMooeeSolve: fix: guess was not initialised from input 2016-05-01 16:07:55 -07:00
e99ce0875f directly exit when using '--help' option 2016-05-01 16:05:16 -07:00
paboyle
c23375cd65 Testing travis CI integration 2016-04-30 06:30:56 -07:00
paboyle
f7ca6ca889 Bernoulli reenabled -- using integral type for the discrete_distribution, but
then casts in the fill
2016-04-30 03:48:28 -07:00