b15db11c60
Kernels -> pure static object to enable device execution
2018-03-24 19:35:20 -04:00
334bb6792f
Lebesgue order removed. Stick in the stencil view
2018-03-22 18:12:12 -04:00
8a1d303ab9
GPU friendly stencil improvements
2018-03-19 07:11:03 -04:00
3277bda130
View introduction to prepare for accelerator offload.
Probably same problem exists for stencil object
2018-03-04 16:38:08 +00:00
a935ef7b39
Namespace
2018-01-14 23:01:07 +00:00
fc4ab9ccd5
Working half precision comms
2017-04-20 11:20:26 +01:00
e7c36771ed
ZMobius prep for asm
2017-03-15 14:23:33 -04:00
e099dcdae7
Merge branch 'develop' into feature/bgq-asm
2017-02-23 00:25:29 +00:00
4e7ab3166f
Refactoring header layout
2017-02-22 18:09:33 +00:00
2c246551d0
Overlap comms and compute options in wilson kernels
2017-02-07 01:37:10 -05:00
fad743fbb1
Build system sanity check: corrected several headers not in the <Grid/*> format
2017-01-26 17:00:41 -08:00
04ae7929a3
BGQ or KNL assembler now
2016-12-22 17:53:22 +00:00
ae8561892e
Eliminating useless defines
2016-11-02 10:21:06 +00:00
e8c3174ae2
Small change in the defines
2016-10-30 12:23:11 +00:00
9b066e94d0
Compilation with both single and double precision
2016-10-30 12:04:06 +00:00
e1042aef77
First version of the double prec for testing purposes
It does not compile single and double version at the same time
2016-10-28 17:20:04 +01:00
c190221fd3
Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
b58adc6a4b
commVector
2016-10-20 17:00:15 +01:00
c78bbd0f8c
Fix ASM compilation
2016-10-04 15:37:32 +01:00
f76f281e58
Cleaning files after fix
2016-09-09 11:34:25 +01:00
aa20cc8b52
Fixing compilation error with AVX512 flag
2016-09-09 02:58:52 -07:00
90e70790f3
Feature for z-Mobius prep
2016-08-15 22:31:29 +01:00
48fb1cdc11
Update domain 5d vectorised impl type, move the type over to 4d redblack with the dense OO inverse
2016-07-14 23:54:35 +01:00
2d8bb4c594
Tweaks
2016-06-30 14:35:01 -07:00
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
55f65b81b5
Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels.
This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.
2016-06-09 01:12:36 -07:00
d9408893b3
Prefetching in the normal kernel implementation.
2016-06-08 05:43:48 -07:00
e503ef5590
Cleaned up
2016-06-07 00:11:36 +01:00
a7682b0060
Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS
2016-06-06 23:48:21 +01:00
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
c79ea0dcef
Fixing IMCI
2016-04-22 21:52:54 -07:00
9b6ab6db16
simd in 5th dimension support
2016-04-19 15:38:01 -07:00
8052556275
Cleaning up the single/double kernel implementation switch
2016-03-31 14:51:32 +01:00
60d965f79e
AVX512 improvements; sigfpe trapping too
2016-03-30 08:42:34 +01:00
c77b7ee897
AddSub based alternate SU3 routine
2016-03-28 17:55:22 -06:00
21abaf7e91
Gamma sign change
2016-03-28 00:35:45 -06:00
165bffc2e7
Avx512 changes for assembler kernels
2016-03-26 22:25:45 -06:00
644fd6d32e
Build avx512 clean
2016-03-25 09:35:33 -07:00
497e7e4c53
BG/Q compatibility fix
2016-02-23 15:57:38 +00:00
aae8bf31a7
Global edit adding copyright and license info to every source file.
2016-01-02 14:51:32 +00:00
899ca41cb8
Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
lib/qcd/action/fermion/WilsonFermion5D.cc
2015-11-06 03:50:04 -08:00
d29b4c1dee
Assembler files
2015-11-06 03:48:48 -08:00
28022755ae
Stencil class name global change to StencilImpl typedef
2015-11-06 05:30:17 -06:00
4e65ad21ac
Adding a routine for AVX512 / IMCI with explicit assembly implementations
2015-11-04 03:15:08 -08:00