Peter Boyle
98d8ba6d14
Remove autogen files from CVS
2015-11-06 05:29:07 -06:00
Peter Boyle
27813cf518
More timing detail reported
2015-11-06 05:27:13 -06:00
Peter Boyle
955b482aaf
Partial optimisation of the extraction/merger of simd vecs.
2015-11-06 05:26:20 -06:00
Peter Boyle
f9b2fce93b
Changing whole stencil class to be template and not just single functions
2015-11-06 05:25:10 -06:00
Peter Boyle
473fa28a6c
Partial optimisation; comms in x-dir for red black dslash will be slow as the checker skipping block strided
...
loops are non threadable. Will need to write a kernel for these instead and drive them with a lookup table
to make a look sufficiently simple to thread.
2015-11-06 05:23:23 -06:00
Peter Boyle
5d854c869c
Stencil interface changes
2015-11-06 05:22:33 -06:00
Peter Boyle
880ff88362
Comms optimisation
2015-11-06 05:22:18 -06:00
Peter Boyle
f85b9ddd97
Remove nonfunctioning lanczos
2015-11-06 05:21:21 -06:00
Azusa Yamaguchi
4690acc3c8
Don't know why peter committed these as they didn't compile
2015-11-06 10:31:48 +00:00
Azusa Yamaguchi
3281745fde
Exec info and linux check to stop non-portable code breaking
2015-11-06 10:31:24 +00:00
Azusa Yamaguchi
c2d96644a0
EXEC INFO check
2015-11-06 10:31:05 +00:00
paboyle
16c7993434
Merge branch 'master' of github.com:paboyle/Grid
...
Conflicts:
lib/simd/Grid_avx512.h
lib/simd/Grid_imci.h
2015-11-04 03:32:10 -08:00
paboyle
6be9716e6f
New file
2015-11-04 03:26:28 -08:00
paboyle
32762346ad
Better run time on KNC
2015-11-04 03:25:34 -08:00
paboyle
4a41c885ed
Use Linux kernel interface to hardware performance counters. Dead useful.
2015-11-04 03:24:19 -08:00
paboyle
0f48658a27
Update minor
2015-11-04 03:23:46 -08:00
paboyle
757b31ed42
Threading for KNC mods.
2015-11-04 03:22:14 -08:00
paboyle
5aafdd7e1a
Inline asm for KNL, KNC, Skylake Xeon
2015-11-04 03:21:15 -08:00
paboyle
ac7d1f26ad
Either blocking or lebesgue curve
2015-11-04 03:19:16 -08:00
paboyle
1a8bf938b3
Use either sub-blocking or lebesgue
2015-11-04 03:18:51 -08:00
paboyle
63a2993827
Exec info an cache blocking
2015-11-04 03:16:56 -08:00
paboyle
4e65ad21ac
Adding a routine for AVX512 / IMCI with explicit assembly implementations
2015-11-04 03:15:08 -08:00
Peter Boyle
dfc1de6f60
Merge branch 'master' of github.com:paboyle/Grid
2015-11-04 05:14:26 -06:00
Peter Boyle
f87526a04f
Make ICC happy
2015-11-04 05:14:03 -06:00
Peter Boyle
3b7576ad53
Switch off for now
2015-11-04 05:13:29 -06:00
paboyle
9b5d31ffc1
mac , mult routines
...
Lines# with '#' will be ignored, and an empty message aborts the commit.
2015-11-04 03:10:34 -08:00
paboyle
a38762159c
Inline assembly hooks for AVX 512. Better way in some ways than BAGEL to generate assembly.
...
Updated Grid_avx512.h
2015-11-04 03:09:06 -08:00
Peter Boyle
ffc5dab17f
AMD FMA4 support added for Interlagos/BlueWaters
2015-11-04 04:29:58 -06:00
Peter Boyle
96608c70d1
chrono causing some problems on Cray systems. Suspend use for now
2015-11-04 04:28:31 -06:00
Peter Boyle
d35d63b171
Algorithm in
2015-11-04 04:27:44 -06:00
Peter Boyle
9183920e8b
Added an even odd stencil test, shook out a problem with spread out x-direction.
...
Generalise test to allow different types of "Field" to be used.
2015-11-04 10:03:04 +00:00
Peter Boyle
01f286c9fe
Better testing for red black cshift which was sufficient to chase down a spread out x-direction problem.
2015-11-04 10:02:17 +00:00
Peter Boyle
24044dbc56
Debugged a problem with checkerboarded cshift in the checker dimension which arose
...
only when mpi spread out in the checker dimension. Added a test that trapped and helped debug this
2015-11-04 10:00:55 +00:00
Peter Boyle
abb23df83f
formatting only
2015-11-04 10:00:27 +00:00
Peter Boyle
12c5ec813c
Useful debug messages (commented out) are included for preservation in case I need to revisit this
2015-11-04 09:59:27 +00:00
Peter Boyle
1271508ca2
Bug fix for spread out in x (EO) direction.
...
This is really annoying -- it is very hard to thread the loops with the index
recursion on buffer offset in the red-black case. Must think of a good threading
solution here.
2015-11-04 09:57:57 +00:00
Peter Boyle
ec5af35166
EO bug fix when spread out in x-direction
2015-11-04 09:56:58 +00:00
Peter Boyle
b3d70a3bb2
Ncall change
2015-11-04 09:55:21 +00:00
Peter Boyle
c26220e9ab
EO benchmark as well as non-eo
2015-11-04 09:54:48 +00:00
Peter Boyle
0f59356e86
Problem in comms fixed
2015-11-02 00:00:15 +00:00
Peter Boyle
41299da406
files added
2015-10-09 01:01:46 +02:00
Peter Boyle
8889af45ca
FMA4 added
2015-10-09 01:00:53 +02:00
Peter Boyle
d4289a33b8
AMD FMA4 addition
2015-10-09 00:44:20 +02:00
Peter Boyle
83afb2e26a
Poly support for lanczos
2015-10-09 00:43:21 +02:00
Peter Boyle
3726fe7481
Bigger vec length
2015-10-09 00:42:54 +02:00
Peter Boyle
6d06bd9493
Minor change in commented out code
2015-10-09 00:42:21 +02:00
Peter Boyle
6ee23f409e
Lanczos addition
2015-10-09 00:41:00 +02:00
Peter Boyle
2d95dac6b6
Lanczos untested/partially tested additions. In middle of shake out but at least compiles
2015-10-09 00:40:25 +02:00
Peter Boyle
44fecd4d8d
Lanczos test
2015-10-09 00:39:21 +02:00
Peter Boyle
814c79f38d
SIMD improvements for mac and madd use in complex for avx, sse
2015-10-09 00:38:52 +02:00