mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 09:45:36 +00:00
Commit Graph

1768 Commits

Author SHA1 Message Date
paboyle
757b31ed42 Threading for KNC mods. 2015-11-04 03:22:14 -08:00
paboyle
5aafdd7e1a Inline asm for KNL, KNC, Skylake Xeon 2015-11-04 03:21:15 -08:00
paboyle
ac7d1f26ad Either blocking or lebesgue curve 2015-11-04 03:19:16 -08:00
paboyle
1a8bf938b3 Use either sub-blocking or lebesgue 2015-11-04 03:18:51 -08:00
paboyle
63a2993827 Exec info and cache blocking 2015-11-04 03:16:56 -08:00
paboyle
4e65ad21ac Adding a routine for AVX512 / IMCI with explicit assembly implementations 2015-11-04 03:15:08 -08:00
Peter Boyle
dfc1de6f60 Merge branch 'master' of github.com:paboyle/Grid 2015-11-04 05:14:26 -06:00
Peter Boyle
f87526a04f Make ICC happy 2015-11-04 05:14:03 -06:00
Peter Boyle
3b7576ad53 Switch off for now 2015-11-04 05:13:29 -06:00
paboyle
9b5d31ffc1 mac, mult routines 2015-11-04 03:10:34 -08:00
paboyle
a38762159c Inline assembly hooks for AVX 512. Better way in some ways than BAGEL to generate assembly.
Updated Grid_avx512.h
2015-11-04 03:09:06 -08:00
Peter Boyle
ffc5dab17f AMD FMA4 support added for Interlagos/BlueWaters 2015-11-04 04:29:58 -06:00
Peter Boyle
96608c70d1 chrono causing some problems on Cray systems. Suspend use for now 2015-11-04 04:28:31 -06:00
Peter Boyle
d35d63b171 Algorithm in 2015-11-04 04:27:44 -06:00
Peter Boyle
9183920e8b Added an even odd stencil test, shook out a problem with spread out x-direction.
Generalise test to allow different types of "Field" to be used.
2015-11-04 10:03:04 +00:00
Peter Boyle
01f286c9fe Better testing for red black cshift which was sufficient to chase down a spread out x-direction problem. 2015-11-04 10:02:17 +00:00
Peter Boyle
24044dbc56 Debugged a problem with checkerboarded cshift in the checker dimension which arose
only when MPI is spread out in the checker dimension. Added a test that trapped and helped debug this.
2015-11-04 10:00:55 +00:00
Peter Boyle
abb23df83f formatting only 2015-11-04 10:00:27 +00:00
Peter Boyle
12c5ec813c Useful debug messages (commented out) are included for preservation in case I need to revisit this 2015-11-04 09:59:27 +00:00
Peter Boyle
1271508ca2 Bug fix for spread out in x (EO) direction.
This is really annoying -- it is very hard to thread the loops with the index
recursion on buffer offset in the red-black case. Must think of a good threading
solution here.
2015-11-04 09:57:57 +00:00
Peter Boyle
ec5af35166 EO bug fix when spread out in x-direction 2015-11-04 09:56:58 +00:00
Peter Boyle
b3d70a3bb2 Ncall change 2015-11-04 09:55:21 +00:00
Peter Boyle
c26220e9ab EO benchmark as well as non-eo 2015-11-04 09:54:48 +00:00
Peter Boyle
0f59356e86 Problem in comms fixed 2015-11-02 00:00:15 +00:00
538b16610b First commit for measurement software 'Hadrons' 2015-10-27 17:33:18 +00:00
8709117aea Log: generalised Logger class to allow separate logs in Grid-based applications 2015-10-27 17:31:13 +00:00
1b22ce5720 tests Make.inc fix 2015-10-27 10:47:52 +00:00
e6b9aa9076 Config.h removed from repository 2015-10-27 10:47:07 +00:00
d9f2e2e06a Merge pull request #2 from paboyle/master
Update from Peter
2015-10-19 14:52:52 +01:00
Peter Boyle
41299da406 files added 2015-10-09 01:01:46 +02:00
Peter Boyle
8889af45ca FMA4 added 2015-10-09 01:00:53 +02:00
Peter Boyle
d4289a33b8 AMD FMA4 addition 2015-10-09 00:44:20 +02:00
Peter Boyle
83afb2e26a Poly support for lanczos 2015-10-09 00:43:21 +02:00
Peter Boyle
3726fe7481 Bigger vec length 2015-10-09 00:42:54 +02:00
Peter Boyle
6d06bd9493 Minor change in commented out code 2015-10-09 00:42:21 +02:00
Peter Boyle
6ee23f409e Lanczos addition 2015-10-09 00:41:00 +02:00
Peter Boyle
2d95dac6b6 Lanczos untested/partially tested additions. In middle of shake out but at least compiles 2015-10-09 00:40:25 +02:00
Peter Boyle
44fecd4d8d Lanczos test 2015-10-09 00:39:21 +02:00
Peter Boyle
814c79f38d SIMD improvements for mac and madd use in complex for avx, sse 2015-10-09 00:38:52 +02:00
paboyle
1878bf97d0 Babbage fix 2015-09-30 16:04:01 -07:00
paboyle
3a478e5f2a No compile babbage fix 2015-09-30 16:03:05 -07:00
paboyle
a660ce716b No compile babbage fix 2015-09-30 16:02:44 -07:00
paboyle
f4b6d1dfea NGO stores reenabled 2015-09-30 16:02:14 -07:00
paboyle
23813ac798 No compile on babbage fix 2015-09-30 16:01:28 -07:00
paboyle
af89c40462 Better timing tweaks to give sensible results on 24 threads on Edison dual ivybridge nodes. 2015-09-28 16:09:04 -07:00
Peter Boyle
9f4f65cb46 Added a decoupled memory system benchmark to remove thread synch overhead 2015-09-26 18:23:57 -07:00
Peter Boyle
64d64d1ab6 Updating to modify non-inlining permute routines and hopefully get better reg use and
enhance performance.
2015-09-25 08:55:04 -07:00
Peter Boyle
5ef42add2d Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly
and drop swizzles in AVX512. Don't know why these compiled.
2015-09-23 05:23:45 -07:00
Peter Boyle
2f38ebc446 Reintroducing the hand unrolled loops 2015-09-08 17:45:30 +01:00
Peter Boyle
638d6675ee Tested rms dH is ~ dt^4 numerically, so believe the ForceGradient is correct now.
Paranoia makes me want to diddle with the FG step to ensure dt^2 reappears.
2015-08-31 16:33:20 +01:00