1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 01:35:36 +00:00
Commit Graph

1771 Commits

Author SHA1 Message Date
paboyle
1a6c7204ac Disable instantiation; Use cache version instead 2016-07-15 00:34:39 +01:00
paboyle
49310fbab3 Done with red black change over 2016-07-15 00:08:43 +01:00
paboyle
6049d5ac47 Update 2016-07-15 00:08:32 +01:00
paboyle
35d0d35238 Updated file list 2016-07-15 00:02:53 +01:00
paboyle
c0e878705e Updated file list 2016-07-15 00:02:39 +01:00
paboyle
5c0c8efb9e Updated file list 2016-07-15 00:02:11 +01:00
paboyle
dfd714e1ef Multiple implementations for the 5d hopping terms, depending on cache friendly
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
paboyle
79a8ca1a62 Rewrite for performance. Impl dependent instantiations give
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and apply to the above

-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
   and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
paboyle
fb45eb2eb2 5d ls vec rename of impl class 2016-07-14 23:57:26 +01:00
paboyle
a307274c96 Fermion impl rename for ls vectorised 5d approaches 2016-07-14 23:56:13 +01:00
paboyle
3f2c44a5fe Updating the class to 5d selection based on impl type 2016-07-14 23:55:26 +01:00
paboyle
48fb1cdc11 Update domain 5d vectorised impl type, move the type over to 4d redblack with
the dense OO inverse
2016-07-14 23:54:35 +01:00
paboyle
8a79e93cc2 Rename the 5d domain wall fermion vectorised Ls impl class 2016-07-14 23:53:00 +01:00
paboyle
3493b51879 Modest updates 2016-07-14 23:52:13 +01:00
paboyle
de3e79d300 red black for Ls vectorised is 4d red black. Update accordingly now I've made this choice 2016-07-14 23:49:42 +01:00
paboyle
dd62a61c5c Added broadcast and rotation of simd vectors 2016-07-14 23:49:00 +01:00
paboyle
8f47d0b5ab Rotation needed for hopping term in fifth dim with Ls vectorised fields 2016-07-14 23:45:36 +01:00
paboyle
42af132dab Fix for chris kellys request to peek poke on checkerboarded fields 2016-07-14 23:44:48 +01:00
paboyle
9db2c6525d updating benchmarks for red black 4d for Ls vectorised code 2016-07-14 23:44:02 +01:00
paboyle
adbc7c1188 Adding files for multiple implementations (cache opt) and Ls vectorisation
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.

The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.

This should give similar throughput but high flops (non-compulsory flops)
but enable use of the KNL cache friendly kernels throughout the code.

Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
paboyle
62601bb649 Bug fix 2016-07-08 20:46:29 +01:00
paboyle
ef97e32152 Adding persistent communicators 2016-07-08 17:16:08 +01:00
paboyle
c667d9fdcc Trying to make compile clean on travis; seem to have a make -j 4 problem with fftw 2016-07-07 23:26:39 +01:00
paboyle
7dbb94bab2 Update 2016-07-07 22:51:37 +01:00
paboyle
236dcc820b typo fix 2016-07-07 22:46:11 +01:00
paboyle
a42a441a6a Rename the reconfigure script to ./autogen.sh 2016-07-07 22:35:45 +01:00
paboyle
a0676beeb1 Open up dependency on Eigen and FFTW 2016-07-07 22:31:07 +01:00
paboyle
fc4a043663 Colors and banner clean up 2016-07-02 16:15:38 +01:00
paboyle
61ba50665e Merge branch 'hotfix/v0.5.1' into develop 2016-07-01 16:34:30 +01:00
paboyle
bfe14000a9 Double compile fix 2016-07-01 16:33:51 +01:00
paboyle
1ceff48133 Merge branch 'release/v0.5.0' into develop 2016-06-30 15:15:59 -07:00
paboyle
680645f849 Merge branch 'release/v0.5.0' 2016-06-30 15:15:03 -07:00
paboyle
3fc6e03ad1 Version file 2016-06-30 14:44:09 -07:00
paboyle
2d6614f3a1 Merge branch 'feature/knl-cache-opt' into develop 2016-06-30 14:36:20 -07:00
paboyle
4e041b5103 Merge branch 'feature/knl-cache-opt' of https://github.com/paboyle/Grid into feature/knl-cache-opt 2016-06-30 14:36:08 -07:00
paboyle
712b9a3489 Asm only for avx512 2016-06-30 14:35:02 -07:00
paboyle
bdaa5b1767 Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 14:35:02 -07:00
paboyle
8fcefc021a Improved the prefetching when using cache blocking codes 2016-06-30 14:35:02 -07:00
paboyle
1445189361 COntrol the prefetch strategy 2016-06-30 14:35:02 -07:00
paboyle
05c884a62a Prefetch change 2016-06-30 14:35:01 -07:00
paboyle
a25bec87d9 Prefetch during save 2016-06-30 14:35:01 -07:00
paboyle
2d8bb4c594 Tweaks 2016-06-30 14:35:01 -07:00
paboyle
51cb2d4328 update file lists 2016-06-30 14:35:01 -07:00
paboyle
6d58cb2a68 Enable reordering of the loops in the assembler for cache friendly.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
paboyle
c8b35d960c Merge branch 'develop' of https://github.com/paboyle/Grid into feature/knl-cache-opt 2016-06-30 14:30:49 -07:00
paboyle
532f41dd61 Asm only for avx512 2016-06-30 14:00:34 -07:00
paboyle
661b0ab45d Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 13:07:42 -07:00
paboyle
4bc08ed995 Improved the prefetching when using cache blocking codes 2016-06-26 12:54:14 -07:00
paboyle
b2933a0557 COntrol the prefetch strategy 2016-06-25 12:55:25 -07:00
paboyle
db057cc276 Prefetch change 2016-06-25 12:54:50 -07:00