1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-06-10 19:36:56 +01:00
Commit Graph

1772 Commits

Author SHA1 Message Date
980ff18956 Solving the instantiation no compile issue 2016-07-15 17:19:44 +01:00
1a6c7204ac Disable instantiation; Use cache version instead 2016-07-15 00:34:39 +01:00
49310fbab3 Done with red black change over 2016-07-15 00:08:43 +01:00
6049d5ac47 Update 2016-07-15 00:08:32 +01:00
35d0d35238 Updated file list 2016-07-15 00:02:53 +01:00
c0e878705e Updated file list 2016-07-15 00:02:39 +01:00
5c0c8efb9e Updated file list 2016-07-15 00:02:11 +01:00
dfd714e1ef Multiple implementations for the 5d hopping terms, depending on cache friendly
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
79a8ca1a62 Rewrite for performance. Impl dependent instantiations give
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and apply to the above

-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
   and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
fb45eb2eb2 5d ls vec rename of impl class 2016-07-14 23:57:26 +01:00
a307274c96 Fermion impl rename for ls vectorised 5d approaches 2016-07-14 23:56:13 +01:00
3f2c44a5fe Updating the class to 5d selection based on impl type 2016-07-14 23:55:26 +01:00
48fb1cdc11 Update domain 5d vectorised impl type, move the type over to 4d redblack with
the dense OO inverse
2016-07-14 23:54:35 +01:00
8a79e93cc2 Rename the 5d domain wall fermion vectorised Ls impl class 2016-07-14 23:53:00 +01:00
3493b51879 Modest updates 2016-07-14 23:52:13 +01:00
de3e79d300 red black for Ls vectorised is 4d red black. Update accordingly now I've made this choice 2016-07-14 23:49:42 +01:00
dd62a61c5c Added broadcast and rotation of simd vectors 2016-07-14 23:49:00 +01:00
8f47d0b5ab Rotation needed for hopping term in fifth dim with Ls vectorised fields 2016-07-14 23:45:36 +01:00
42af132dab Fix for chris kellys request to peek poke on checkerboarded fields 2016-07-14 23:44:48 +01:00
9db2c6525d updating benchmarks for red black 4d for Ls vectorised code 2016-07-14 23:44:02 +01:00
adbc7c1188 Adding files for multiple implementations (cache opt) and Ls vectorisation
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.

The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.

This should give similar throughput but high flops (non-compulsory flops)
but enable use of the KNL cache friendly kernels throughout the code.

Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
62601bb649 Bug fix 2016-07-08 20:46:29 +01:00
ef97e32152 Adding persistent communicators 2016-07-08 17:16:08 +01:00
c667d9fdcc Trying to make compile clean on travis; seem to have a make -j 4 problem with fftw 2016-07-07 23:26:39 +01:00
7dbb94bab2 Update 2016-07-07 22:51:37 +01:00
236dcc820b typo fix 2016-07-07 22:46:11 +01:00
a42a441a6a Rename the reconfigure script to ./autogen.sh 2016-07-07 22:35:45 +01:00
a0676beeb1 Open up dependency on Eigen and FFTW 2016-07-07 22:31:07 +01:00
fc4a043663 Colors and banner clean up 2016-07-02 16:15:38 +01:00
61ba50665e Merge branch 'hotfix/v0.5.1' into develop 2016-07-01 16:34:30 +01:00
bfe14000a9 Double compile fix 2016-07-01 16:33:51 +01:00
1ceff48133 Merge branch 'release/v0.5.0' into develop 2016-06-30 15:15:59 -07:00
680645f849 Merge branch 'release/v0.5.0' 2016-06-30 15:15:03 -07:00
3fc6e03ad1 Version file v0.5.0 2016-06-30 14:44:09 -07:00
2d6614f3a1 Merge branch 'feature/knl-cache-opt' into develop 2016-06-30 14:36:20 -07:00
4e041b5103 Merge branch 'feature/knl-cache-opt' of https://github.com/paboyle/Grid into feature/knl-cache-opt 2016-06-30 14:36:08 -07:00
712b9a3489 Asm only for avx512 2016-06-30 14:35:02 -07:00
bdaa5b1767 Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 14:35:02 -07:00
8fcefc021a Improved the prefetching when using cache blocking codes 2016-06-30 14:35:02 -07:00
1445189361 COntrol the prefetch strategy 2016-06-30 14:35:02 -07:00
05c884a62a Prefetch change 2016-06-30 14:35:01 -07:00
a25bec87d9 Prefetch during save 2016-06-30 14:35:01 -07:00
2d8bb4c594 Tweaks 2016-06-30 14:35:01 -07:00
51cb2d4328 update file lists 2016-06-30 14:35:01 -07:00
6d58cb2a68 Enable reordering of the loops in the assembler for cache friendly.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
c8b35d960c Merge branch 'develop' of https://github.com/paboyle/Grid into feature/knl-cache-opt 2016-06-30 14:30:49 -07:00
532f41dd61 Asm only for avx512 2016-06-30 14:00:34 -07:00
661b0ab45d Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 13:07:42 -07:00
4bc08ed995 Improved the prefetching when using cache blocking codes 2016-06-26 12:54:14 -07:00
b2933a0557 COntrol the prefetch strategy 2016-06-25 12:55:25 -07:00