paboyle
6049d5ac47
Update
2016-07-15 00:08:32 +01:00
paboyle
35d0d35238
Updated file list
2016-07-15 00:02:53 +01:00
paboyle
c0e878705e
Updated file list
2016-07-15 00:02:39 +01:00
paboyle
5c0c8efb9e
Updated file list
2016-07-15 00:02:11 +01:00
paboyle
dfd714e1ef
Multiple implementations for the 5d hopping terms, depending on cache friendly
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
paboyle
79a8ca1a62
Rewrite for performance. Impl dependent instantiations give
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and applied to the above
-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
paboyle
fb45eb2eb2
5d ls vec rename of impl class
2016-07-14 23:57:26 +01:00
paboyle
a307274c96
Fermion impl rename for ls vectorised 5d approaches
2016-07-14 23:56:13 +01:00
paboyle
3f2c44a5fe
Updating the class to 5d selection based on impl type
2016-07-14 23:55:26 +01:00
paboyle
48fb1cdc11
Update domain 5d vectorised impl type, move the type over to 4d redblack with
the dense OO inverse
2016-07-14 23:54:35 +01:00
paboyle
8a79e93cc2
Rename the 5d domain wall fermion vectorised Ls impl class
2016-07-14 23:53:00 +01:00
paboyle
3493b51879
Modest updates
2016-07-14 23:52:13 +01:00
paboyle
de3e79d300
red black for Ls vectorised is 4d red black. Update accordingly now I've made this choice
2016-07-14 23:49:42 +01:00
paboyle
dd62a61c5c
Added broadcast and rotation of simd vectors
2016-07-14 23:49:00 +01:00
paboyle
8f47d0b5ab
Rotation needed for hopping term in fifth dim with Ls vectorised fields
2016-07-14 23:45:36 +01:00
paboyle
42af132dab
Fix for Chris Kelly's request to peek/poke on checkerboarded fields
2016-07-14 23:44:48 +01:00
paboyle
9db2c6525d
updating benchmarks for red black 4d for Ls vectorised code
2016-07-14 23:44:02 +01:00
paboyle
adbc7c1188
Adding files for multiple implementations (cache opt) and Ls vectorisation
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.
The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.
This should give similar throughput but a higher flop count (non-compulsory flops)
but enable use of the KNL cache friendly kernels throughout the code.
Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
paboyle
62601bb649
Bug fix
2016-07-08 20:46:29 +01:00
paboyle
ef97e32152
Adding persistent communicators
2016-07-08 17:16:08 +01:00
paboyle
c667d9fdcc
Trying to make compile clean on travis; seem to have a make -j 4 problem with fftw
2016-07-07 23:26:39 +01:00
paboyle
7dbb94bab2
Update
2016-07-07 22:51:37 +01:00
paboyle
236dcc820b
typo fix
2016-07-07 22:46:11 +01:00
paboyle
a42a441a6a
Rename the reconfigure script to ./autogen.sh
2016-07-07 22:35:45 +01:00
paboyle
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
paboyle
fc4a043663
Colors and banner clean up
2016-07-02 16:15:38 +01:00
paboyle
61ba50665e
Merge branch 'hotfix/v0.5.1' into develop
2016-07-01 16:34:30 +01:00
paboyle
bfe14000a9
Double compile fix
2016-07-01 16:33:51 +01:00
paboyle
1ceff48133
Merge branch 'release/v0.5.0' into develop
2016-06-30 15:15:59 -07:00
paboyle
680645f849
Merge branch 'release/v0.5.0'
2016-06-30 15:15:03 -07:00
paboyle
3fc6e03ad1
Version file
2016-06-30 14:44:09 -07:00
paboyle
2d6614f3a1
Merge branch 'feature/knl-cache-opt' into develop
2016-06-30 14:36:20 -07:00
paboyle
4e041b5103
Merge branch 'feature/knl-cache-opt' of https://github.com/paboyle/Grid into feature/knl-cache-opt
2016-06-30 14:36:08 -07:00
paboyle
712b9a3489
Asm only for avx512
2016-06-30 14:35:02 -07:00
paboyle
bdaa5b1767
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 14:35:02 -07:00
paboyle
8fcefc021a
Improved the prefetching when using cache blocking codes
2016-06-30 14:35:02 -07:00
paboyle
1445189361
Control the prefetch strategy
2016-06-30 14:35:02 -07:00
paboyle
05c884a62a
Prefetch change
2016-06-30 14:35:01 -07:00
paboyle
a25bec87d9
Prefetch during save
2016-06-30 14:35:01 -07:00
paboyle
2d8bb4c594
Tweaks
2016-06-30 14:35:01 -07:00
paboyle
51cb2d4328
update file lists
2016-06-30 14:35:01 -07:00
paboyle
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
paboyle
c8b35d960c
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/knl-cache-opt
2016-06-30 14:30:49 -07:00
paboyle
532f41dd61
Asm only for avx512
2016-06-30 14:00:34 -07:00
paboyle
661b0ab45d
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 13:07:42 -07:00
paboyle
4bc08ed995
Improved the prefetching when using cache blocking codes
2016-06-26 12:54:14 -07:00
paboyle
b2933a0557
Control the prefetch strategy
2016-06-25 12:55:25 -07:00
paboyle
db057cc276
Prefetch change
2016-06-25 12:54:50 -07:00
paboyle
22e88eaf54
Prefetch during save
2016-06-25 12:54:14 -07:00
paboyle
09fe3caebd
Tweaks
2016-06-25 11:08:05 -07:00