980ff18956
Solving the instantiation no-compile issue
2016-07-15 17:19:44 +01:00
1a6c7204ac
Disable instantiation; Use cache version instead
2016-07-15 00:34:39 +01:00
49310fbab3
Done with red black change over
2016-07-15 00:08:43 +01:00
6049d5ac47
Update
2016-07-15 00:08:32 +01:00
35d0d35238
Updated file list
2016-07-15 00:02:53 +01:00
c0e878705e
Updated file list
2016-07-15 00:02:39 +01:00
5c0c8efb9e
Updated file list
2016-07-15 00:02:11 +01:00
dfd714e1ef
Multiple implementations for the 5d hopping terms, depending on cache friendly
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
79a8ca1a62
Rewrite for performance. Impl dependent instantiations give
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and apply to the above
-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
fb45eb2eb2
5d ls vec rename of impl class
2016-07-14 23:57:26 +01:00
a307274c96
Fermion impl rename for ls vectorised 5d approaches
2016-07-14 23:56:13 +01:00
3f2c44a5fe
Updating the class to 5d selection based on impl type
2016-07-14 23:55:26 +01:00
48fb1cdc11
Update domain 5d vectorised impl type, move the type over to 4d redblack with
the dense OO inverse
2016-07-14 23:54:35 +01:00
8a79e93cc2
Rename the 5d domain wall fermion vectorised Ls impl class
2016-07-14 23:53:00 +01:00
3493b51879
Modest updates
2016-07-14 23:52:13 +01:00
de3e79d300
red black for Ls vectorised is 4d red black. Update accordingly now I've made this choice
2016-07-14 23:49:42 +01:00
dd62a61c5c
Added broadcast and rotation of simd vectors
2016-07-14 23:49:00 +01:00
8f47d0b5ab
Rotation needed for hopping term in fifth dim with Ls vectorised fields
2016-07-14 23:45:36 +01:00
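Why a rotation is needed here: when the whole fifth dimension lives in the lanes of one SIMD vector, the s+1 and s-1 neighbours of a site are other lanes of the same vector, so a hop in s becomes a lane rotation. The following is a minimal sketch only. `Lanes` and `rotate_lanes` are hypothetical names, not Grid's API, a plain `std::array` stands in for a SIMD register, and the wrap-around is treated as purely periodic (no boundary or mass factors).

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a SIMD register: one fifth-dimension slice per lane.
template <std::size_t N>
using Lanes = std::array<double, N>;

// Rotate lanes by `shift`: out[s] = in[(s + shift) % N].
// shift = 1 brings the s+1 neighbour into lane s;
// shift = N - 1 brings the s-1 neighbour into lane s.
template <std::size_t N>
Lanes<N> rotate_lanes(const Lanes<N>& in, std::size_t shift) {
    Lanes<N> out{};
    for (std::size_t s = 0; s < N; ++s) {
        out[s] = in[(s + shift) % N];
    }
    return out;
}
```

With `N = 4`, `rotate_lanes(psi, 1)` yields the s+1 neighbours and `rotate_lanes(psi, 3)` the s-1 neighbours, each in a single vector-wide operation.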
42af132dab
Fix for Chris Kelly's request to peek/poke on checkerboarded fields
2016-07-14 23:44:48 +01:00
9db2c6525d
updating benchmarks for red black 4d for Ls vectorised code
2016-07-14 23:44:02 +01:00
adbc7c1188
Adding files for multiple implementations (cache opt) and Ls vectorisation
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.
The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.
This should give similar throughput at a higher flop count (the extra flops are
non-compulsory), while enabling use of the KNL cache friendly kernels throughout the code.
Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
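To illustrate the trade described in the commit above (removing a serial dependence by applying a dense Ls x Ls matrix), here is a toy sketch. It is not the Mobius LDU inverse itself: it assumes a simple one-term recurrence x[s] = b[s] + c * x[s-1], whose inverse is the lower-triangular matrix with entries c^(s-t). All names (`solve_serial`, `dense_inverse`, `apply_dense`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Serial form: each x[s] depends on x[s-1], so the loop cannot be vectorised over s.
Vec solve_serial(const Vec& b, double c) {
    Vec x(b.size());
    for (std::size_t s = 0; s < b.size(); ++s) {
        x[s] = b[s] + (s > 0 ? c * x[s - 1] : 0.0);
    }
    return x;
}

// Precompute the dense inverse once: Minv[s][t] = c^(s-t) for t <= s, else 0.
Mat dense_inverse(std::size_t Ls, double c) {
    Mat Minv(Ls, Vec(Ls, 0.0));
    for (std::size_t s = 0; s < Ls; ++s) {
        double p = 1.0;
        for (std::size_t k = 0; k <= s; ++k) {
            Minv[s][s - k] = p;  // k = s - t
            p *= c;
        }
    }
    return Minv;
}

// Dense form: Ls^2 multiply-adds, but every output element is independent.
Vec apply_dense(const Mat& Minv, const Vec& b) {
    assert(Minv.size() == b.size());
    Vec x(b.size(), 0.0);
    for (std::size_t s = 0; s < b.size(); ++s) {
        for (std::size_t t = 0; t < b.size(); ++t) {
            x[s] += Minv[s][t] * b[t];
        }
    }
    return x;
}
```

The dense form does more arithmetic than the recurrence, which is the "non-compulsory flops" cost, but each output element can be computed independently, so the work maps onto vector lanes.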
62601bb649
Bug fix
2016-07-08 20:46:29 +01:00
ef97e32152
Adding persistent communicators
2016-07-08 17:16:08 +01:00
c667d9fdcc
Trying to make compile clean on travis; seem to have a make -j 4 problem with fftw
2016-07-07 23:26:39 +01:00
7dbb94bab2
Update
2016-07-07 22:51:37 +01:00
236dcc820b
typo fix
2016-07-07 22:46:11 +01:00
a42a441a6a
Rename the reconfigure script to ./autogen.sh
2016-07-07 22:35:45 +01:00
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
fc4a043663
Colors and banner clean up
2016-07-02 16:15:38 +01:00
61ba50665e
Merge branch 'hotfix/v0.5.1' into develop
2016-07-01 16:34:30 +01:00
bfe14000a9
Double compile fix
2016-07-01 16:33:51 +01:00
1ceff48133
Merge branch 'release/v0.5.0' into develop
2016-06-30 15:15:59 -07:00
680645f849
Merge branch 'release/v0.5.0'
2016-06-30 15:15:03 -07:00
3fc6e03ad1
Version file
v0.5.0
2016-06-30 14:44:09 -07:00
2d6614f3a1
Merge branch 'feature/knl-cache-opt' into develop
2016-06-30 14:36:20 -07:00
4e041b5103
Merge branch 'feature/knl-cache-opt' of https://github.com/paboyle/Grid into feature/knl-cache-opt
2016-06-30 14:36:08 -07:00
712b9a3489
Asm only for avx512
2016-06-30 14:35:02 -07:00
bdaa5b1767
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 14:35:02 -07:00
8fcefc021a
Improved the prefetching when using cache blocking codes
2016-06-30 14:35:02 -07:00
1445189361
Control the prefetch strategy
2016-06-30 14:35:02 -07:00
05c884a62a
Prefetch change
2016-06-30 14:35:01 -07:00
a25bec87d9
Prefetch during save
2016-06-30 14:35:01 -07:00
2d8bb4c594
Tweaks
2016-06-30 14:35:01 -07:00
51cb2d4328
update file lists
2016-06-30 14:35:01 -07:00
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendly.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
c8b35d960c
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/knl-cache-opt
2016-06-30 14:30:49 -07:00
532f41dd61
Asm only for avx512
2016-06-30 14:00:34 -07:00
661b0ab45d
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 13:07:42 -07:00
4bc08ed995
Improved the prefetching when using cache blocking codes
2016-06-26 12:54:14 -07:00
b2933a0557
Control the prefetch strategy
2016-06-25 12:55:25 -07:00