a307274c96
Fermion impl rename for ls vectorised 5d approaches
2016-07-14 23:56:13 +01:00
3f2c44a5fe
Updating the class to 5d selection based on impl type
2016-07-14 23:55:26 +01:00
48fb1cdc11
Update domain 5d vectorised impl type, move the type over to 4d redblack with
the dense OO inverse
2016-07-14 23:54:35 +01:00
8a79e93cc2
Rename the 5d domain wall fermion vectorised Ls impl class
2016-07-14 23:53:00 +01:00
3493b51879
Modest updates
2016-07-14 23:52:13 +01:00
de3e79d300
red black for Ls vectorised is 4d red black. Update accordingly now I've made this choice
2016-07-14 23:49:42 +01:00
dd62a61c5c
Added broadcast and rotation of simd vectors
2016-07-14 23:49:00 +01:00
8f47d0b5ab
Rotation needed for hopping term in fifth dim with Ls vectorised fields
2016-07-14 23:45:36 +01:00
42af132dab
Fix for Chris Kelly's request to peek poke on checkerboarded fields
2016-07-14 23:44:48 +01:00
9db2c6525d
updating benchmarks for red black 4d for Ls vectorised code
2016-07-14 23:44:02 +01:00
adbc7c1188
Adding files for multiple implementations (cache opt) and Ls vectorisation
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.
The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.
This should give similar throughput at a higher flop count (the extra flops are
non-compulsory), but enable use of the KNL cache friendly kernels throughout the code.
Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
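The commit body above describes two ideas: with Ls laid out across SIMD lanes, a hop in the fifth dimension becomes a lane rotation, and the serial LDU inversion is replaced by a dense Ls x Ls matrix applied across the lanes. The following is a minimal sketch of both, assuming Ls equals Nsimd and using a plain `std::array` as a stand-in for a SIMD register. The names `Lanes`, `Dense`, `rotate`, and `apply_dense` are hypothetical and are not taken from the Grid source; real kernels would use intrinsics and complex spinor data, and the domain wall boundary terms are not modelled here.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for one SIMD register: lane l holds s-slice l.
// Nsimd = 8 matches the AVX512 single-precision case mentioned in the log.
constexpr std::size_t Nsimd = 8;
using Lanes = std::array<double, Nsimd>;
using Dense = std::array<std::array<double, Nsimd>, Nsimd>;

// Rotate lanes so that out[l] = in[(l + shift) % Nsimd].
// With shift = 1, lane s receives the value from slice s+1 (periodic wrap),
// which is the data movement an s-hopping term needs.
Lanes rotate(const Lanes& in, std::size_t shift) {
    Lanes out{};
    for (std::size_t l = 0; l < Nsimd; ++l) {
        out[l] = in[(l + shift) % Nsimd];
    }
    return out;
}

// Apply a dense Ls x Ls matrix across the lanes: out[s] = sum_t M[s][t] * in[t].
// The sum is organised as Nsimd rotations, each followed by a lane-wise
// multiply-accumulate, so no step depends serially on a previous s-slice.
Lanes apply_dense(const Dense& M, const Lanes& in) {
    Lanes out{};
    for (std::size_t r = 0; r < Nsimd; ++r) {
        const Lanes rot = rotate(in, r);  // rot[s] == in[(s + r) % Nsimd]
        for (std::size_t s = 0; s < Nsimd; ++s) {
            out[s] += M[s][(s + r) % Nsimd] * rot[s];
        }
    }
    return out;
}
```

The dense form costs Nsimd^2 multiply-adds per register where a banded solve would need fewer, which is the trade the commit message accepts in exchange for removing the serial dependence.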
62601bb649
Bug fix
2016-07-08 20:46:29 +01:00
ef97e32152
Adding persistent communicators
2016-07-08 17:16:08 +01:00
c667d9fdcc
Trying to make compile clean on travis; seem to have a make -j 4 problem with fftw
2016-07-07 23:26:39 +01:00
7dbb94bab2
Update
2016-07-07 22:51:37 +01:00
236dcc820b
typo fix
2016-07-07 22:46:11 +01:00
a42a441a6a
Rename the reconfigure script to ./autogen.sh
2016-07-07 22:35:45 +01:00
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
fc4a043663
Colors and banner clean up
2016-07-02 16:15:38 +01:00
61ba50665e
Merge branch 'hotfix/v0.5.1' into develop
2016-07-01 16:34:30 +01:00
bfe14000a9
Double compile fix
2016-07-01 16:33:51 +01:00
1ceff48133
Merge branch 'release/v0.5.0' into develop
2016-06-30 15:15:59 -07:00
680645f849
Merge branch 'release/v0.5.0'
2016-06-30 15:15:03 -07:00
3fc6e03ad1
Version file
v0.5.0
2016-06-30 14:44:09 -07:00
2d6614f3a1
Merge branch 'feature/knl-cache-opt' into develop
2016-06-30 14:36:20 -07:00
4e041b5103
Merge branch 'feature/knl-cache-opt' of https://github.com/paboyle/Grid into feature/knl-cache-opt
2016-06-30 14:36:08 -07:00
712b9a3489
Asm only for avx512
2016-06-30 14:35:02 -07:00
bdaa5b1767
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 14:35:02 -07:00
8fcefc021a
Improved the prefetching when using cache blocking codes
2016-06-30 14:35:02 -07:00
1445189361
Control the prefetch strategy
2016-06-30 14:35:02 -07:00
05c884a62a
Prefetch change
2016-06-30 14:35:01 -07:00
a25bec87d9
Prefetch during save
2016-06-30 14:35:01 -07:00
2d8bb4c594
Tweaks
2016-06-30 14:35:01 -07:00
51cb2d4328
update file lists
2016-06-30 14:35:01 -07:00
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching, however. Do next-next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
c8b35d960c
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/knl-cache-opt
2016-06-30 14:30:49 -07:00
532f41dd61
Asm only for avx512
2016-06-30 14:00:34 -07:00
661b0ab45d
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 13:07:42 -07:00
4bc08ed995
Improved the prefetching when using cache blocking codes
2016-06-26 12:54:14 -07:00
b2933a0557
Control the prefetch strategy
2016-06-25 12:55:25 -07:00
db057cc276
Prefetch change
2016-06-25 12:54:50 -07:00
22e88eaf54
Prefetch during save
2016-06-25 12:54:14 -07:00
09fe3caebd
Tweaks
2016-06-25 11:08:05 -07:00
5e02392f9c
Fixed compilation error for benchmark_dwf
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
17a8f51a9b
update file lists
2016-06-19 11:59:10 -07:00
1b7f88dd00
Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching, however. Do next-next link in stencil
prefetching.
2016-06-19 11:45:58 -07:00
d6737e4bd8
Travis fix for Linux clang builds
2016-06-14 19:15:08 +01:00
d539888e57
Merge pull request #37 from rprollins/fix/mpi_communicator
Removed write to stdout in constructor for MPI CartesianCommunicator
2016-06-14 17:25:40 +01:00
86187d7cca
Removed write to stdout in constructor for MPI CartesianCommunicator
2016-06-14 15:34:20 +01:00
87418e7df1
Slightly faster prefetching perf.
2016-06-13 02:32:52 -07:00