adbc7c1188
Adding files for multiple implementations (cache opt) and Ls vectorisation
...
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.
The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.
This should give similar throughput but high flops (non-compulsory flops)
but enable use of the KNL cache friendly kernels throughout the code.
Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
9dc345e8e8
Debugged smearing and adding HMC functions for hirep
2016-07-13 17:51:18 +01:00
6f47fbb1e2
Disabled parallel for loops in ExtractSlice and InsertSlice due to race conditions. Likely will need to do so for localConvert too.
2016-07-13 10:49:18 -04:00
a9ae30f868
Added representations definitions for the HMC
2016-07-12 13:36:10 +01:00
a3c0fb79b6
Fix to iVector and iMatrix pokeIndex and checkerboard local site indexing.
2016-07-11 17:15:22 -04:00
62601bb649
Bug fix
2016-07-08 20:46:29 +01:00
ef97e32152
Adding persistent communicators
2016-07-08 17:16:08 +01:00
daea5297ee
Wrote the projector in the adjoint representation algebra
2016-07-08 16:14:16 +01:00
5028969d4b
Added generators for the adjoint representation
2016-07-08 15:40:11 +01:00
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
c5106d0c03
Bugfix
2016-07-07 16:06:30 -04:00
fbf96b1bbb
]Merge branch 'develop' into feature/hirep
2016-07-07 14:20:10 +01:00
3c49ddfaa4
Merge branch 'temporary-smearing' into develop
2016-07-07 14:04:59 +01:00
ffb8b3116c
Tested smeared RHMC Wilson1p1, accepting
2016-07-07 11:49:36 +01:00
dd8cfff111
Another fix for pedantic compilers
2016-07-06 18:22:15 -04:00
184642adb0
Fix for pedantic compilers
2016-07-06 18:15:15 -04:00
4774a3bcd2
Generalized HotConfiguration and functions it calls to accept gauge fields with precision other than the default.
2016-07-06 18:01:08 -04:00
25fafa9a89
Comment
2016-07-06 16:19:41 -04:00
85ed8175cb
Implemented mixed precision CG. Fixed filelist to exclude lib/Old directory and include Config.h.
2016-07-06 15:57:04 -04:00
df5c788ef2
Merge branch 'develop' into feature/multi_prec
2016-07-06 14:52:28 -04:00
15f22425c8
Added option to prevent CG from exiting when it fails to converge
2016-07-06 14:50:01 -04:00
e87182cf98
Debugged the copy constructor of the Lattice class
2016-07-06 15:31:00 +01:00
e3d5319470
Debugged the real() and imag() functions and added tests to Test_Simd
2016-07-06 14:16:03 +01:00
ffedeb1c58
Minor modifications
2016-07-06 11:41:27 +01:00
3e3b367aa9
Small changes in the Log files
2016-07-05 15:05:28 +01:00
3e80947c2b
Cleaned up HMC output. Tested smeared HMCs for single precision (OK)
2016-07-05 12:03:54 +01:00
fdfbf11c6d
Merge branch 'develop' into temporary-smearing
2016-07-04 18:45:10 +01:00
9cb90f714e
Merge remote-tracking branch 'origin/develop' into temporary-smearing
2016-07-04 17:28:40 +01:00
2daffdf95d
Tested smeared WilsonRatio action, accepts
2016-07-04 16:17:28 +01:00
149f826601
Tested smearing for Nf2 WilsonFermionAction, non EO: accepts
2016-07-04 16:09:19 +01:00
cd8ee27080
Simple change in iGamma for smearing
2016-07-04 16:02:57 +01:00
0fa66e8f3c
Debugged smearing for EOWilson, accepts
2016-07-04 15:35:37 +01:00
8dd099267d
Corrected a bug in the Expression Templates (acso and asin were wrong)
2016-07-03 12:28:25 +01:00
1a6d65c6a4
Converted set_uw and set_fj to all complex functions
2016-07-03 10:27:43 +01:00
fc4a043663
Colors and banner clean up
2016-07-02 16:15:38 +01:00
092fa0d8da
Debugged set_fj,
...
to be fixed: BUG in imag()
2016-07-01 16:06:20 +01:00
680645f849
Merge branch 'release/v0.5.0'
2016-06-30 15:15:03 -07:00
712b9a3489
Asm only for avx512
2016-06-30 14:35:02 -07:00
bdaa5b1767
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 14:35:02 -07:00
8fcefc021a
Improved the prefetching when using cache blocking codes
2016-06-30 14:35:02 -07:00
1445189361
COntrol the prefetch strategy
2016-06-30 14:35:02 -07:00
05c884a62a
Prefetch change
2016-06-30 14:35:01 -07:00
a25bec87d9
Prefetch during save
2016-06-30 14:35:01 -07:00
2d8bb4c594
Tweaks
2016-06-30 14:35:01 -07:00
51cb2d4328
update file lists
2016-06-30 14:35:01 -07:00
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendly.
...
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
565e9329ba
Changed the colouring classes
2016-06-30 16:51:03 +01:00
5e02392f9c
Fixed compilation error for benchmark_dwf
...
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
86187d7cca
Removed write to stdout in constructor for MPI CartesianCommunicator
2016-06-14 15:34:20 +01:00
87418e7df1
Slightly faster prefetching perf.
2016-06-13 02:32:52 -07:00