b6597b74e7
Added support for the Two index Symmetric and Antisymmetric representations
...
Tested for HMC convergence: OK
Added also a test file showing an example for mixed representations
2016-09-22 14:17:37 +01:00
b9c80318a2
Merge branch 'develop' into feature/hirep
2016-09-13 10:01:51 +01:00
f76f281e58
Cleaning files after fix
2016-09-09 11:34:25 +01:00
aa20cc8b52
Fixing compilation error with AVX512 flag
2016-09-09 02:58:52 -07:00
0fd179fb33
Merge branch 'develop' into feature/hirep
2016-09-01 12:59:53 +01:00
b573d1f35a
Wilson tree level added
2016-08-31 00:27:04 +01:00
0c1d7e4daf
Mom space prop for Wilson action
2016-08-31 00:26:36 +01:00
02e983a0cd
Momentum space prop and free prop convolution
2016-08-31 00:26:02 +01:00
fd5614738d
Merge branch 'develop' into feature/hirep
2016-08-30 18:21:36 +01:00
4ab7dbfd57
Instantiate
2016-08-15 23:00:40 +01:00
90e70790f3
Feature for z-Mobius prep
2016-08-15 22:31:29 +01:00
089f0ab582
Debugged HMC for Creutz relation
2016-07-28 16:44:41 +01:00
b93e18ed50
Modified the Dirac Kernel class to compile with different number of colours
...
Added the general push_back functionality to accomodate for all defined representations
Compiles, not tested
2016-07-18 16:36:28 +01:00
9c77bb69a5
Added all elements for Hirep HMC
...
TODO: Test and debug
2016-07-18 12:05:23 +01:00
fad5c675eb
sign error on the 4d gparity force
2016-07-16 01:51:56 +01:00
f4dd5062d7
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-07-15 19:26:06 +01:00
980ff18956
Solving the instantiation no compile issue
2016-07-15 17:19:44 +01:00
1a6c7204ac
Disable instantiation; Use cache version instead
2016-07-15 00:34:39 +01:00
dfd714e1ef
Multiple implementations for the 5d hopping terms, depending on cache friendly
...
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
79a8ca1a62
Rewrite for performance. Impl dependent instantiations give
...
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and apply to the above
-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
a307274c96
Fermion impl rename for ls vectorised 5d approaches
2016-07-14 23:56:13 +01:00
3f2c44a5fe
Updating the class to 5d selection based on impl type
2016-07-14 23:55:26 +01:00
48fb1cdc11
Update domain 5d vectorised impl type, move the type over to 4d redblack with
...
the dense OO inverse
2016-07-14 23:54:35 +01:00
8a79e93cc2
Rename the 5d domain wall fermion vectorised Ls impl class
2016-07-14 23:53:00 +01:00
adbc7c1188
Adding files for multiple implementations (cache opt) and Ls vectorisation
...
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.
The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.
This should give similar throughput but high flops (non-compulsory flops)
but enable use of the KNL cache friendly kernels throughout the code.
Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
a9ae30f868
Added representations definitions for the HMC
2016-07-12 13:36:10 +01:00
ef97e32152
Adding persistent communicators
2016-07-08 17:16:08 +01:00
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
fbf96b1bbb
]Merge branch 'develop' into feature/hirep
2016-07-07 14:20:10 +01:00
ffedeb1c58
Minor modifications
2016-07-06 11:41:27 +01:00
fdfbf11c6d
Merge branch 'develop' into temporary-smearing
2016-07-04 18:45:10 +01:00
9cb90f714e
Merge remote-tracking branch 'origin/develop' into temporary-smearing
2016-07-04 17:28:40 +01:00
680645f849
Merge branch 'release/v0.5.0'
2016-06-30 15:15:03 -07:00
712b9a3489
Asm only for avx512
2016-06-30 14:35:02 -07:00
bdaa5b1767
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 14:35:02 -07:00
8fcefc021a
Improved the prefetching when using cache blocking codes
2016-06-30 14:35:02 -07:00
05c884a62a
Prefetch change
2016-06-30 14:35:01 -07:00
2d8bb4c594
Tweaks
2016-06-30 14:35:01 -07:00
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendly.
...
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
565e9329ba
Changed the colouring classes
2016-06-30 16:51:03 +01:00
5e02392f9c
Fixed compilation error for benchmark_dwf
...
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
87418e7df1
Slightly faster prefetching perf.
2016-06-13 02:32:52 -07:00
55f65b81b5
Improvements to the assembler interface that let us move chunks of the
...
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
d9408893b3
Prefetching in the normal kernel implementation.
2016-06-08 05:43:48 -07:00
8ac021de73
Added a test an fixed it for red black precon Ls innermost vectorised DWF
2016-06-07 13:16:56 -07:00
e503ef5590
Cleaned up
2016-06-07 00:11:36 +01:00
a7682b0060
Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS
2016-06-06 23:48:21 +01:00
53d06046b0
Compiling updates for KNL
2016-06-03 03:47:54 -07:00
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
5341977948
IMCI fixes. Thought I had committed these. The "real" disambiguation
...
between std::real and Grid::real shouldn't have been necessary and I don't
know why only the icpc v16.0 on babbage hits it.
May need a longer term rename of Grid::real or some careful EnableIf work.
2016-04-30 03:34:16 -07:00