1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-10-26 01:29:34 +00:00
Commit Graph

557 Commits

Author SHA1 Message Date
Guido Cossu
f45ef8d114 Minor modification in ActionBase.h 2016-09-01 11:46:46 +01:00
paboyle
b573d1f35a Wilson tree level added 2016-08-31 00:27:04 +01:00
paboyle
0c1d7e4daf Mom space prop for Wilson action 2016-08-31 00:26:36 +01:00
paboyle
02e983a0cd Momentum space prop and free prop convolution 2016-08-31 00:26:02 +01:00
Guido Cossu
fd5614738d Merge branch 'develop' into feature/hirep 2016-08-30 18:21:36 +01:00
Guido Cossu
b512ccbee6 HMC for Adjoint fermions works
Accepts and reproduces known results

Check initial instability of inverters
when starting from hot configurations
2016-08-30 11:31:25 +01:00
paboyle
4ab7dbfd57 Instantiate 2016-08-15 23:00:40 +01:00
paboyle
90e70790f3 Feature for z-Mobius prep 2016-08-15 22:31:29 +01:00
Guido Cossu
147e2025b9 Added unit tests on the representation transformations
Status: Passing all tests
2016-08-08 16:54:22 +01:00
Guido Cossu
49b5c49851 Checked the hermiticity of the op in derivative, ok
Still CG fails to converge
2016-07-31 12:37:33 +01:00
Guido Cossu
089f0ab582 Debugged HMC for Creutz relation 2016-07-28 16:44:41 +01:00
Guido Cossu
b93e18ed50 Modified the Dirac Kernel class to compile with different number of colours
Added the general push_back functionality to accomodate for all defined representations

Compiles, not tested
2016-07-18 16:36:28 +01:00
Guido Cossu
9c77bb69a5 Added all elements for Hirep HMC
TODO: Test and debug
2016-07-18 12:05:23 +01:00
paboyle
fad5c675eb sign error on the 4d gparity force 2016-07-16 01:51:56 +01:00
paboyle
f4dd5062d7 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2016-07-15 19:26:06 +01:00
paboyle
980ff18956 Solving the instantiation no compile issue 2016-07-15 17:19:44 +01:00
Guido Cossu
7edf4c6c04 Added HMC utitities for the higher representations
TODO: Inherit types for the pseudofermions, Debugging, testing
2016-07-15 13:39:47 +01:00
paboyle
1a6c7204ac Disable instantiation; Use cache version instead 2016-07-15 00:34:39 +01:00
paboyle
dfd714e1ef Multiple implementations for the 5d hopping terms, depending on cache friendly
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
paboyle
79a8ca1a62 Rewrite for performance. Impl dependent instantiations give
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and apply to the above

-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
   and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
paboyle
fb45eb2eb2 5d ls vec rename of impl class 2016-07-14 23:57:26 +01:00
paboyle
a307274c96 Fermion impl rename for ls vectorised 5d approaches 2016-07-14 23:56:13 +01:00
paboyle
3f2c44a5fe Updating the class to 5d selection based on impl type 2016-07-14 23:55:26 +01:00
paboyle
48fb1cdc11 Update domain 5d vectorised impl type, move the type over to 4d redblack with
the dense OO inverse
2016-07-14 23:54:35 +01:00
paboyle
8a79e93cc2 Rename the 5d domain wall fermion vectorised Ls impl class 2016-07-14 23:53:00 +01:00
paboyle
adbc7c1188 Adding files for multiple implementations (cache opt) and Ls vectorisation
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.

The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.

This should give similar throughput but high flops (non-compulsory flops)
but enable use of the KNL cache friendly kernels throughout the code.

Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
Guido Cossu
9dc345e8e8 Debugged smearing and adding HMC functions for hirep 2016-07-13 17:51:18 +01:00
Guido Cossu
a9ae30f868 Added representations definitions for the HMC 2016-07-12 13:36:10 +01:00
paboyle
ef97e32152 Adding persistent communicators 2016-07-08 17:16:08 +01:00
paboyle
a0676beeb1 Open up dependency on Eigen and FFTW 2016-07-07 22:31:07 +01:00
Guido Cossu
fbf96b1bbb ]Merge branch 'develop' into feature/hirep 2016-07-07 14:20:10 +01:00
Guido Cossu
ffb8b3116c Tested smeared RHMC Wilson1p1, accepting 2016-07-07 11:49:36 +01:00
Guido Cossu
ffedeb1c58 Minor modifications 2016-07-06 11:41:27 +01:00
Guido Cossu
3e80947c2b Cleaned up HMC output. Tested smeared HMCs for single precision (OK) 2016-07-05 12:03:54 +01:00
Guido Cossu
fdfbf11c6d Merge branch 'develop' into temporary-smearing 2016-07-04 18:45:10 +01:00
Guido Cossu
9cb90f714e Merge remote-tracking branch 'origin/develop' into temporary-smearing 2016-07-04 17:28:40 +01:00
Guido Cossu
2daffdf95d Tested smeared WilsonRatio action, accepts 2016-07-04 16:17:28 +01:00
Guido Cossu
149f826601 Tested smearing for Nf2 WilsonFermionAction, non EO: accepts 2016-07-04 16:09:19 +01:00
paboyle
680645f849 Merge branch 'release/v0.5.0' 2016-06-30 15:15:03 -07:00
paboyle
712b9a3489 Asm only for avx512 2016-06-30 14:35:02 -07:00
paboyle
bdaa5b1767 Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking. 2016-06-30 14:35:02 -07:00
paboyle
8fcefc021a Improved the prefetching when using cache blocking codes 2016-06-30 14:35:02 -07:00
paboyle
05c884a62a Prefetch change 2016-06-30 14:35:01 -07:00
paboyle
2d8bb4c594 Tweaks 2016-06-30 14:35:01 -07:00
paboyle
6d58cb2a68 Enable reordering of the loops in the assembler for cache friendly.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
Guido Cossu
565e9329ba Changed the colouring classes 2016-06-30 16:51:03 +01:00
Guido Cossu
5e02392f9c Fixed compilation error for benchmark_dwf
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
paboyle
87418e7df1 Slightly faster prefetching perf. 2016-06-13 02:32:52 -07:00
paboyle
55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
Azusa Yamaguchi
d9408893b3 Prefetching in the normal kernel implementation. 2016-06-08 05:43:48 -07:00