portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-07-18 08:03:27 +01:00

Author	SHA1	Message	Date
paboyle	b573d1f35a	Wilson tree level added	2016-08-31 00:27:04 +01:00
paboyle	0c1d7e4daf	Mom space prop for Wilson action	2016-08-31 00:26:36 +01:00
paboyle	02e983a0cd	Momentum space prop and free prop convolution	2016-08-31 00:26:02 +01:00
Guido Cossu	fd5614738d	Merge branch 'develop' into feature/hirep	2016-08-30 18:21:36 +01:00
Guido Cossu	b512ccbee6	HMC for Adjoint fermions works. Accepts and reproduces known results. Check initial instability of inverters when starting from hot configurations.	2016-08-30 11:31:25 +01:00
paboyle	4ab7dbfd57	Instantiate	2016-08-15 23:00:40 +01:00
paboyle	90e70790f3	Feature for z-Mobius prep	2016-08-15 22:31:29 +01:00
Guido Cossu	147e2025b9	Added unit tests on the representation transformations. Status: passing all tests.	2016-08-08 16:54:22 +01:00
Guido Cossu	49b5c49851	Checked the hermiticity of the op in derivative, ok. Still CG fails to converge.	2016-07-31 12:37:33 +01:00
Guido Cossu	089f0ab582	Debugged HMC for Creutz relation	2016-07-28 16:44:41 +01:00
Guido Cossu	b93e18ed50	Modified the Dirac Kernel class to compile with different number of colours. Added the general push_back functionality to accommodate all defined representations. Compiles, not tested.	2016-07-18 16:36:28 +01:00
Guido Cossu	9c77bb69a5	Added all elements for Hirep HMC. TODO: Test and debug.	2016-07-18 12:05:23 +01:00
paboyle	fad5c675eb	sign error on the 4d gparity force	2016-07-16 01:51:56 +01:00
paboyle	f4dd5062d7	Merge branch 'develop' of https://github.com/paboyle/Grid into develop	2016-07-15 19:26:06 +01:00
paboyle	980ff18956	Solving the instantiation no compile issue	2016-07-15 17:19:44 +01:00
Guido Cossu	7edf4c6c04	Added HMC utilities for the higher representations. TODO: Inherit types for the pseudofermions, debugging, testing.	2016-07-15 13:39:47 +01:00
paboyle	1a6c7204ac	Disable instantiation; Use cache version instead	2016-07-15 00:34:39 +01:00
paboyle	dfd714e1ef	Multiple implementations for the 5d hopping terms, depending on cache friendly ops and/or the 5th direction being vectorised. All use 4d redblack.	2016-07-15 00:00:09 +01:00
paboyle	79a8ca1a62	Rewrite for performance. Impl dependent instantiations give 4d linalg impls of the 5d hopping terms (and inverse). Cache friendly loop orderings of the above. Dense matrix stored and applied to the above. Switch to Ls vectorised, and use dense matrix approach for the MooeeInv and rotate/shift of the Mooee M5D routines.	2016-07-14 23:58:15 +01:00
paboyle	fb45eb2eb2	5d ls vec rename of impl class	2016-07-14 23:57:26 +01:00
paboyle	a307274c96	Fermion impl rename for ls vectorised 5d approaches	2016-07-14 23:56:13 +01:00
paboyle	3f2c44a5fe	Updating the class to 5d selection based on impl type	2016-07-14 23:55:26 +01:00
paboyle	48fb1cdc11	Update domain 5d vectorised impl type, move the type over to 4d redblack with the dense OO inverse	2016-07-14 23:54:35 +01:00
paboyle	8a79e93cc2	Rename the 5d domain wall fermion vectorised Ls impl class	2016-07-14 23:53:00 +01:00
paboyle	adbc7c1188	Adding files for multiple implementations (cache opt) and Ls vectorisation of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely in the vector direction, s-hopping terms involve rotations. The serial dependence of the LDU inversion for Mobius and 4d even odd checkerboarding is removed by simply applying Ls^2 operations (vectorised many ways) as a dense matrix operation. This should give similar throughput but with high flops (non-compulsory flops), while enabling use of the KNL cache friendly kernels throughout the code. Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512 with single precision.	2016-07-14 22:59:21 +01:00
Guido Cossu	9dc345e8e8	Debugged smearing and adding HMC functions for hirep	2016-07-13 17:51:18 +01:00
Guido Cossu	a9ae30f868	Added representations definitions for the HMC	2016-07-12 13:36:10 +01:00
paboyle	ef97e32152	Adding persistent communicators	2016-07-08 17:16:08 +01:00
paboyle	a0676beeb1	Open up dependency on Eigen and FFTW	2016-07-07 22:31:07 +01:00
Guido Cossu	fbf96b1bbb	Merge branch 'develop' into feature/hirep	2016-07-07 14:20:10 +01:00
Guido Cossu	ffb8b3116c	Tested smeared RHMC Wilson1p1, accepting	2016-07-07 11:49:36 +01:00
Guido Cossu	ffedeb1c58	Minor modifications	2016-07-06 11:41:27 +01:00
Guido Cossu	3e80947c2b	Cleaned up HMC output. Tested smeared HMCs for single precision (OK)	2016-07-05 12:03:54 +01:00
Guido Cossu	fdfbf11c6d	Merge branch 'develop' into temporary-smearing	2016-07-04 18:45:10 +01:00
Guido Cossu	9cb90f714e	Merge remote-tracking branch 'origin/develop' into temporary-smearing	2016-07-04 17:28:40 +01:00
Guido Cossu	2daffdf95d	Tested smeared WilsonRatio action, accepts	2016-07-04 16:17:28 +01:00
Guido Cossu	149f826601	Tested smearing for Nf2 WilsonFermionAction, non EO: accepts	2016-07-04 16:09:19 +01:00
paboyle	680645f849	Merge branch 'release/v0.5.0'	2016-06-30 15:15:03 -07:00
paboyle	712b9a3489	Asm only for avx512	2016-06-30 14:35:02 -07:00
paboyle	bdaa5b1767	Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.	2016-06-30 14:35:02 -07:00
paboyle	8fcefc021a	Improved the prefetching when using cache blocking codes	2016-06-30 14:35:02 -07:00
paboyle	05c884a62a	Prefetch change	2016-06-30 14:35:01 -07:00
paboyle	2d8bb4c594	Tweaks	2016-06-30 14:35:01 -07:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
Guido Cossu	565e9329ba	Changed the colouring classes	2016-06-30 16:51:03 +01:00
Guido Cossu	5e02392f9c	Fixed compilation error for benchmark_dwf. Some parts were assuming floating point precision.	2016-06-20 12:30:51 +01:00
paboyle	87418e7df1	Slightly faster prefetching perf.	2016-06-13 02:32:52 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
Azusa Yamaguchi	d9408893b3	Prefetching in the normal kernel implementation.	2016-06-08 05:43:48 -07:00
paboyle	8ac021de73	Added a test and fixed it for red black precon Ls innermost vectorised DWF	2016-06-07 13:16:56 -07:00
