portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-04-26 21:46:00 +01:00

Author	SHA1	Message	Date
paboyle	3619167d62	Mass parameter	2016-10-10 23:47:33 +01:00
paboyle	96f1d1b828	Debugged Domain wall and Overlap feynman rules (infinite Ls, finite mass).	2016-10-10 23:46:45 +01:00
paboyle	657e0a8f4d	Mass parameter	2016-10-10 23:46:10 +01:00
paboyle	616e7cd83e	Mass parameter	2016-10-10 23:45:48 +01:00
paboyle	6f26d2e8d4	Overlap tree level feynman rule	2016-10-10 23:45:18 +01:00
paboyle	c014574504	A "please implement me" feynman rule. If this were abstract virtual it would require/force implementation	2016-10-10 23:44:00 +01:00
paboyle	d7ce164e6e	Feynman rule for DWF	2016-10-10 23:43:36 +01:00
paboyle	c0d5b99016	Dminus	2016-10-10 23:43:19 +01:00
paboyle	09ca32d678	Dminus added for Cayley	2016-10-10 23:42:55 +01:00
paboyle	7089b6d5a5	Setting up but not implemented some QED rules	2016-09-26 09:43:40 +01:00
paboyle	b6713ecb60	Momentum space rules for Overlap, DWF untested to date	2016-09-26 09:39:09 +01:00
paboyle	b573d1f35a	Wilson tree level added	2016-08-31 00:27:04 +01:00
paboyle	0c1d7e4daf	Mom space prop for Wilson action	2016-08-31 00:26:36 +01:00
paboyle	02e983a0cd	Momentum space prop and free prop convolution	2016-08-31 00:26:02 +01:00
paboyle	4ab7dbfd57	Instantiate	2016-08-15 23:00:40 +01:00
paboyle	90e70790f3	Feature for z-Mobius prep	2016-08-15 22:31:29 +01:00
paboyle	fad5c675eb	sign error on the 4d gparity force	2016-07-16 01:51:56 +01:00
paboyle	f4dd5062d7	Merge branch 'develop' of https://github.com/paboyle/Grid into develop	2016-07-15 19:26:06 +01:00
paboyle	980ff18956	Solving the instantiation no compile issue	2016-07-15 17:19:44 +01:00
paboyle	1a6c7204ac	Disable instantiation; Use cache version instead	2016-07-15 00:34:39 +01:00
paboyle	dfd714e1ef	Multiple implementations for the 5d hopping terms, depending on cache friendly ops and/or the 5th direction being vectorised. All use 4d redblack.	2016-07-15 00:00:09 +01:00
paboyle	79a8ca1a62	Rewrite for performance. Impl dependent instantiations give 4d linalg impls of the 5d hopping terms (and inverse). Cache friendly loop orderings of the above. Dense matrix stored and applied to the above -- switch to Ls vectorised, and use dense matrix approach for the MooeeInv and rotate/shift of the Mooee M5D routines.	2016-07-14 23:58:15 +01:00
paboyle	fb45eb2eb2	5d ls vec rename of impl class	2016-07-14 23:57:26 +01:00
paboyle	a307274c96	Fermion impl rename for ls vectorised 5d approaches	2016-07-14 23:56:13 +01:00
paboyle	3f2c44a5fe	Updating the class to 5d selection based on impl type	2016-07-14 23:55:26 +01:00
paboyle	48fb1cdc11	Update domain 5d vectorised impl type, move the type over to 4d redblack with the dense OO inverse	2016-07-14 23:54:35 +01:00
paboyle	8a79e93cc2	Rename the 5d domain wall fermion vectorised Ls impl class	2016-07-14 23:53:00 +01:00
paboyle	adbc7c1188	Adding files for multiple implementations (cache opt) and Ls vectorisation of the 5D Cayley form chiral fermions for the 5d matrix. With Ls entirely in the vector direction, s-hopping terms involve rotations. The serial dependence of the LDU inversion for Mobius and 4d even odd checkerboarding is removed by simply applying Ls^2 operations (vectorised many ways) as a dense matrix operation. This should give similar throughput at a higher flop count (non-compulsory flops), but enables use of the KNL cache friendly kernels throughout the code. Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512 with single precision.	2016-07-14 22:59:21 +01:00
paboyle	ef97e32152	Adding persistent communicators	2016-07-08 17:16:08 +01:00
paboyle	a0676beeb1	Open up dependency on Eigen and FFTW	2016-07-07 22:31:07 +01:00
Guido Cossu	ffb8b3116c	Tested smeared RHMC Wilson1p1, accepting	2016-07-07 11:49:36 +01:00
Guido Cossu	3e80947c2b	Cleaned up HMC output. Tested smeared HMCs for single precision (OK)	2016-07-05 12:03:54 +01:00
Guido Cossu	fdfbf11c6d	Merge branch 'develop' into temporary-smearing	2016-07-04 18:45:10 +01:00
Guido Cossu	9cb90f714e	Merge remote-tracking branch 'origin/develop' into temporary-smearing	2016-07-04 17:28:40 +01:00
Guido Cossu	2daffdf95d	Tested smeared WilsonRatio action, accepts	2016-07-04 16:17:28 +01:00
Guido Cossu	149f826601	Tested smearing for Nf2 WilsonFermionAction, non EO: accepts	2016-07-04 16:09:19 +01:00
paboyle	680645f849	Merge branch 'release/v0.5.0'	2016-06-30 15:15:03 -07:00
paboyle	712b9a3489	Asm only for avx512	2016-06-30 14:35:02 -07:00
paboyle	bdaa5b1767	Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.	2016-06-30 14:35:02 -07:00
paboyle	8fcefc021a	Improved the prefetching when using cache blocking codes	2016-06-30 14:35:02 -07:00
paboyle	05c884a62a	Prefetch change	2016-06-30 14:35:01 -07:00
paboyle	2d8bb4c594	Tweaks	2016-06-30 14:35:01 -07:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
Guido Cossu	565e9329ba	Changed the colouring classes	2016-06-30 16:51:03 +01:00
Guido Cossu	5e02392f9c	Fixed compilation error for benchmark_dwf. Some parts were assuming floating point precision.	2016-06-20 12:30:51 +01:00
paboyle	87418e7df1	Slightly faster prefetching perf.	2016-06-13 02:32:52 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
Azusa Yamaguchi	d9408893b3	Prefetching in the normal kernel implementation.	2016-06-08 05:43:48 -07:00
paboyle	8ac021de73	Added a test and fixed it for red black precon Ls innermost vectorised DWF	2016-06-07 13:16:56 -07:00
paboyle	e503ef5590	Cleaned up	2016-06-07 00:11:36 +01:00

305 Commits