portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-06-05 11:44:37 +01:00

Author	SHA1	Message	Date
paboyle	a307274c96	Fermion impl rename for ls vectorised 5d approaches	2016-07-14 23:56:13 +01:00
paboyle	3f2c44a5fe	Updating the class to 5d selection based on impl type	2016-07-14 23:55:26 +01:00
paboyle	48fb1cdc11	Update domain 5d vectorised impl type, move the type over to 4d redblack with the dense OO inverse	2016-07-14 23:54:35 +01:00
paboyle	8a79e93cc2	Rename the 5d domain wall fermion vectorised Ls impl class	2016-07-14 23:53:00 +01:00
paboyle	adbc7c1188	Adding files for multiple implementations (cache opt) and Ls vectorisation of the 5D Cayley form chiral fermions for the 5d matrix. With Ls entirely in the vector direction, s-hopping terms involve rotations. The serial dependence of the LDU inversion for Mobius and 4d even odd checkerboarding is removed by simply applying Ls^2 operations (vectorised many ways) as a dense matrix operation. This should give similar throughput with a higher (non-compulsory) flop count, while enabling use of the KNL cache friendly kernels throughout the code. Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512 with single precision.	2016-07-14 22:59:21 +01:00
Guido Cossu	9dc345e8e8	Debugged smearing and adding HMC functions for hirep	2016-07-13 17:51:18 +01:00
Guido Cossu	a9ae30f868	Added representations definitions for the HMC	2016-07-12 13:36:10 +01:00
paboyle	ef97e32152	Adding persistent communicators	2016-07-08 17:16:08 +01:00
paboyle	a0676beeb1	Open up dependency on Eigen and FFTW	2016-07-07 22:31:07 +01:00
Guido Cossu	fbf96b1bbb	Merge branch 'develop' into feature/hirep	2016-07-07 14:20:10 +01:00
Guido Cossu	ffb8b3116c	Tested smeared RHMC Wilson1p1, accepting	2016-07-07 11:49:36 +01:00
Guido Cossu	ffedeb1c58	Minor modifications	2016-07-06 11:41:27 +01:00
Guido Cossu	3e80947c2b	Cleaned up HMC output. Tested smeared HMCs for single precision (OK)	2016-07-05 12:03:54 +01:00
Guido Cossu	fdfbf11c6d	Merge branch 'develop' into temporary-smearing	2016-07-04 18:45:10 +01:00
Guido Cossu	9cb90f714e	Merge remote-tracking branch 'origin/develop' into temporary-smearing	2016-07-04 17:28:40 +01:00
Guido Cossu	2daffdf95d	Tested smeared WilsonRatio action, accepts	2016-07-04 16:17:28 +01:00
Guido Cossu	149f826601	Tested smearing for Nf2 WilsonFermionAction, non EO: accepts	2016-07-04 16:09:19 +01:00
paboyle	680645f849	Merge branch 'release/v0.5.0'	2016-06-30 15:15:03 -07:00
paboyle	712b9a3489	Asm only for avx512	2016-06-30 14:35:02 -07:00
paboyle	bdaa5b1767	Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.	2016-06-30 14:35:02 -07:00
paboyle	8fcefc021a	Improved the prefetching when using cache blocking codes	2016-06-30 14:35:02 -07:00
paboyle	05c884a62a	Prefetch change	2016-06-30 14:35:01 -07:00
paboyle	2d8bb4c594	Tweaks	2016-06-30 14:35:01 -07:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
Guido Cossu	565e9329ba	Changed the colouring classes	2016-06-30 16:51:03 +01:00
Guido Cossu	5e02392f9c	Fixed compilation error for benchmark_dwf. Some parts were assuming floating point precision	2016-06-20 12:30:51 +01:00
paboyle	87418e7df1	Slightly faster prefetching perf.	2016-06-13 02:32:52 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
Azusa Yamaguchi	d9408893b3	Prefetching in the normal kernel implementation.	2016-06-08 05:43:48 -07:00
paboyle	8ac021de73	Added a test and fixed it for red black precon Ls innermost vectorised DWF	2016-06-07 13:16:56 -07:00
paboyle	e503ef5590	Cleaned up	2016-06-07 00:11:36 +01:00
paboyle	a7682b0060	Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS	2016-06-06 23:48:21 +01:00
paboyle	53d06046b0	Compiling updates for KNL	2016-06-03 03:47:54 -07:00
paboyle	139cc5f1ae	Large change with KNL preparation	2016-06-03 03:24:26 -07:00
paboyle	5341977948	IMCI fixes. Thought I had committed these. The "real" disambiguation between std::real and Grid::real shouldn't have been necessary and I don't know why only the icpc v16.0 on babbage hits it. May need a longer term rename of Grid::real or some careful EnableIf work.	2016-04-30 03:34:16 -07:00
paboyle	1e554350ac	The threaded coms didn't agree with GCC. Surprised, and looks like a GCC bug.	2016-04-29 16:49:18 -07:00
paboyle	c79ea0dcef	Fixing IMCI	2016-04-22 21:52:54 -07:00
paboyle	ba427abde9	simd 5d	2016-04-19 15:38:39 -07:00
paboyle	9b6ab6db16	simd in 5th dimension support	2016-04-19 15:38:01 -07:00
paboyle	806a83d38b	simd in fifth dim support for dwf	2016-04-19 15:36:19 -07:00
neo	339be37dba	Debugging smeared HMC	2016-04-13 17:00:14 +09:00
paboyle	b1192a8908	Benchmark_zmm added	2016-04-06 03:00:07 -07:00
paboyle	e8dddb1596	Adding extra benchmark	2016-04-06 10:32:54 +01:00
coppolachan	97d0d56bcb	Debugging Smearing routines (set_fj)	2016-04-06 17:58:43 +09:00
coppolachan	7c7ea35ffb	Putting the Traceless Antihermitian part outside the deriv in pseudofermion actions	2016-04-05 16:28:09 +09:00
coppolachan	4b1cf580e0	Debugging the Smearing routines	2016-04-05 16:19:30 +09:00
paboyle	e67fc2be18	Adding a trial for openmp overhead minimisation	2016-03-31 16:00:37 +01:00
paboyle	8052556275	Cleaning up the single/double kernel implementation switch	2016-03-31 14:51:32 +01:00
paboyle	60d965f79e	AVX512 improvements; sigfpe trapping too	2016-03-30 08:42:34 +01:00
paboyle	1ecbf9794d	Merge branch 'master' of https://github.com/paboyle/Grid	2016-03-30 08:37:55 +01:00

336 Commits