portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-05-12 13:14:31 +01:00

Author	SHA1	Message	Date
paboyle	fb45eb2eb2	5d ls vec rename of impl class	2016-07-14 23:57:26 +01:00
paboyle	a307274c96	Fermion impl rename for ls vectorised 5d approaches	2016-07-14 23:56:13 +01:00
paboyle	3f2c44a5fe	Updating the class to 5d selection based on impl type	2016-07-14 23:55:26 +01:00
paboyle	48fb1cdc11	Update domain 5d vectorised impl type, move the type over to 4d redblack with the dense OO inverse	2016-07-14 23:54:35 +01:00
paboyle	8a79e93cc2	Rename the 5d domain wall fermion vectorised Ls impl class	2016-07-14 23:53:00 +01:00
paboyle	adbc7c1188	Adding files for multiple implementations (cache opt) and Ls vectorisation of the 5D Cayley form chiral fermions for the 5d matrix. With Ls entirely in the vector direction, s-hopping terms involve rotations. The serial dependence of the LDU inversion for Mobius and 4d even odd checkerboarding is removed by simply applying Ls^2 operations (vectorised many ways) as a dense matrix operation. This should give similar throughput at a higher flop count (non-compulsory flops), while enabling use of the KNL cache friendly kernels throughout the code. Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512 with single precision.	2016-07-14 22:59:21 +01:00
paboyle	ef97e32152	Adding persistent communicators	2016-07-08 17:16:08 +01:00
paboyle	a0676beeb1	Open up dependency on Eigen and FFTW	2016-07-07 22:31:07 +01:00
paboyle	680645f849	Merge branch 'release/v0.5.0'	2016-06-30 15:15:03 -07:00
paboyle	712b9a3489	Asm only for avx512	2016-06-30 14:35:02 -07:00
paboyle	bdaa5b1767	Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.	2016-06-30 14:35:02 -07:00
paboyle	8fcefc021a	Improved the prefetching when using cache blocking codes	2016-06-30 14:35:02 -07:00
paboyle	05c884a62a	Prefetch change	2016-06-30 14:35:01 -07:00
paboyle	2d8bb4c594	Tweaks	2016-06-30 14:35:01 -07:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
Guido Cossu	5e02392f9c	Fixed compilation error for benchmark_dwf Some parts were assuming floating point precision	2016-06-20 12:30:51 +01:00
paboyle	87418e7df1	Slightly faster prefetching perf.	2016-06-13 02:32:52 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
Azusa Yamaguchi	d9408893b3	Prefetching in the normal kernel implementation.	2016-06-08 05:43:48 -07:00
paboyle	8ac021de73	Added a test and fixed it for red black precon Ls innermost vectorised DWF	2016-06-07 13:16:56 -07:00
paboyle	e503ef5590	Cleaned up	2016-06-07 00:11:36 +01:00
paboyle	a7682b0060	Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS	2016-06-06 23:48:21 +01:00
paboyle	53d06046b0	Compiling updates for KNL	2016-06-03 03:47:54 -07:00
paboyle	139cc5f1ae	Large change with KNL preparation	2016-06-03 03:24:26 -07:00
portelli	c698b16d75	function to generate Chroma-style gamma matrix products	2016-05-01 18:30:35 -07:00
paboyle	5341977948	IMCI fixes. Thought I had committed these. The "real" disambiguation between std::real and Grid::real shouldn't have been necessary and I don't know why only the icpc v16.0 on babbage hits it. May need a longer term rename of Grid::real or some careful EnableIf work.	2016-04-30 03:34:16 -07:00
portelli	f6c53e5039	Merge commit '1e554350acae0e67fa7177ed0db9d4f684a54af2'	2016-04-30 00:17:52 -07:00
portelli	6aa000176f	Fermion <-> Propagator functions	2016-04-30 00:14:33 -07:00
paboyle	1e554350ac	The threaded coms didn't agree with GCC. Surprised, and looks like a GCC bug.	2016-04-29 16:49:18 -07:00
paboyle	c79ea0dcef	Fixing IMCI	2016-04-22 21:52:54 -07:00
paboyle	8fd8bc25e9	simd 5th dim with rotation	2016-04-19 15:39:00 -07:00
paboyle	ba427abde9	simd 5d	2016-04-19 15:38:39 -07:00
paboyle	9b6ab6db16	simd in 5th dimension support	2016-04-19 15:38:01 -07:00
paboyle	806a83d38b	simd in fifth dim support for dwf	2016-04-19 15:36:19 -07:00
paboyle	b1192a8908	Benchmark_zmm added	2016-04-06 03:00:07 -07:00
paboyle	e8dddb1596	Adding extra benchmark	2016-04-06 10:32:54 +01:00
paboyle	e67fc2be18	Adding a trial for openmp overhead minimisation	2016-03-31 16:00:37 +01:00
paboyle	8052556275	Cleaning up the single/double kernel implementation switch	2016-03-31 14:51:32 +01:00
paboyle	60d965f79e	AVX512 improvements; sigfpe trapping too	2016-03-30 08:42:34 +01:00
paboyle	1ecbf9794d	Merge branch 'master' of https://github.com/paboyle/Grid	2016-03-30 08:37:55 +01:00
paboyle	c77b7ee897	AddSub based alternate SU3 routine	2016-03-28 17:55:22 -06:00
paboyle	1e355a51e1	Interface change	2016-03-27 23:46:55 -07:00
paboyle	21abaf7e91	Gamma sign change	2016-03-28 00:35:45 -06:00
paboyle	165bffc2e7	Avx512 changes for assembler kernels	2016-03-26 22:25:45 -06:00
paboyle	644fd6d32e	Build avx512 clean	2016-03-25 09:35:33 -07:00
paboyle	60d4564151	ICC no compile fix	2016-03-16 02:30:40 -07:00
paboyle	090e7aa930	Merge remote-tracking branch 'origin/chulwoo-dec12-2015' Merge Chulwoo's Lanczos related improvements. Merge Nd!=4 fixes for pure gauge HMC from Evan.	2016-03-08 09:55:14 +00:00
paboyle	325e745daa	Merge branch 'master' of https://github.com/paboyle/Grid	2016-03-02 07:04:03 -08:00
paboyle	61413565d0	Back off the inlined spin proj as not working	2016-03-02 07:03:09 -08:00
Antonin Portelli	497e7e4c53	BG/Q compatibility fix	2016-02-23 15:57:38 +00:00

444 Commits