portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-05-11 12:44:31 +01:00

Author	SHA1	Message	Date
paboyle	a307274c96	Fermion impl rename for ls vectorised 5d approaches	2016-07-14 23:56:13 +01:00
paboyle	3f2c44a5fe	Updating the class to 5d selection based on impl type	2016-07-14 23:55:26 +01:00
paboyle	48fb1cdc11	Update domain 5d vectorised impl type, move the type over to 4d redblack with the dense OO inverse	2016-07-14 23:54:35 +01:00
paboyle	8a79e93cc2	Rename the 5d domain wall fermion vectorised Ls impl class	2016-07-14 23:53:00 +01:00
paboyle	dd62a61c5c	Added broadcast and rotation of simd vectors	2016-07-14 23:49:00 +01:00
paboyle	8f47d0b5ab	Rotation needed for hopping term in fifth dim with Ls vectorised fields	2016-07-14 23:45:36 +01:00
paboyle	42af132dab	Fix for Chris Kelly's request to peek/poke on checkerboarded fields	2016-07-14 23:44:48 +01:00
paboyle	adbc7c1188	Adding files for multiple implementations (cache opt) and Ls vectorisation of the 5D Cayley form chiral fermions for the 5d matrix. With Ls entirely in the vector direction, s-hopping terms involve rotations. The serial dependence of the LDU inversion for Mobius and 4d even odd checkerboarding is removed by simply applying Ls^2 operations (vectorised many ways) as a dense matrix operation. This should give similar throughput with higher (non-compulsory) flops, but enables use of the KNL cache friendly kernels throughout the code. Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512 with single precision.	2016-07-14 22:59:21 +01:00
paboyle	62601bb649	Bug fix	2016-07-08 20:46:29 +01:00
paboyle	ef97e32152	Adding persistent communicators	2016-07-08 17:16:08 +01:00
paboyle	a0676beeb1	Open up dependency on Eigen and FFTW	2016-07-07 22:31:07 +01:00
paboyle	fc4a043663	Colors and banner clean up	2016-07-02 16:15:38 +01:00
paboyle	680645f849	Merge branch 'release/v0.5.0'	2016-06-30 15:15:03 -07:00
paboyle	712b9a3489	Asm only for avx512	2016-06-30 14:35:02 -07:00
paboyle	bdaa5b1767	Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.	2016-06-30 14:35:02 -07:00
paboyle	8fcefc021a	Improved the prefetching when using cache blocking codes	2016-06-30 14:35:02 -07:00
paboyle	1445189361	Control the prefetch strategy	2016-06-30 14:35:02 -07:00
paboyle	05c884a62a	Prefetch change	2016-06-30 14:35:01 -07:00
paboyle	a25bec87d9	Prefetch during save	2016-06-30 14:35:01 -07:00
paboyle	2d8bb4c594	Tweaks	2016-06-30 14:35:01 -07:00
paboyle	51cb2d4328	update file lists	2016-06-30 14:35:01 -07:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
Guido Cossu	5e02392f9c	Fixed compilation error for benchmark_dwf. Some parts were assuming floating point precision	2016-06-20 12:30:51 +01:00
Richard Rollins	86187d7cca	Removed write to stdout in constructor for MPI CartesianCommunicator	2016-06-14 15:34:20 +01:00
paboyle	87418e7df1	Slightly faster prefetching perf.	2016-06-13 02:32:52 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
Azusa Yamaguchi	d9408893b3	Prefetching in the normal kernel implementation.	2016-06-08 05:43:48 -07:00
paboyle	8ac021de73	Added a test and fixed it for red black precon Ls innermost vectorised DWF	2016-06-07 13:16:56 -07:00
paboyle	e503ef5590	Cleaned up	2016-06-07 00:11:36 +01:00
paboyle	a7682b0060	Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS	2016-06-06 23:48:21 +01:00
paboyle	d4c9d71fc8	Merge branch 'master' of https://github.com/paboyle/Grid	2016-06-06 07:06:54 -07:00
paboyle	786ca52c43	Problems remain in the red black preconditioning of the Ls vectorisation	2016-06-06 07:05:51 -07:00
Peter Boyle	f78d89bcbe	Update Lebesgue.cc kill verbose	2016-06-03 13:33:42 +01:00
paboyle	53d06046b0	Compiling updates for KNL	2016-06-03 03:47:54 -07:00
paboyle	139cc5f1ae	Large change with KNL preparation	2016-06-03 03:24:26 -07:00
portelli	1c0e922585	Merge pull request #35 from aportelli/master empty SIMD fix	2016-05-27 16:49:13 +01:00
portelli	9d5f693cbe	empty SIMD fix	2016-05-24 10:56:27 +01:00
Peter Boyle	5c90c3b457	Merge pull request #34 from aportelli/master Polymorphic lattices & various small updates	2016-05-24 10:50:04 +01:00
portelli	91e04056f9	fix of the empty SIMD	2016-05-12 19:24:10 +01:00
portelli	3789e3f31c	additional fixes in slice functions	2016-05-12 18:35:38 +01:00
portelli	0c66719210	const fix in slice functions	2016-05-12 13:01:35 +01:00
paboyle	3a5b5c8bec	Save an old tar of tree	2016-05-12 03:20:17 -07:00
portelli	4bc21ec7cb	thread CL argument fix	2016-05-11 15:21:29 +01:00
portelli	e3083b6dfc	Merge commit 'ab894186589224d570e0ecef8eea06443194a8ab'	2016-05-11 15:20:41 +01:00
paboyle	ab89418658	Precision change going in; useful for mixed precision algorithms for example.	2016-05-11 15:18:47 +01:00
paboyle	28cd99882c	Subslicing	2016-05-11 15:06:54 +01:00
paboyle	aceaee774c	ExtractSlice / InsertSlice for lower dimensional lattices where the lattice is not distributed in the orthogonal direction. Useful for fermion 4d/5d etc..	2016-05-11 14:12:02 +01:00
portelli	101aa769eb	LatticeBase contains the grid pointer and a virtual destructor to allow polymorphic lattice pointers	2016-05-04 12:15:31 -07:00
portelli	0bf99bfde5	log polish	2016-05-04 12:14:49 -07:00
portelli	64bf6fe54e	macro to dump NERSC header to a stream	2016-05-04 12:14:38 -07:00

1201 Commits