portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 09:45:36 +00:00

Author	SHA1	Message	Date
Peter Boyle	4548523ecc	This modification eliminates what looks like a compiler bug on Intel 2017.	2018-03-08 04:41:16 -08:00
paboyle	3277bda130	View introduction to prepare for accelerator offload. Probably same problem exists for stencil object	2018-03-04 16:38:08 +00:00
paboyle	70e276e1ab	parallel_for elimination -> thread_loop	2018-01-28 01:01:14 +00:00
paboyle	c4f82e072b	_grid becomes private ; use Grid()	2018-01-27 00:04:12 +00:00
paboyle	85771e97e9	Hide internal data	2018-01-26 23:04:46 +00:00
paboyle	19234fb40e	Namespace, format	2018-01-14 23:44:16 +00:00
paboyle	fc4ab9ccd5	Working half precision comms	2017-04-20 11:20:26 +01:00
paboyle	b9e8ea3aaa	conjugate coefficient on the dagger	2017-03-30 13:43:13 +09:00
paboyle	4e7ab3166f	Refactoring header layout	2017-02-22 18:09:33 +00:00
paboyle	3ae92fa2e6	Global changes to parallel_for structure. Move the comms flags to more sensible names	2017-02-21 05:24:27 -05:00
paboyle	bd0430b34f	Serialisation in malloc fixed	2016-11-29 22:27:55 +00:00
paboyle	90e70790f3	Feature for z-Mobius prep	2016-08-15 22:31:29 +01:00
paboyle	980ff18956	Solving the instantiation no compile issue	2016-07-15 17:19:44 +01:00
paboyle	adbc7c1188	Adding files for multiple implementations (cache opt) and Ls vectorisation of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely in the vector direction, s-hopping terms involve rotations. The serial dependence of the LDU inversion for Mobius and 4d even odd checkerboarding is removed by simply applying Ls^2 operations (vectorised many ways) as a dense matrix operation. This should give similar throughput but high flops (non-compulsory flops) but enable use of the KNL cache friendly kernels throughout the code. Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512 with single precision.	2016-07-14 22:59:21 +01:00

14 Commits