portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 17:55:38 +00:00

Author	SHA1	Message	Date
paboyle	44188a5c6f	AVX512 fix	2018-03-05 00:32:24 +00:00
paboyle	3277bda130	View introduction to prepare for accelerator offload. Probably same problem exists for stencil object	2018-03-04 16:38:08 +00:00
paboyle	70e276e1ab	parallel_for elimination -> thread_loop	2018-01-28 01:01:14 +00:00
paboyle	2d0bcc2606	Zero changes, accelerator on kernels and some thread loop changes	2018-01-27 23:47:38 +00:00
paboyle	c4f82e072b	_grid becomes private; use Grid()	2018-01-27 00:04:12 +00:00
paboyle	85771e97e9	Hide internal data	2018-01-26 23:04:46 +00:00
paboyle	72acb0e48f	Namespace, indent	2018-01-14 23:41:59 +00:00
paboyle	fc4ab9ccd5	Working half precision comms	2017-04-20 11:20:26 +01:00
paboyle	9fd23faadf	Pretty layout	2017-03-30 13:44:45 +09:00
paboyle	4e7ab3166f	Refactoring header layout	2017-02-22 18:09:33 +00:00
paboyle	3ae92fa2e6	Global changes to parallel_for structure. Move the comms flags to more sensible names	2017-02-21 05:24:27 -05:00
paboyle	3e6945cd65	Fixing AVX Z-mobius	2016-12-18 02:05:11 +00:00
paboyle	87be03006a	AVX 512 code broke other compiles; fixing	2016-12-18 01:45:09 +00:00
Peter Boyle	fa6acccf55	Zmobius asm	2016-12-18 00:56:19 +00:00
Peter Boyle	fe187e9ed3	Compiles and passes under ZMobius with assembler	2016-12-10 00:47:48 +00:00
Peter Boyle	0091b50f49	Zmobius working -- not asm yet	2016-12-09 22:51:32 +00:00
Peter Boyle	fb8d4b2357	Lots of debug on performance Mobius	2016-12-08 17:28:28 +00:00
Peter Boyle	e27c6b217c	Updating	2016-12-01 12:42:53 +00:00
paboyle	6adf35da54	Faster Mobius	2016-12-01 11:39:04 +00:00
paboyle	bd0430b34f	Serialisation in malloc fixed	2016-11-29 22:27:55 +00:00
paboyle	90e70790f3	Feature for z-Mobius prep	2016-08-15 22:31:29 +01:00
paboyle	980ff18956	Solving the instantiation no compile issue	2016-07-15 17:19:44 +01:00
paboyle	adbc7c1188	Adding files for multiple implementations (cache opt) and Ls vectorisation of the 5D Cayley form chiral fermions for the 5d matrix. With Ls entirely in the vector direction, s-hopping terms involve rotations. The serial dependence of the LDU inversion for Mobius and 4d even-odd checkerboarding is removed by simply applying Ls^2 operations (vectorised many ways) as a dense matrix operation. This should give similar throughput at a higher flop count (the extra flops are non-compulsory), while enabling use of the KNL cache-friendly kernels throughout the code. Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512 with single precision.	2016-07-14 22:59:21 +01:00

23 Commits