portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2025-12-10 16:14:39 +00:00

Author	SHA1	Message	Date
Peter Boyle	19b527e83f	Better extract merge for GPU. Let the SIMD header files define the pointer type for access. GPU redirects through builtin float2, double2 for complex	2018-07-05 07:05:13 -04:00
Peter Boyle	4730d4692a	Fast lane extract, saturates bandwidth on Volta for SU3 benchmarks	2018-07-05 07:03:33 -04:00
Peter Boyle	1bb456c0c5	Minor GPU vector width changeÂ	2018-07-05 07:02:04 -04:00
paboyle	3a50afe7e7	GPU dslash updates	2018-06-27 22:32:21 +01:00
paboyle	f8e880b445	Loop for s and xyzt offlow	2018-06-27 21:49:57 +01:00
paboyle	3e947527cb	Move looping over "s" and "site" into kernels for GPU optimisatoin	2018-06-27 21:29:43 +01:00
paboyle	31f65beac8	Move site and Ls looping into the kernels	2018-06-27 21:28:48 +01:00
paboyle	38e2a32ac9	Single SIMD lane operations for CUDA	2018-06-27 21:28:06 +01:00
paboyle	efa84ca50a	Keep Cuda 9.1 happy	2018-06-27 21:27:32 +01:00
paboyle	5e96d6d04c	Keep CUDA happy	2018-06-27 21:27:11 +01:00
paboyle	df30bdc599	CUDA happy	2018-06-27 21:26:49 +01:00
paboyle	7f45222924	Diagnostics on memory alloc fail	2018-06-27 21:26:20 +01:00
paboyle	dd891f5e3b	Use NVCC to suppress device Eigen	2018-06-27 21:25:17 +01:00
paboyle	6c97a6a071	Coalescing version of the kernel	2018-06-13 20:52:29 +01:00
paboyle	73bb2d5128	Ugly hack to speed up compile on GPU; we don't use the hand kernels on GPU anyway so why compile	2018-06-13 20:35:28 +01:00
paboyle	b710fec6ea	Gpu code first version of specialised kernel	2018-06-13 20:34:39 +01:00
paboyle	b2a8cd60f5	Doubled gauge field is useful	2018-06-13 20:27:47 +01:00
paboyle	867ee364ab	Explicit instantiation hooks	2018-06-13 20:27:12 +01:00
paboyle	94d1ae4c82	Some prep work for GPU shared memory. Need to be careful, as will try GPU direct RDMA and inter-GPU memory sharing on SUmmit later	2018-06-13 20:24:06 +01:00
paboyle	2075b177ef	CUDA_ARCH more carefule treatment	2018-06-13 20:22:34 +01:00
paboyle	847c761ccc	Move sfw IEEE fp16 into central location	2018-06-13 20:22:01 +01:00
paboyle	8287ed8383	New GPU vector targets	2018-06-13 20:21:35 +01:00
paboyle	e6be7416f4	Use managed memory	2018-06-13 20:14:00 +01:00
paboyle	26863b6d95	User Managed memory	2018-06-13 20:13:42 +01:00
paboyle	ebd730bd54	Adding 2D loops	2018-06-13 20:13:01 +01:00
paboyle	066be31a3b	Optional GPU target SIMD types; work in progress and trying experiments	2018-06-13 20:07:55 +01:00
Peter Boyle	eb7d34a4cc	GPU version	2018-05-14 19:41:47 -04:00
Peter Boyle	aab27a655a	Start of GPU kernels	2018-05-14 19:41:17 -04:00
Peter Boyle	93280bae85	Gpu option	2018-05-14 19:40:58 -04:00
Peter Boyle	c5f93abcd7	GPU clean up	2018-05-14 19:40:33 -04:00
Peter Boyle	d5deef782d	Useful debug comments	2018-05-14 19:39:52 -04:00
Peter Boyle	5f50473c0d	Clean up	2018-05-14 19:39:11 -04:00
Peter Boyle	13f50406e3	Suppress print statement	2018-05-12 18:00:00 -04:00
Peter Boyle	09cd46d337	Lane by Lane operation	2018-05-12 17:59:35 -04:00
Peter Boyle	d3f51065c2	Give command line control of blocks/threads split	2018-05-12 17:58:56 -04:00
Peter Boyle	925ac4173d	Thread count control for warp scheduler thingy doodaa thing	2018-05-12 17:58:22 -04:00
Peter Boyle	a8a0bb85cc	Control scalar execution or vector under generic. Disable Eigen vectorisation on powerpc / SUmmit	2018-04-12 12:32:57 -04:00
Peter Boyle	6411caad67	work distribution	2018-04-12 11:41:41 -04:00
Peter Boyle	7533035a99	Control Eigen vectorisatoin	2018-04-12 11:40:56 -04:00
Peter Boyle	b15db11c60	Kernels -> pure static object to enable device execution	2018-03-24 19:35:20 -04:00
Peter Boyle	f6077f9d48	Kernels -> not instantiaed otherwise object ref on GPU	2018-03-24 19:33:44 -04:00
Peter Boyle	572954ef12	Kernels not an instantiated object, just static	2018-03-24 19:33:13 -04:00
Peter Boyle	cedeaae7db	Lebesge -> StencilView if necessary	2018-03-24 19:32:41 -04:00
Peter Boyle	e6cf0b1e17	View typedefs go to OperatorImpl	2018-03-24 19:32:11 -04:00
Peter Boyle	5412628ea6	begin end lamda	2018-03-24 19:31:45 -04:00
Peter Boyle	1f70cedbab	Have to make all kernel called routines static since object reference will be a host pointer on GPU	2018-03-24 19:29:26 -04:00
Peter Boyle	b50f37cfb4	Remove overlap comms flag	2018-03-24 19:28:53 -04:00
Peter Boyle	cb0d2a1b03	threaded rng init; I thought this was on	2018-03-24 19:28:17 -04:00
Peter Boyle	4e1272fabf	Kernels need to be static to work on GPU. No reference to host resident data	2018-03-22 18:44:53 -04:00
Peter Boyle	607dc2d3c6	Remove lebesgue order	2018-03-22 18:23:09 -04:00

1 2 3 4 5 ...

2566 Commits