mirror of https://github.com/paboyle/Grid.git synced 2026-04-19 10:11:02 +01:00
Commit Graph

3814 Commits

Author SHA1 Message Date
paboyle 3e947527cb Move looping over "s" and "site" into kernels for GPU optimisation 2018-06-27 21:29:43 +01:00
paboyle 31f65beac8 Move site and Ls looping into the kernels 2018-06-27 21:28:48 +01:00
paboyle 38e2a32ac9 Single SIMD lane operations for CUDA 2018-06-27 21:28:06 +01:00
paboyle efa84ca50a Keep Cuda 9.1 happy 2018-06-27 21:27:32 +01:00
paboyle 5e96d6d04c Keep CUDA happy 2018-06-27 21:27:11 +01:00
paboyle df30bdc599 CUDA happy 2018-06-27 21:26:49 +01:00
paboyle 7f45222924 Diagnostics on memory alloc fail 2018-06-27 21:26:20 +01:00
paboyle dd891f5e3b Use NVCC to suppress device Eigen 2018-06-27 21:25:17 +01:00
paboyle 6c97a6a071 Coalescing version of the kernel 2018-06-13 20:52:29 +01:00
paboyle 73bb2d5128 Ugly hack to speed up compile on GPU; we don't use the hand kernels on GPU anyway so why compile 2018-06-13 20:35:28 +01:00
paboyle b710fec6ea Gpu code first version of specialised kernel 2018-06-13 20:34:39 +01:00
paboyle b2a8cd60f5 Doubled gauge field is useful 2018-06-13 20:27:47 +01:00
paboyle 867ee364ab Explicit instantiation hooks 2018-06-13 20:27:12 +01:00
paboyle 25becc9324 GPU tweaks for benchmarking; really necessary? 2018-06-13 20:26:07 +01:00
paboyle 94d1ae4c82 Some prep work for GPU shared memory. Need to be careful, as will try GPU direct RDMA and inter-GPU memory sharing on Summit later 2018-06-13 20:24:06 +01:00
paboyle 2075b177ef CUDA_ARCH more careful treatment 2018-06-13 20:22:34 +01:00
paboyle 847c761ccc Move sfw IEEE fp16 into central location 2018-06-13 20:22:01 +01:00
paboyle 8287ed8383 New GPU vector targets 2018-06-13 20:21:35 +01:00
paboyle e6be7416f4 Use managed memory 2018-06-13 20:14:00 +01:00
paboyle 26863b6d95 User Managed memory 2018-06-13 20:13:42 +01:00
paboyle ebd730bd54 Adding 2D loops 2018-06-13 20:13:01 +01:00
paboyle 066be31a3b Optional GPU target SIMD types; work in progress and trying experiments 2018-06-13 20:07:55 +01:00
paboyle 7a4c142955 Add GPU specific simd targets 2018-06-13 19:55:30 +01:00
Peter Boyle eb7d34a4cc GPU version 2018-05-14 19:41:47 -04:00
Peter Boyle aab27a655a Start of GPU kernels 2018-05-14 19:41:17 -04:00
Peter Boyle 93280bae85 Gpu option 2018-05-14 19:40:58 -04:00
Peter Boyle c5f93abcd7 GPU clean up 2018-05-14 19:40:33 -04:00
Peter Boyle d5deef782d Useful debug comments 2018-05-14 19:39:52 -04:00
Peter Boyle 5f50473c0d Clean up 2018-05-14 19:39:11 -04:00
Peter Boyle 13f50406e3 Suppress print statement 2018-05-12 18:00:00 -04:00
Peter Boyle 09cd46d337 Lane by Lane operation 2018-05-12 17:59:35 -04:00
Peter Boyle d3f51065c2 Give command line control of blocks/threads split 2018-05-12 17:58:56 -04:00
Peter Boyle 925ac4173d Thread count control for warp scheduler thingy doodaa thing 2018-05-12 17:58:22 -04:00
Peter Boyle eb921041d0 Perf count control 2018-05-12 17:57:32 -04:00
Peter Boyle 87c5c0271b Fixing Eigen 2018-04-16 19:08:07 -04:00
Peter Boyle a3f5a13591 Better Eigen handling 2018-04-16 18:02:55 -04:00
Peter Boyle 9fe28f00eb Eigen symlink off head revision 2018-04-16 17:54:46 -04:00
Peter Boyle a8a0bb85cc Control scalar execution or vector under generic. Disable Eigen vectorisation on PowerPC / Summit 2018-04-12 12:32:57 -04:00
Peter Boyle 6411caad67 work distribution 2018-04-12 11:41:41 -04:00
Peter Boyle 7533035a99 Control Eigen vectorisation 2018-04-12 11:40:56 -04:00
Peter Boyle b15db11c60 Kernels -> pure static object to enable device execution 2018-03-24 19:35:20 -04:00
Peter Boyle f6077f9d48 Kernels -> not instantiated otherwise object ref on GPU 2018-03-24 19:33:44 -04:00
Peter Boyle 572954ef12 Kernels not an instantiated object, just static 2018-03-24 19:33:13 -04:00
Peter Boyle cedeaae7db Lebesgue -> StencilView if necessary 2018-03-24 19:32:41 -04:00
Peter Boyle e6cf0b1e17 View typedefs go to OperatorImpl 2018-03-24 19:32:11 -04:00
Peter Boyle 5412628ea6 begin end lambda 2018-03-24 19:31:45 -04:00
Peter Boyle 1f70cedbab Have to make all kernel called routines static since object reference will be a host pointer on GPU 2018-03-24 19:29:26 -04:00
Peter Boyle b50f37cfb4 Remove overlap comms flag 2018-03-24 19:28:53 -04:00
Peter Boyle cb0d2a1b03 threaded rng init; I thought this was on 2018-03-24 19:28:17 -04:00
Peter Boyle 6fe9b28a82 Cosmetic 2018-03-24 19:27:14 -04:00