mirror of https://github.com/paboyle/Grid.git synced 2024-11-10 07:55:35 +00:00
Commit Graph

4855 Commits

Author SHA1 Message Date
Peter Boyle
7e27a5213a Tests builds clean. 2019-06-04 20:45:20 +01:00
Peter Boyle
ade4a126da Getting closer on the GPU port, but will start deleting 5th dim vectorised variants for code maintainability 2019-06-04 11:53:44 +01:00
Peter Boyle
7b59ab5bd7 Compiling after reorganisation 2019-06-03 15:46:26 +01:00
Peter Boyle
fcd8cfe257 Gparity in 2019-06-03 15:45:09 +01:00
Peter Boyle
b4b53812cb Move implementation to specific implementation headers 2019-06-03 15:43:01 +01:00
Peter Boyle
085cac583f Implementation in header 2019-06-03 15:42:36 +01:00
Peter Boyle
25e3b8640c Move to header 2019-06-03 15:42:05 +01:00
Peter Boyle
44bbec50b0 Making GPU compile happy 2019-06-03 14:57:04 +01:00
Peter Boyle
ec68b67d5d Attempt at unified GPU and CPU kernel 2019-06-03 14:55:51 +01:00
Peter Boyle
778450e0c8 Move to implementation subdir 2019-06-03 14:53:56 +01:00
Peter Boyle
567aa5f366 Move to implementation subdir 2019-06-03 14:53:33 +01:00
Peter Boyle
2ab7e2b175 Force instantiation in .cc files. Eventually move into multiple files 2019-06-03 14:52:59 +01:00
Peter Boyle
6f61be044d Don't instantiate in header 2019-06-03 14:52:01 +01:00
Peter Boyle
269e00509e Don't instantiate in header 2019-06-03 14:51:24 +01:00
Peter Boyle
a5e90b0ddc Making the kernels more GPU happy 2019-06-03 14:50:54 +01:00
Peter Boyle
5622faf226 pragma once ifdef guard 2019-06-03 14:50:26 +01:00
Peter Boyle
82ecd520c7 Macos happy fix under nvcc 2019-06-03 14:48:50 +01:00
Peter Boyle
ffde81f22a Nsimd() and coalesced support 2019-05-25 12:44:07 +01:00
Peter Boyle
d8098f1ecd coalesced support 2019-05-25 12:43:31 +01:00
Peter Boyle
aca788cf4f Move coalesced read into tensors 2019-05-25 12:43:00 +01:00
Peter Boyle
a0e9f3b0a0 Plan for GPU port 2019-05-20 09:46:19 +01:00
Peter Boyle
a9342c6ae5 Update TODO after gianluc merge 2019-05-18 22:58:25 +01:00
Peter Boyle
ee6f96d85c Merge pull request #210 from grid-test-organisation/feature/gpu-port-develop: Cayley fermion functions for GPUs 2019-05-18 19:06:20 +01:00
Peter Boyle
4e9df9e93c GPU patches 2019-05-18 17:43:11 +01:00
Peter Boyle
9fe68857a9 Runs multiGPU with coalesced access on tesseract 2019-05-18 17:42:41 +01:00
Peter Boyle
37336c9e0c Allow compress to be either vector or scalar types 2019-05-18 17:41:13 +01:00
Peter Boyle
6c4da3bbc7 Stencil now runs with coalesced accesses 2019-05-18 17:40:35 +01:00
Peter Boyle
a584b16c4a Adding a non-blocking kernel launch 2019-05-18 17:39:54 +01:00
gfilaci
1a82533d22 fix inner product with thrust reduction 2019-05-14 15:35:54 +01:00
gfilaci
e3c56fd9b3 CayleyZeroCounters before benchmark loop 2019-05-13 15:52:00 +01:00
gfilaci
955cc7790f MooeeInvDag offloaded to GPU 2019-05-13 14:25:29 +01:00
gfilaci
1179123ac2 MooeeInv offloaded to GPU 2019-05-13 12:37:12 +01:00
gfilaci
22e35c9ddd M5Ddag offloaded to GPU 2019-05-10 12:23:39 +01:00
gfilaci
698b45e163 remove unused typedef 2019-05-09 11:19:39 +01:00
gfilaci
f1744b3f01 M5D offloaded to GPU 2019-05-09 11:17:55 +01:00
gfilaci
2b3c22f03d bandwidth dependent on grid default precision 2019-05-08 12:01:11 +01:00
gfilaci
8423a05940 duplicate CayleyFermion5D for gpu 2019-05-08 11:51:37 +01:00
gfilaci
d9438627d9 M5D benchmark without vector copy overhead 2019-05-02 11:10:57 +01:00
gfilaci
b23305dbe2 fix M5D flop count 2019-05-02 11:08:21 +01:00
gfilaci
d3b5c02e2d measure M5D bandwidth and fix M5D flop count 2019-05-02 11:02:39 +01:00
gfilaci
8b6541fb60 Fix gpu MultRealPart and MaddRealPart bug 2019-05-02 10:58:17 +01:00
gfilaci
6da9aa9971 replace std::vector with Vector in benchmark 2019-05-02 10:56:22 +01:00
gfilaci
44e0360b97 replace std::vector with Vector 2019-05-02 10:55:36 +01:00
gfilaci
9003c4a07c allocator copy constructor (to be fixed) 2019-05-02 10:53:37 +01:00
gfilaci
b52fa38f8c seed initialisation of RNG5 2019-05-02 10:36:09 +01:00
gfilaci
3f1c4d8789 fix comment hash 2019-05-02 10:24:36 +01:00
Peter Boyle
60330e05a3 NVCC wacky compiler options frozen. Possibly Cuda 9.2 specific 2019-04-28 07:39:33 +01:00
Peter Boyle
f9b8c0cccf Vector changes for UVM 2019-04-28 07:38:57 +01:00
Peter Boyle
3cad67e569 Compile on tesseract 2019-04-28 07:38:09 +01:00
Peter Boyle
170ba4e619 Ensure different MPI ranks use different GPUs. The mapping works on Tesseract. 2019-04-28 07:32:30 +01:00