mirror of https://github.com/paboyle/Grid.git synced 2025-10-25 02:04:48 +01:00
Commit Graph

4840 Commits

Author SHA1 Message Date
Peter Boyle
5622faf226 pragma once ifdef guard 2019-06-03 14:50:26 +01:00
Peter Boyle
82ecd520c7 Macos happy fix under nvcc 2019-06-03 14:48:50 +01:00
Peter Boyle
ffde81f22a Nsimd() and coalesced support 2019-05-25 12:44:07 +01:00
Peter Boyle
d8098f1ecd coalesced support 2019-05-25 12:43:31 +01:00
Peter Boyle
aca788cf4f Move coalesced read into tensors 2019-05-25 12:43:00 +01:00
Peter Boyle
a0e9f3b0a0 Plan for GPU port 2019-05-20 09:46:19 +01:00
Peter Boyle
a9342c6ae5 Update TODO after gianluc merge 2019-05-18 22:58:25 +01:00
Peter Boyle
ee6f96d85c Merge pull request #210 from grid-test-organisation/feature/gpu-port-develop
Cayley fermion functions for GPUs
2019-05-18 19:06:20 +01:00
Peter Boyle
4e9df9e93c GPU patches 2019-05-18 17:43:11 +01:00
Peter Boyle
9fe68857a9 Runs multiGPU with coalesced access on tesseract 2019-05-18 17:42:41 +01:00
Peter Boyle
37336c9e0c Allow compress to be either vector or scalar types 2019-05-18 17:41:13 +01:00
Peter Boyle
6c4da3bbc7 Stencil now runs with coalesced accesses 2019-05-18 17:40:35 +01:00
Peter Boyle
a584b16c4a Adding a non-blocking kernel launch 2019-05-18 17:39:54 +01:00
gfilaci
1a82533d22 fix inner product with thrust reduction 2019-05-14 15:35:54 +01:00
gfilaci
e3c56fd9b3 CayleyZeroCounters before benchmark loop 2019-05-13 15:52:00 +01:00
gfilaci
955cc7790f MooeeInvDag offloaded to GPU 2019-05-13 14:25:29 +01:00
gfilaci
1179123ac2 MooeeInv offloaded to GPU 2019-05-13 12:37:12 +01:00
gfilaci
22e35c9ddd M5Ddag offloaded to GPU 2019-05-10 12:23:39 +01:00
gfilaci
698b45e163 remove unused typedef 2019-05-09 11:19:39 +01:00
gfilaci
f1744b3f01 M5D offloaded to GPU 2019-05-09 11:17:55 +01:00
gfilaci
2b3c22f03d bandwidth dependent on grid default precision 2019-05-08 12:01:11 +01:00
gfilaci
8423a05940 duplicate CayleyFermion5D for gpu 2019-05-08 11:51:37 +01:00
gfilaci
d9438627d9 M5D benchmark without vector copy overhead 2019-05-02 11:10:57 +01:00
gfilaci
b23305dbe2 fix M5D flop count 2019-05-02 11:08:21 +01:00
gfilaci
d3b5c02e2d measure M5D bandwidth and fix M5D flop count 2019-05-02 11:02:39 +01:00
gfilaci
8b6541fb60 Fix gpu MultRealPart and MaddRealPart bug 2019-05-02 10:58:17 +01:00
gfilaci
6da9aa9971 replace std::vector with Vector in benchmark 2019-05-02 10:56:22 +01:00
gfilaci
44e0360b97 replace std::vector with Vector 2019-05-02 10:55:36 +01:00
gfilaci
9003c4a07c allocator copy constructor (to be fixed) 2019-05-02 10:53:37 +01:00
gfilaci
b52fa38f8c seed initialisation of RNG5 2019-05-02 10:36:09 +01:00
gfilaci
3f1c4d8789 fix comment hash 2019-05-02 10:24:36 +01:00
Peter Boyle
60330e05a3 NVCC wacky compiler options frozen. Possibly Cuda 9.2 specific 2019-04-28 07:39:33 +01:00
Peter Boyle
f9b8c0cccf Vector changes for UVM 2019-04-28 07:38:57 +01:00
Peter Boyle
3cad67e569 Compile on tesseract 2019-04-28 07:38:09 +01:00
Peter Boyle
170ba4e619 Ensure different MPI ranks use different GPUs. The mapping works on Tesseract. 2019-04-28 07:32:30 +01:00
Peter Boyle
204a090497 Inner product is not working on GPU. Why? 2019-04-28 07:31:56 +01:00
Peter Boyle
3c717c47ef GPU no compile on Wilson Multigrid fixed 2019-04-28 07:31:19 +01:00
Peter Boyle
c5e081d69c Re-Merge branch 'develop' into feature/gpu-port
Pull in Regensburg MultiGrid pull request
2019-01-03 01:50:16 +00:00
Peter Boyle
535a6aaf05 Update todo list 2019-01-02 22:07:51 +00:00
Peter Boyle
91a7fe247b Merge branch 'DanielRichtmann-feature/wilsonmg' into develop 2019-01-02 14:40:31 +00:00
Peter Boyle
8a1be021d3 Merge branch 'feature/wilsonmg' of https://github.com/DanielRichtmann/Grid into DanielRichtmann-feature/wilsonmg 2019-01-02 14:39:59 +00:00
Peter Boyle
e73b909a48 Make tests running past nvcc. Different NVCC versions proving tricky to keep happy. This is 9.2 2019-01-02 12:05:30 +00:00
Peter Boyle
a4d9200293 Fixing AVX 512 instantiation error. Need to move to extern templates urgently. 2019-01-02 00:27:07 +00:00
Peter Boyle
350508bdb3 pugixml problem 2019-01-01 16:38:54 +00:00
Peter Boyle
38852737e4 No compile fix on clang 2019-01-01 15:55:13 +00:00
Peter Boyle
802404c78c Remove warnings under NVCC and move parallel_for to thread-loop 2019-01-01 15:08:09 +00:00
Peter Boyle
0e9b591c1c NVCC warning suppression 2019-01-01 15:07:47 +00:00
Peter Boyle
c43a2b599a GPU support 2019-01-01 15:07:29 +00:00
Peter Boyle
8c91e82ee8 GPU clean up, remove parallel_for. Split into accelerator_loop, thread_loop
cases, and collides with parallel_for in thrust
2019-01-01 15:06:46 +00:00
Peter Boyle
9d866d062a GPU support improvements 2019-01-01 15:05:03 +00:00