mirror of https://github.com/paboyle/Grid.git synced 2025-06-18 15:57:05 +01:00
Commit Graph

268 Commits

SHA1 Message Date
a7635fd5ba summit mem 2020-05-18 17:52:26 -04:00
ebb60330c9 Automatic data motion options beginning 2020-05-17 16:34:25 -04:00
32fbdf4fb1 Merge pull request #5 from paboyle/develop
Sync upstream
2020-05-13 09:02:56 +02:00
0e3c49f687 TransposeIndex was broken by Christoph 2020-05-12 17:57:01 -04:00
162e4bb567 no automatic prefetching for now 2020-05-12 07:01:23 -04:00
bbbee5660d First compile on HiP 2020-05-10 05:28:09 -04:00
c83471bfd0 Fix missing checkerboards for adj and conjugate 2020-05-08 16:44:03 +02:00
f8b8e00090 Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
Aim to reduce the amount of cuda and other code variations floating around all over the place.

Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
3c6ffcb48c Merge branch 'develop' into feature/gpt 2020-05-06 15:03:35 +02:00
87984ece7d add Lattice_basis.h 2020-05-06 08:47:18 -04:00
e9b295f967 Synchronize blocking infrastructure with GPT 2020-05-06 08:42:28 -04:00
28a1fcaaff First compile against SYCL 2020-05-05 11:13:27 -07:00
6b64727161 disable comments 2020-05-05 05:05:36 -04:00
04863f8f38 debug new AcceleratorView 2020-05-04 16:07:03 -04:00
2a1387e992 rankInnerProduct 2020-05-03 17:27:11 -04:00
9bfa51bffb cleanup comment 2020-05-03 09:12:52 -04:00
38532753f4 interface cleanup 2020-05-03 08:58:32 -04:00
949be9605c fix pragmas 2020-05-02 16:20:03 -04:00
63cf201ee7 Add AdviseInfrequentUse 2020-05-02 11:38:42 -04:00
ddb192bac7 re-work double precision promotion for summit 2020-04-30 16:09:57 -04:00
5011753f4f Clean up warning 2020-04-30 10:23:48 -04:00
091d5c605e towards more precise blocking 2020-04-17 04:25:28 -04:00
327da332bb Merge branch 'develop' of https://github.com/paboyle/Grid into feature/gpt 2020-04-16 11:30:17 -04:00
6cdb09c884 Faster copy region 2020-04-10 11:10:52 -04:00
a65bc64f10 Accelerator peek poke 2020-04-10 11:09:59 -04:00
5fc8a273e7 Fused innerProduct + norm2 on first argument operation 2020-04-06 11:52:29 +02:00
c9b737a4e7 make trace,adj,transpose unary operators 2020-03-16 17:58:30 -04:00
68b45f6444 Lower left/upper right region cut paste 2020-02-06 15:50:26 -05:00
ef9b3e658a extra typedef 2020-02-06 15:47:14 -05:00
1bd87c35d7 Read coalescing on Nvidia 2020-01-27 12:29:56 -05:00
48008e4d8b Thread coordinate creation loop 2020-01-27 12:28:16 -05:00
9aafd20468 Simple block project promote runs faster on GPU 2019-12-17 05:01:39 -05:00
9e15474999 Accelerator loop attempt at speed up 2019-12-14 05:28:16 -05:00
152b525a4d Typo fix 2019-12-13 22:44:42 -05:00
d18994eddc offload more of mgrid to GPU 2019-12-13 22:08:11 -05:00
845a045493 Merge pull request #233 from giltirn/lanczos_fix
A few run /compile / memory leak fixes
2019-10-30 10:21:59 -04:00
5de9547db5 Removing old debug code 2019-10-08 15:51:28 +01:00
114ebb7914 Fixed Lanczos calling aligned alloc in threaded region hitting up against pointer-cache no-threading restrictions
Fixed Lattice::reset not compiling with new Grid explicit memory region handling
Fixed memory leak in Lattice::resize that occurs when data region has been previously allocated
2019-08-26 16:47:44 -04:00
be37dfb6f8 Remove debug code 2019-08-15 01:31:40 +01:00
3e49dc8a67 Reduction finished and hopefully fixes CI regression fail on single precision and force 2019-08-14 15:18:34 +01:00
ce97638bac Think the reduction is now sorted and cleaned up 2019-08-11 11:09:01 +01:00
9117f61109 GPU friendly 2019-07-31 01:22:54 +01:00
9dad7a0094 Reproducible reduction and axpy_norm offload from Gianluca.
Hopefully get CG running entirely on GPU
2019-07-30 00:14:12 +01:00
775eaee199 Fix for suspected Intel 2018.1 compiler bug under O3 2019-07-19 07:57:34 +01:00
d976e5c514 Pow is being awkward in thrust for reasons I don't understand. Possible thrust bug. 2019-06-16 12:05:11 +01:00
20359ca15f Coalesced loops. 2019-06-15 08:03:57 +01:00
736358b0cb Coalesced loops 2019-06-15 08:03:13 +01:00
6b692aa726 Thread loops 2019-06-15 08:02:26 +01:00
7f99e1cd3b Coalesced loops 2019-06-15 08:01:39 +01:00
f3c89df948 Thread loop changes 2019-06-15 08:00:37 +01:00