Peter Boyle
84c19587e7
Offload
2020-06-10 19:59:31 -04:00
Peter Boyle
a7ffc61e82
acceleratorSIMTlane()
2020-06-10 19:58:33 -04:00
Peter Boyle
cdf0a04fc5
Merge branch 'develop' into sycl
2020-06-09 04:00:12 -04:00
Peter Boyle
1a4c8c3387
Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.
2020-06-05 18:52:35 -04:00
Peter Boyle
f67830587f
Accelerator loop use
2020-06-03 22:50:09 -04:00
Peter Boyle
8cfd5d2639
Need lattice view
2020-06-03 09:11:28 -04:00
Peter Boyle
7bc0166c1c
Making SYCL happy - must avoid non-trivial copy constructors
2020-05-25 08:34:19 -07:00
Peter Boyle
cb0d1b3399
hopefully fix build fail
2020-05-24 21:27:00 -04:00
Peter Boyle
d1f1ccc705
HIP changes
2020-05-24 21:18:49 -04:00
Peter Boyle
92b342a477
HIP reduction too
2020-05-24 13:50:28 -04:00
Peter Boyle
8285e41574
View location / access mode
2020-05-21 16:14:41 -04:00
Peter Boyle
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Christoph Lehner
a7635fd5ba
summit mem
2020-05-18 17:52:26 -04:00
Peter Boyle
ebb60330c9
Automatic data motion options beginning
2020-05-17 16:34:25 -04:00
Christoph Lehner
32fbdf4fb1
Merge pull request #5 from paboyle/develop
...
Sync upstream
2020-05-13 09:02:56 +02:00
Peter Boyle
0e3c49f687
TransposeIndex was broken by Christoph
2020-05-12 17:57:01 -04:00
Christoph Lehner
162e4bb567
no automatic prefetching for now
2020-05-12 07:01:23 -04:00
Peter Boyle
bbbee5660d
First compile on HIP
2020-05-10 05:28:09 -04:00
Daniel Richtmann
c83471bfd0
Fix missing checkerboards for adj and conjugate
2020-05-08 16:44:03 +02:00
Peter Boyle
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
Christoph Lehner
3c6ffcb48c
Merge branch 'develop' into feature/gpt
2020-05-06 15:03:35 +02:00
Christoph Lehner
87984ece7d
add Lattice_basis.h
2020-05-06 08:47:18 -04:00
Christoph Lehner
e9b295f967
Synchronize blocking infrastructure with GPT
2020-05-06 08:42:28 -04:00
Peter Boyle
28a1fcaaff
First compile against SYCL
2020-05-05 11:13:27 -07:00
Christoph Lehner
6b64727161
disable comments
2020-05-05 05:05:36 -04:00
Christoph Lehner
04863f8f38
debug new AcceleratorView
2020-05-04 16:07:03 -04:00
Christoph Lehner
2a1387e992
rankInnerProduct
2020-05-03 17:27:11 -04:00
Christoph Lehner
9bfa51bffb
cleanup comment
2020-05-03 09:12:52 -04:00
Christoph Lehner
38532753f4
interface cleanup
2020-05-03 08:58:32 -04:00
Christoph Lehner
949be9605c
fix pragmas
2020-05-02 16:20:03 -04:00
Christoph Lehner
63cf201ee7
Add AdviseInfrequentUse
2020-05-02 11:38:42 -04:00
Christoph Lehner
ddb192bac7
re-work double precision promotion for summit
2020-04-30 16:09:57 -04:00
Peter Boyle
5011753f4f
Clean up warning
2020-04-30 10:23:48 -04:00
Christoph Lehner
091d5c605e
towards more precise blocking
2020-04-17 04:25:28 -04:00
Christoph Lehner
327da332bb
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/gpt
2020-04-16 11:30:17 -04:00
Peter Boyle
6cdb09c884
Faster copy region
2020-04-10 11:10:52 -04:00
Peter Boyle
a65bc64f10
Accelerator peek poke
2020-04-10 11:09:59 -04:00
Daniel Richtmann
5fc8a273e7
Fused innerProduct + norm2 on first argument operation
2020-04-06 11:52:29 +02:00
Christoph Lehner
c9b737a4e7
make trace,adj,transpose unary operators
2020-03-16 17:58:30 -04:00
Peter Boyle
68b45f6444
Lower left/upper right region cut paste
2020-02-06 15:50:26 -05:00
Peter Boyle
ef9b3e658a
extra typedef
2020-02-06 15:47:14 -05:00
Peter Boyle
1bd87c35d7
Read coalescing on Nvidia
2020-01-27 12:29:56 -05:00
Peter Boyle
48008e4d8b
Thread coordinate creation loop
2020-01-27 12:28:16 -05:00
Peter Boyle
9aafd20468
Simple block project promote runs faster on GPU
2019-12-17 05:01:39 -05:00
Peter Boyle
9e15474999
Accelerator loop attempt at speed up
2019-12-14 05:28:16 -05:00
Peter Boyle
152b525a4d
Typo fix
2019-12-13 22:44:42 -05:00
Peter Boyle
d18994eddc
offload more of mgrid to GPU
2019-12-13 22:08:11 -05:00
Chris K
845a045493
Merge pull request #233 from giltirn/lanczos_fix
...
A few run / compile / memory leak fixes
2019-10-30 10:21:59 -04:00
Fionn O hOgain
5de9547db5
Removing old debug code
2019-10-08 15:51:28 +01:00
Christopher Kelly
114ebb7914
Fixed Lanczos calling aligned alloc in threaded region hitting up against pointer-cache no-threading restrictions
...
Fixed Lattice::reset not compiling with new Grid explicit memory region handling
Fixed memory leak in Lattice::resize that occurs when data region has been previously allocated
2019-08-26 16:47:44 -04:00