Peter Boyle
c273fb051c
Peek poke lattice
2020-09-01 15:27:59 -04:00
Peter Boyle
e21fef17df
real and imag part not in ET
2020-08-31 23:56:26 -04:00
Peter Boyle
7d14a3c086
Where working
2020-08-31 23:53:46 -04:00
Peter Boyle
9522dcd611
Remove dead commented out code
2020-08-31 23:40:29 -04:00
Peter Boyle
ed469898dc
coalesced ET expressions
2020-08-31 23:38:40 -04:00
Peter Boyle
1eee94a809
Sorting real/im in read coalesced GPU ET
2020-08-31 23:36:49 -04:00
Peter Boyle
3448b7387c
Almost there to coalesced ET
2020-08-26 17:04:49 -04:00
Christoph Lehner
f0dc0f3621
fix compile issue on Qpace3
2020-08-22 13:57:33 +02:00
Christoph Lehner
dbaa24ebf6
further GPU memory access fixes (with this GPT passes all single-rank tests on non-summit GPUs)
2020-08-13 16:14:15 +02:00
Christoph Lehner
27b4fbf3f0
assert for forbidden code path and fix check for faster CPU codepath in basisRotate
2020-08-03 07:57:33 -04:00
Christoph Lehner
197612bc7a
fast cpu basisRotate and other small cleanups
2020-07-30 07:08:54 -04:00
Peter Boyle
936c5ecf69
Reduction GPU no compile fix
2020-06-24 17:28:31 -04:00
Peter Boyle
22cfbdbbb3
Boost precision in inner products in single
2020-06-24 12:52:31 -04:00
Peter Boyle
b949cf6b12
PeekLocal needs a view to keep thread safe.
...
ALLOCATION_CACHE reenable
2020-06-19 17:13:27 -04:00
Christoph Lehner
b5e87e8d97
summit compile fixes
2020-06-12 18:16:12 -04:00
Christoph Lehner
5f5807d60a
cleanup
2020-06-12 14:48:23 -04:00
Christoph Lehner
7974acff54
merged sycl to feature-gpt
2020-06-12 06:49:38 -04:00
Peter Boyle
84c19587e7
Offload
2020-06-10 19:59:31 -04:00
Peter Boyle
a7ffc61e82
acceleratorSIMTlane()
2020-06-10 19:58:33 -04:00
Peter Boyle
cdf0a04fc5
Merge branch 'develop' into sycl
2020-06-09 04:00:12 -04:00
Peter Boyle
1a4c8c3387
Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.
2020-06-05 18:52:35 -04:00
Peter Boyle
f67830587f
Accelerator loop use
2020-06-03 22:50:09 -04:00
Peter Boyle
8cfd5d2639
Need lattice view
2020-06-03 09:11:28 -04:00
Peter Boyle
7bc0166c1c
Making SYCL happy - must avoid non-trivial copy constructors
2020-05-25 08:34:19 -07:00
Peter Boyle
cb0d1b3399
hopefully fix build fail
2020-05-24 21:27:00 -04:00
Peter Boyle
d1f1ccc705
HIP changes
2020-05-24 21:18:49 -04:00
Peter Boyle
92b342a477
Hip reduction too
2020-05-24 13:50:28 -04:00
Peter Boyle
8285e41574
View location / access mode
2020-05-21 16:14:41 -04:00
Peter Boyle
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Christoph Lehner
a7635fd5ba
summit mem
2020-05-18 17:52:26 -04:00
Peter Boyle
ebb60330c9
Automatic data motion options beginning
2020-05-17 16:34:25 -04:00
Christoph Lehner
32fbdf4fb1
Merge pull request #5 from paboyle/develop
...
Sync upstream
2020-05-13 09:02:56 +02:00
Peter Boyle
0e3c49f687
TransposeIndex was broken by Christoph
2020-05-12 17:57:01 -04:00
Christoph Lehner
162e4bb567
no automatic prefetching for now
2020-05-12 07:01:23 -04:00
Peter Boyle
bbbee5660d
First compile on HIP
2020-05-10 05:28:09 -04:00
Daniel Richtmann
c83471bfd0
Fix missing checkerboards for adj and conjugate
2020-05-08 16:44:03 +02:00
Peter Boyle
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
Christoph Lehner
3c6ffcb48c
Merge branch 'develop' into feature/gpt
2020-05-06 15:03:35 +02:00
Christoph Lehner
87984ece7d
add Lattice_basis.h
2020-05-06 08:47:18 -04:00
Christoph Lehner
e9b295f967
Synchronize blocking infrastructure with GPT
2020-05-06 08:42:28 -04:00
Peter Boyle
28a1fcaaff
First compile against SYCL
2020-05-05 11:13:27 -07:00
Christoph Lehner
6b64727161
disable comments
2020-05-05 05:05:36 -04:00
Christoph Lehner
04863f8f38
debug new AcceleratorView
2020-05-04 16:07:03 -04:00
Christoph Lehner
2a1387e992
rankInnerProduct
2020-05-03 17:27:11 -04:00
Christoph Lehner
9bfa51bffb
cleanup comment
2020-05-03 09:12:52 -04:00
Christoph Lehner
38532753f4
interface cleanup
2020-05-03 08:58:32 -04:00
Christoph Lehner
949be9605c
fix pragmas
2020-05-02 16:20:03 -04:00
Christoph Lehner
63cf201ee7
Add AdviseInfrequentUse
2020-05-02 11:38:42 -04:00
Christoph Lehner
ddb192bac7
re-work double precision promotion for summit
2020-04-30 16:09:57 -04:00
Peter Boyle
5011753f4f
Clean up warning
2020-04-30 10:23:48 -04:00