Peter Boyle
e03064490e
Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl
2020-06-05 18:53:39 -04:00
Peter Boyle
1a4c8c3387
Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.
2020-06-05 18:52:35 -04:00
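The scope-bound behaviour described in commit 1a4c8c3387 can be illustrated with a minimal RAII sketch. This is not Grid's implementation: `ToyField`, `ScopedView`, and `fill` are hypothetical names invented here, and the "view" only counts open handles so that the close-on-scope-exit behaviour is observable.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a lattice field; NOT Grid's Lattice class.
struct ToyField {
  std::vector<double> data;
  int open_views = 0;  // number of views currently open on this field
  explicit ToyField(std::size_t n) : data(n, 0.0) {}
};

// Hypothetical RAII wrapper (assumption, for illustration only):
// the view is opened in the constructor and closed in the destructor,
// so it cannot be left open past the end of the enclosing scope.
class ScopedView {
public:
  explicit ScopedView(ToyField &f) : field_(f) { ++field_.open_views; }
  ~ScopedView() { --field_.open_views; }

  // Non-copyable: exactly one close per open.
  ScopedView(const ScopedView &) = delete;
  ScopedView &operator=(const ScopedView &) = delete;

  double &operator[](std::size_t i) { return field_.data[i]; }

private:
  ToyField &field_;
};

// Writes through a view; the view closes automatically when `v` goes out of scope.
inline void fill(ToyField &f, double value) {
  ScopedView v(f);
  for (std::size_t i = 0; i < f.data.size(); ++i) v[i] = value;
}
```

After `fill` returns, `open_views` is back to zero without any explicit close call, which is the property the commit message attributes to the `autoView()` wrapper.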
Peter Boyle
2b1e259441
Decode of SYCL devices fix
2020-06-04 17:16:55 -07:00
Peter Boyle
f39c2a240b
Printing and device memory size detection
2020-06-04 14:58:03 -04:00
Peter Boyle
0d95805cde
Print improvement
2020-06-03 22:50:32 -04:00
Peter Boyle
f67830587f
Accelerator loop use
2020-06-03 22:50:09 -04:00
Peter Boyle
6bf7f839ff
Better printing and logging
2020-06-03 09:28:57 -04:00
Peter Boyle
e3147881a9
Cache scheme
2020-06-03 09:23:48 -04:00
Peter Boyle
fb559614ad
Initialise memory manager
2020-06-03 09:12:47 -04:00
Peter Boyle
e93e12b6a4
More verbose SYCL setup
2020-06-03 09:12:11 -04:00
Peter Boyle
0c3112cd94
Use view mechanism
2020-06-03 09:11:51 -04:00
Peter Boyle
8cfd5d2639
Need lattice view
2020-06-03 09:11:28 -04:00
Peter Boyle
1c9f20b15e
Views must be closed
2020-06-03 09:10:29 -04:00
Peter Boyle
32237895bd
Reorg memory manager for O(1) hash table
2020-06-03 09:09:52 -04:00
Peter Boyle
c5c2dbc0ce
Optional CUDA info
2020-06-02 14:21:49 -04:00
Christoph Lehner
9fcb47ee63
Explicit error message instead of infinite loop in GlobalSharedMemory::GetShmDims
2020-06-02 07:44:38 -04:00
Peter Boyle
1d252d0922
Accelerator inline
2020-05-28 11:45:25 -04:00
Peter Boyle
006cc8a8f1
Staggered move to accelerator
2020-05-28 08:33:06 -04:00
Peter Boyle
cf2938688a
Sycl unhappy fix
2020-05-25 08:36:53 -07:00
Peter Boyle
ee63721bad
int unhappiness sycl fix
2020-05-25 08:36:24 -07:00
Peter Boyle
22c5168d70
Sycl happier
2020-05-25 08:35:56 -07:00
Peter Boyle
949ac3cd24
Must avoid non-trivial copy constructors
2020-05-25 08:35:28 -07:00
Peter Boyle
7bc0166c1c
Making SYCL happy - must avoid non-trivial copy constructors
2020-05-25 08:34:19 -07:00
Peter Boyle
cb0d1b3399
Hopefully fix build fail
2020-05-24 21:27:00 -04:00
Peter Boyle
d1f1ccc705
HIP changes
2020-05-24 21:18:49 -04:00
Peter Boyle
c7519a237a
Assertions fail on HIP for unknown reasons - debugging
2020-05-24 14:02:47 -04:00
Peter Boyle
32be2b13d3
Updates for HIP
2020-05-24 14:00:55 -04:00
Peter Boyle
92b342a477
Hip reduction too
2020-05-24 13:50:28 -04:00
Peter Boyle
556da86ac3
HIP fp16
2020-05-24 13:41:58 -04:00
Peter Boyle
8285e41574
View location / access mode
2020-05-21 16:14:41 -04:00
Peter Boyle
f999408e92
View location and access mode
2020-05-21 16:14:20 -04:00
Peter Boyle
a7abda89e2
View location & access mode
2020-05-21 16:13:59 -04:00
Peter Boyle
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile-time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
ferben
6c6812a5ca
GB/s output
2020-05-20 12:26:57 +01:00
Christoph Lehner
8358ee38c4
pull develop
2020-05-19 08:56:18 -04:00
ferben
1f154fe652
some cleanup in BaryonUtils
2020-05-19 13:48:56 +01:00
ferben
d708c0258d
some cleanup in BaryonUtils
2020-05-19 13:48:00 +01:00
Christoph Lehner
a7635fd5ba
summit mem
2020-05-18 17:52:26 -04:00
Peter Boyle
ebb60330c9
Automatic data motion options beginning
2020-05-17 16:34:25 -04:00
5aa60be17d
SerialisableClassName method for serialisable enum, and boolean to test if a serialisable object is an enum
2020-05-15 20:00:34 +01:00
Christoph Lehner
32fbdf4fb1
Merge pull request #5 from paboyle/develop
...
Sync upstream
2020-05-13 09:02:56 +02:00
Peter Boyle
a9847aa866
Dependence fix
2020-05-12 20:03:37 -04:00
Peter Boyle
2e652431e5
No compile on summit fix
2020-05-12 18:56:47 -04:00
Peter Boyle
8b5b55b682
Make all tests compile against current Grid; mostly fixes for the removal of norms from MdagM, but a few minor
...
issues fixed too
2020-05-12 17:57:24 -04:00
Peter Boyle
0e3c49f687
TransposeIndex was broken by Christoph
2020-05-12 17:57:01 -04:00
Peter Boyle
cb7ee37562
Close expressions in arg to cshift
2020-05-12 17:56:40 -04:00
Peter Boyle
82f71643a4
Remove the norm in MdagM
2020-05-12 17:55:53 -04:00
Peter Boyle
d24d8e8398
Use X-direction as more bits meaningful on CUDA.
...
2^31-1 should always be enough for SIMD and thread reduced local volume
e.g. 32*2^31 = 2^36 = (2^9)^4 or 512^4 is big enough.
Where 32 is gpu_threads * Nsimd = 8*4
2020-05-12 10:35:49 -04:00
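The arithmetic in commit d24d8e8398 can be checked at compile time. The sketch below is only a verification of the numbers quoted in the message; the constant names are hypothetical and do not come from Grid. It uses 2^31 as the index range, as the message does, although the stated limit is 2^31-1.

```cpp
#include <cstdint>

// Values quoted in the commit message; the names are illustrative assumptions.
constexpr std::uint64_t kGpuThreads = 8;
constexpr std::uint64_t kNsimd = 4;
constexpr std::uint64_t kSitesPerIndex = kGpuThreads * kNsimd;  // 32

// Index range used in the message's estimate (2^31, one more than 2^31-1).
constexpr std::uint64_t kIndexRange = std::uint64_t{1} << 31;

// Total number of lattice sites addressable under this estimate.
constexpr std::uint64_t kMaxSites = kSitesPerIndex * kIndexRange;

// Volume of an L^4 lattice.
constexpr std::uint64_t pow4(std::uint64_t l) { return l * l * l * l; }

static_assert(kSitesPerIndex == 32, "8 * 4 = 32");
static_assert(kMaxSites == (std::uint64_t{1} << 36), "32 * 2^31 = 2^36");
static_assert(kMaxSites == pow4(512), "2^36 = (2^9)^4 = 512^4");
```

If any of the three `static_assert` lines failed, the file would not compile, so a successful build confirms that 32 * 2^31 = 2^36 = 512^4.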
Christoph Lehner
162e4bb567
no automatic prefetching for now
2020-05-12 07:01:23 -04:00
Peter Boyle
07c0c02f8c
Speed up Cshift
2020-05-11 17:02:01 -04:00