mirror of https://github.com/paboyle/Grid.git synced 2026-04-20 10:41:01 +01:00
Commit Graph

5924 Commits

Author SHA1 Message Date
Peter Boyle 8720aecb80 Offload more loops 2020-06-10 12:57:55 -04:00
Peter Boyle e97f3688db Fix the HMC issue - kernel was launching asynchronously 2020-06-08 17:01:15 -04:00
Peter Boyle 89a1e78390 Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl 2020-06-05 23:20:37 -04:00
Peter Boyle 5a73ef3647 Minor tweak to compile 2020-06-05 21:50:15 -04:00
Peter Boyle 87e5d2f4b7 Merge branch 'sycl' of https://www.github.com/paboyle/Grid into sycl 2020-06-05 17:32:21 -07:00
Peter Boyle d720f10758 Link error fix 2020-06-05 17:29:20 -07:00
Peter Boyle 14fcd0912a Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl 2020-06-05 19:14:17 -04:00
Peter Boyle 3111c0bd4f Single precision hardwire 2020-06-05 19:13:27 -04:00
Peter Boyle e03064490e Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl 2020-06-05 18:53:39 -04:00
Peter Boyle 1a4c8c3387 Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. 2020-06-05 18:52:35 -04:00
Peter Boyle 2b1e259441 Decode of SYCL devices fix 2020-06-04 17:16:55 -07:00
Peter Boyle f39c2a240b Printing and device memory size detection 2020-06-04 14:58:03 -04:00
Peter Boyle 0d95805cde Print improvement 2020-06-03 22:50:32 -04:00
Peter Boyle f67830587f Accelerator loop use 2020-06-03 22:50:09 -04:00
Peter Boyle 6bf7f839ff Better printing and logging 2020-06-03 09:28:57 -04:00
Peter Boyle e3147881a9 Cache scheme 2020-06-03 09:23:48 -04:00
Peter Boyle fb559614ad Initialise memory manager 2020-06-03 09:12:47 -04:00
Peter Boyle e93e12b6a4 More verbose SYCL setup 2020-06-03 09:12:11 -04:00
Peter Boyle 0c3112cd94 Use view mechanism 2020-06-03 09:11:51 -04:00
Peter Boyle 8cfd5d2639 Need lattice view 2020-06-03 09:11:28 -04:00
Peter Boyle 1c9f20b15e Views must be closed 2020-06-03 09:10:29 -04:00
Peter Boyle 32237895bd Reorg memory manager for O(1) hash table 2020-06-03 09:09:52 -04:00
Peter Boyle 1d252d0922 Accelerator inline 2020-05-28 11:45:25 -04:00
Peter Boyle 006cc8a8f1 Staggered move to accelerator 2020-05-28 08:33:06 -04:00
Peter Boyle cf2938688a Sycl unhappy fix 2020-05-25 08:36:53 -07:00
Peter Boyle ee63721bad int unhappiness sycl fix 2020-05-25 08:36:24 -07:00
Peter Boyle 22c5168d70 Sycl happier 2020-05-25 08:35:56 -07:00
Peter Boyle 949ac3cd24 Must avoid non-trivial copy constructors 2020-05-25 08:35:28 -07:00
Peter Boyle 7bc0166c1c SYCL making happy - must avoid non-trivial copy constructors 2020-05-25 08:34:19 -07:00
Peter Boyle cb0d1b3399 hopefully fix build fail 2020-05-24 21:27:00 -04:00
Peter Boyle d1f1ccc705 HIP changes 2020-05-24 21:18:49 -04:00
Peter Boyle c7519a237a Assertions fail on HIP for unknown reasons - debugging 2020-05-24 14:02:47 -04:00
Peter Boyle 32be2b13d3 Updates for HiP 2020-05-24 14:00:55 -04:00
Peter Boyle 92b342a477 Hip reduction too 2020-05-24 13:50:28 -04:00
Peter Boyle 556da86ac3 HIP fp16 2020-05-24 13:41:58 -04:00
Peter Boyle 8285e41574 View location / access mode 2020-05-21 16:14:41 -04:00
Peter Boyle f999408e92 View location and access mode 2020-05-21 16:14:20 -04:00
Peter Boyle a7abda89e2 View location & access mode 2020-05-21 16:13:59 -04:00
Peter Boyle 7860a50f70 Make view specify where and drive data motion - first cut.
This is a compile time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Peter Boyle ebb60330c9 Automatic data motion options beginning 2020-05-17 16:34:25 -04:00
Peter Boyle a9847aa866 Dependence fix 2020-05-12 20:03:37 -04:00
Peter Boyle d24d8e8398 Use X-direction as more bits meaningful on CUDA.
2^31-1 should always be enough for SIMD and thread reduced local volume

e.g. 32*2^31 = 2^36 = (2^9)^4 or 512^4 is big enough.

Where 32 is gpu_threads * Nsimd = 8*4
2020-05-12 10:35:49 -04:00
Peter Boyle 07c0c02f8c Speed up Cshift 2020-05-11 17:02:01 -04:00
Peter Boyle 8c31c065b5 Keep the Vector fixed to protect it from realloc 2020-05-11 17:00:30 -04:00
Peter Boyle bbbee5660d First compile on HiP 2020-05-10 05:28:09 -04:00
Peter Boyle 52081acfa5 NVCC compile fixes 2020-05-08 13:14:12 -04:00
Peter Boyle f8b8e00090 Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
Aim to reduce the amount of cuda and other code variations floating around all over the place.

Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
Peter Boyle 28a1fcaaff First compile against SYCL 2020-05-05 11:13:27 -07:00
u37294 04927d2e40 SYCL prep - no sycl just make it compile through DPC++ 2020-05-04 10:28:29 -07:00
u37294 7caed4edd9 dpc++ didn't like rdtsc() 2020-05-04 10:27:05 -07:00