mirror of https://github.com/paboyle/Grid.git synced 2025-08-17 03:31:54 +01:00

Commit Graph

  • a87e45ba25 SVE readme update nmeyer-ur 2020-06-18 11:23:08 +02:00
  • 465856331a switch back to serialized; wrong results on single too nmeyer-ur 2020-06-15 15:39:39 +02:00
  • cc958aa9ed switch back to standard MPI_init due to wrong results in Benchmark_wilson using comms-overlap nmeyer-ur 2020-06-15 14:21:38 +02:00
  • f46f029dbb Merge pull request #292 from lehner/feature/gpt-sycl Peter Boyle 2020-06-14 13:43:27 -04:00
  • 3dccd7aa2c Catch edge case in SharedMemoryMPI::GetShmDims; Change default units to consistent MB in init args; Want last element not past last element in MemoryManagerCache.cc Christoph Lehner 2020-06-14 13:26:01 -04:00
  • a25e4b3d0c pred 32/64 for float/double instead of 8 in VLA patch nmeyer-ur 2020-06-13 14:44:37 +02:00
  • d1210ca12a switch to double/float instead of float64_t/float32_t in VLA patch nmeyer-ur 2020-06-13 13:59:32 +02:00
  • 36ea0e222a type traits for ComplexF/D in VLA patch; cosmetics in VLS intrinsics nmeyer-ur 2020-06-13 13:42:35 +02:00
  • 65e6e7da6f Merge pull request #291 from lehner/feature/gpt-sycl Peter Boyle 2020-06-12 20:42:32 -04:00
  • b5e87e8d97 summit compile fixes Christoph Lehner 2020-06-12 18:16:12 -04:00
  • 5f5807d60a cleanup Christoph Lehner 2020-06-12 14:48:23 -04:00
  • 92281ec22d add 3 op Mult for VLA nmeyer-ur 2020-06-12 18:49:05 +02:00
  • 87266ce099 comment out fcmla in vector types: need also MultAddReal nmeyer-ur 2020-06-12 18:37:19 +02:00
  • 2a23f133e8 reenable fcmla for VLA nmeyer-ur 2020-06-12 17:30:38 +02:00
  • 8dbf790f62 correct tbl2 for sp nmeyer-ur 2020-06-12 17:12:34 +02:00
  • 2402b4940e vec_imm in float nmeyer-ur 2020-06-12 15:17:38 +02:00
  • 2111052fbe apply VLA patch for memcpy reduction suggested by Arm, CAS-162542-D6W7Z7 nmeyer-ur 2020-06-12 14:49:19 +02:00
  • 7974acff54 merged sycl to feature-gpt Christoph Lehner 2020-06-12 06:49:38 -04:00
  • f0d17d2b49 Added Baryon3pt code Raoul Hodgson 2020-06-12 11:35:52 +01:00
  • 244c003a1b Updated Baryon code Raoul Hodgson 2020-06-12 11:00:25 +01:00
  • 0174f5f742 look for librt when using shm=shmopen Antonin Portelli 2020-06-11 16:50:43 +01:00
  • 32b2b59be4 Offload Peter Boyle 2020-06-10 20:36:26 -04:00
  • 86bb0cc24b Keep on GPU Peter Boyle 2020-06-10 20:00:00 -04:00
  • 84c19587e7 Offload Peter Boyle 2020-06-10 19:59:31 -04:00
  • 237ce92540 Offload loops Peter Boyle 2020-06-10 19:59:11 -04:00
  • a7ffc61e82 acceleratorSIMTlane() Peter Boyle 2020-06-10 19:58:33 -04:00
  • fd97f64612 Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl Peter Boyle 2020-06-10 12:58:13 -04:00
  • 8720aecb80 Offload more loops Peter Boyle 2020-06-10 12:57:55 -04:00
  • cdf0a04fc5 Merge branch 'develop' into sycl Peter Boyle 2020-06-09 04:00:12 -04:00
  • 616d3dd737 Compile updates Peter Boyle 2020-06-08 18:57:41 -04:00
  • 8b066baca8 Implement transient mechanism Peter Boyle 2020-06-08 18:28:53 -04:00
  • e97f3688db Fix the HMC issue - kernel was launching asynchronously Peter Boyle 2020-06-08 17:01:15 -04:00
  • 433766ac62 revert Add/SubTimesI and prefetching in stencil nmeyer-ur 2020-06-08 12:02:53 +02:00
  • 93a37c8f68 test prefetch to L2 in stencil nmeyer-ur 2020-06-08 09:39:50 +02:00
  • 89a1e78390 Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl Peter Boyle 2020-06-05 23:20:37 -04:00
  • ffbb3fc02c Merge pull request #287 from felixerben/baryon-cleaner Peter Boyle 2020-06-05 22:54:52 -04:00
  • 5a73ef3647 Minor tweak to compile Peter Boyle 2020-06-05 21:50:15 -04:00
  • 87e5d2f4b7 Merge branch 'sycl' of https://www.github.com/paboyle/Grid into sycl Peter Boyle 2020-06-05 17:32:21 -07:00
  • d720f10758 Link error fix Peter Boyle 2020-06-05 17:29:20 -07:00
  • 14fcd0912a Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl Peter Boyle 2020-06-05 19:14:17 -04:00
  • 3111c0bd4f Single precision hardwire Peter Boyle 2020-06-05 19:13:27 -04:00
  • e03064490e Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl Peter Boyle 2020-06-05 18:53:39 -04:00
  • 1a4c8c3387 Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. Peter Boyle 2020-06-05 18:52:35 -04:00
  • 2b1e259441 Decode of SYCL devices fix Peter Boyle 2020-06-04 17:16:55 -07:00
  • f39c2a240b Printing and device memory size detection Peter Boyle 2020-06-04 14:58:03 -04:00
  • 86a9cc8c27 relative Eigen links, allows moving safely Grid's directory feature/eigen-relative Antonin Portelli 2020-06-04 10:56:34 +01:00
  • 0d95805cde Print improvement Peter Boyle 2020-06-03 22:50:32 -04:00
  • f67830587f Accelerator loop use Peter Boyle 2020-06-03 22:50:09 -04:00
  • 6bf7f839ff Better printing and logging Peter Boyle 2020-06-03 09:28:57 -04:00
  • e3147881a9 Cache scheme Peter Boyle 2020-06-03 09:23:48 -04:00
  • 9872c76825 introduce AddTimesI and SubTimesI; slight benefit in operators, but < 1%; breaks all other impls nmeyer-ur 2020-06-03 15:20:13 +02:00
  • fb559614ad Initialise memory manager Peter Boyle 2020-06-03 09:12:47 -04:00
  • e93e12b6a4 More verbose SYCL setup Peter Boyle 2020-06-03 09:12:11 -04:00
  • 0c3112cd94 Use view mechanism Peter Boyle 2020-06-03 09:11:51 -04:00
  • 8cfd5d2639 Need lattice view Peter Boyle 2020-06-03 09:11:28 -04:00
  • 1c9f20b15e Views must be closed Peter Boyle 2020-06-03 09:10:29 -04:00
  • 32237895bd Reorg memory manager for O(1) hash table Peter Boyle 2020-06-03 09:09:52 -04:00
  • 5ee3ea2144 round-up after testing of prefetches in stencil close nmeyer-ur 2020-06-03 11:58:20 +02:00
  • c5c2dbc0ce Optional CUDA info Peter Boyle 2020-06-02 14:21:49 -04:00
  • 9fcb47ee63 Explicit error message instead of infinite loop in GlobalSharedMemory::GetShmDims Christoph Lehner 2020-06-02 07:44:38 -04:00
  • 5050833b42 revert changes due to performance penalty in Wilson using MPI nmeyer-ur 2020-06-02 13:08:57 +02:00
  • 7bee4ebb54 correct predication for svcadd nmeyer-ur 2020-06-02 10:51:39 +02:00
  • 71cf9851e7 correct type for vecd in TimesI and TimesMinusI nmeyer-ur 2020-06-02 10:44:15 +02:00
  • b4735c9904 correct zero in svcadd nmeyer-ur 2020-06-02 10:38:05 +02:00
  • 9b2699226c use fcadd in TimesI and TimesMinusI instead of tbl and neg nmeyer-ur 2020-06-02 10:32:44 +02:00
  • 5f52804907 update calculation of data nmeyer-ur 2020-05-30 10:55:17 +02:00
  • 936071773e correct throughput in wilson and dwf nmeyer-ur 2020-05-29 22:15:59 +02:00
  • 1732f9319e more mods; counters seem to work correctly nmeyer-ur 2020-05-29 18:44:00 +02:00
  • 91c81cab30 some corrections; compiles on my laptop; untested nmeyer-ur 2020-05-29 18:19:22 +02:00
  • 38164f8480 include counters in WilsonFermionImplementation.h nmeyer-ur 2020-05-29 17:59:26 +02:00
  • f013979791 add counter support in WilsonFermion.h nmeyer-ur 2020-05-29 17:13:59 +02:00
  • e947b563ea add space in stencil output nmeyer-ur 2020-05-29 17:11:17 +02:00
  • 5cb3530c34 enable counters in Benchmark_wilson nmeyer-ur 2020-05-29 15:44:52 +02:00
  • 250008372f update SVE readme nmeyer-ur 2020-05-29 15:44:25 +02:00
  • 1d252d0922 Accelerator inline Peter Boyle 2020-05-28 11:45:25 -04:00
  • 006cc8a8f1 Staggered move to accelerator Peter Boyle 2020-05-28 08:33:06 -04:00
  • 4fedd8d29f switch to MPI_THREAD_SERIALIZED instead of SINGLE nmeyer-ur 2020-05-27 14:08:34 +02:00
  • cf2938688a Sycl unhappy fix Peter Boyle 2020-05-25 08:36:53 -07:00
  • ee63721bad int unhappiness sycl fix Peter Boyle 2020-05-25 08:36:24 -07:00
  • 22c5168d70 Sycl happier Peter Boyle 2020-05-25 08:35:56 -07:00
  • 949ac3cd24 Must avoid non-trivial copy constructors Peter Boyle 2020-05-25 08:35:28 -07:00
  • 7bc0166c1c SYCL making happy - must avoid non-trivial copy constructors Peter Boyle 2020-05-25 08:34:19 -07:00
  • cb0d1b3399 hopefully fix build fail Peter Boyle 2020-05-24 21:27:00 -04:00
  • d1f1ccc705 HIP changes Peter Boyle 2020-05-24 21:18:49 -04:00
  • c7519a237a Assertions fail on HIP for unknown reasons - debugging Peter Boyle 2020-05-24 14:02:47 -04:00
  • 32be2b13d3 Updates for HIP Peter Boyle 2020-05-24 14:00:55 -04:00
  • 92b342a477 Hip reduction too Peter Boyle 2020-05-24 13:50:28 -04:00
  • 556da86ac3 HIP fp16 Peter Boyle 2020-05-24 13:41:58 -04:00
  • 8285e41574 View location / access mode Peter Boyle 2020-05-21 16:14:41 -04:00
  • f999408e92 View location and access mode Peter Boyle 2020-05-21 16:14:20 -04:00
  • a7abda89e2 View location & access mode Peter Boyle 2020-05-21 16:13:59 -04:00
  • 7860a50f70 Make view specify where and drive data motion - first cut. This is a compile time option --enable-unified=yes/no Peter Boyle 2020-05-21 16:13:16 -04:00
  • 6ddcef1bca fix build error enabling fcmla/mac in vector types for VLA nmeyer-ur 2020-05-21 21:21:03 +02:00
  • 8c5a5fdfce disable fcmla in vector type building for VLA nmeyer-ur 2020-05-21 19:41:42 +02:00
  • 046b1cbbc0 enable fcmla in tensor arithmetics; fixed-size works, VLA does not compile nmeyer-ur 2020-05-21 19:39:07 +02:00
  • a65ce237c1 clean up; Exch1 VLA sp+dp integrate, tested, working nmeyer-ur 2020-05-21 09:48:06 +02:00
  • cd27f1005d clean up; Exch1 sp integrate, tested, working nmeyer-ur 2020-05-21 08:45:43 +02:00
  • f8c0a59221 clean up; Exch1 dp integrate, tested, working nmeyer-ur 2020-05-21 02:48:14 +02:00
  • 832485699f save some cycles in HtoD and DtoH by direct instead of multi-pass conversion nmeyer-ur 2020-05-20 23:04:35 +02:00
  • 81484a4760 symmetrize Mult and MultAddComplex nmeyer-ur 2020-05-20 22:36:45 +02:00