mirror of https://github.com/paboyle/Grid.git synced 2026-04-24 12:36:12 +01:00

Commit Graph

  • 1d252d0922 Accelerator inline Peter Boyle 2020-05-28 11:45:25 -04:00
  • 006cc8a8f1 Staggered move to accelerator Peter Boyle 2020-05-28 08:33:06 -04:00
  • 4fedd8d29f switch to MPI_THREAD_SERIALIZED instead of SINGLE nmeyer-ur 2020-05-27 14:08:34 +02:00
  • cf2938688a Sycl unhappy fix Peter Boyle 2020-05-25 08:36:53 -07:00
  • ee63721bad int unhappiness sycl fix Peter Boyle 2020-05-25 08:36:24 -07:00
  • 22c5168d70 Sycl happier Peter Boyle 2020-05-25 08:35:56 -07:00
  • 949ac3cd24 Must avoid non-trivial copy constructors Peter Boyle 2020-05-25 08:35:28 -07:00
  • 7bc0166c1c SYCL making happy - must avoid non-trivial copy constructors Peter Boyle 2020-05-25 08:34:19 -07:00
  • cb0d1b3399 hopefully fix build fail Peter Boyle 2020-05-24 21:27:00 -04:00
  • d1f1ccc705 HIP changes Peter Boyle 2020-05-24 21:18:49 -04:00
  • c7519a237a Assertions fail on HIP for unknown reasons - debugging Peter Boyle 2020-05-24 14:02:47 -04:00
  • 32be2b13d3 Updates for HiP Peter Boyle 2020-05-24 14:00:55 -04:00
  • 92b342a477 Hip reduction too Peter Boyle 2020-05-24 13:50:28 -04:00
  • 556da86ac3 HIP fp16 Peter Boyle 2020-05-24 13:41:58 -04:00
  • 8285e41574 View location / access mode Peter Boyle 2020-05-21 16:14:41 -04:00
  • f999408e92 View location and access mode Peter Boyle 2020-05-21 16:14:20 -04:00
  • a7abda89e2 View location & access mode Peter Boyle 2020-05-21 16:13:59 -04:00
  • 7860a50f70 Make view specify where and drive data motion - first cut. This is a compile time option --enable-unified=yes/no Peter Boyle 2020-05-21 16:13:16 -04:00
  • 6ddcef1bca fix build error enabling fcmla/mac in vector types for VLA nmeyer-ur 2020-05-21 21:21:03 +02:00
  • 8c5a5fdfce disable fcmla in vector type building for VLA nmeyer-ur 2020-05-21 19:41:42 +02:00
  • 046b1cbbc0 enable fcmla in tensor arithmetics; fixed-size works, VLA does not compile nmeyer-ur 2020-05-21 19:39:07 +02:00
  • a65ce237c1 clean up; Exch1 VLA sp+dp integrate, tested, working nmeyer-ur 2020-05-21 09:48:06 +02:00
  • cd27f1005d clean up; Exch1 sp integrate, tested, working nmeyer-ur 2020-05-21 08:45:43 +02:00
  • f8c0a59221 clean up; Exch1 dp integrate, tested, working nmeyer-ur 2020-05-21 02:48:14 +02:00
  • 832485699f save some cycles in HtoD and DtoH by direct instead of multi-pass conversion nmeyer-ur 2020-05-20 23:04:35 +02:00
  • 81484a4760 symmetrize Mult and MultAddComplex nmeyer-ur 2020-05-20 22:36:45 +02:00
  • 9a86059761 symmetrize VLA and fixed size build messages nmeyer-ur 2020-05-20 20:05:42 +02:00
  • b780b7b7a0 guard prevents multiple TOFU messages nmeyer-ur 2020-05-20 19:20:59 +02:00
  • 9e085bd04e guard prevents multiple A64FX build messages nmeyer-ur 2020-05-20 19:16:30 +02:00
  • 6c6812a5ca GB/s output ferben 2020-05-20 12:26:57 +01:00
  • 8358ee38c4 pull develop Christoph Lehner 2020-05-19 08:56:18 -04:00
  • 1f154fe652 some cleanup in BaryonUtils ferben 2020-05-19 13:48:56 +01:00
  • d708c0258d some cleanup in BaryonUtils ferben 2020-05-19 13:48:00 +01:00
  • a7635fd5ba summit mem Christoph Lehner 2020-05-18 17:52:26 -04:00
  • 6b6bf537d3 comment out mac in vector types nmeyer-ur 2020-05-18 20:31:44 +02:00
  • 323a651c71 correct typo nmeyer-ur 2020-05-18 19:58:27 +02:00
  • 9f212679f1 support fcmla in vector_types, untested nmeyer-ur 2020-05-18 19:55:18 +02:00
  • 032f7dde1a update SVE readme, asm generator nmeyer-ur 2020-05-18 19:10:36 +02:00
  • ebb60330c9 Automatic data motion options beginning Peter Boyle 2020-05-17 16:34:25 -04:00
  • 5aa60be17d SerialisableClassName method for serialisable enum, and boolean to test if a serialisable object is an enum portelli 2020-05-15 20:00:34 +01:00
  • 50b1db1e8b implemented correct _m form (using 3 operands instead of 2) nmeyer-ur 2020-05-15 10:01:05 +02:00
  • 015d8bb38a introduced assertions in Benchmark_wilson, removed data output from Benchmark_dwf nmeyer-ur 2020-05-15 09:15:50 +02:00
  • 10a34312dc some fixed-size code clean up nmeyer-ur 2020-05-14 23:20:16 +02:00
  • db8c0e7584 replaced _x form with _m form when using even/odd predication nmeyer-ur 2020-05-14 23:17:35 +02:00
  • 32fbdf4fb1 Merge pull request #5 from paboyle/develop Christoph Lehner 2020-05-13 09:02:56 +02:00
  • a9847aa866 Dependence fix Peter Boyle 2020-05-12 20:03:37 -04:00
  • 2e652431e5 No compile on summit fix Peter Boyle 2020-05-12 18:56:47 -04:00
  • 8b5b55b682 Make tests all compile current Grid, mostly MdagM removal of norms fixes but a few minor issues fixed too Peter Boyle 2020-05-12 17:57:24 -04:00
  • 0e3c49f687 TransposeIndex was broken by Christoph Peter Boyle 2020-05-12 17:57:01 -04:00
  • cb7ee37562 Close expressions in arg to cshift Peter Boyle 2020-05-12 17:56:40 -04:00
  • 82f71643a4 Remove the norm in MdagM Peter Boyle 2020-05-12 17:55:53 -04:00
  • d15ccad8a7 switched to vec* in Reduce nmeyer-ur 2020-05-12 20:41:14 +02:00
  • 0009b5cee8 updated SVE_README nmeyer-ur 2020-05-12 19:02:33 +02:00
  • 20d1941a45 enabled asm kernels for fixed-size A64FXFIXEDSIZE nmeyer-ur 2020-05-12 19:01:12 +02:00
  • d24d8e8398 Use X-direction as more bits meaningful on CUDA. 2^31-1 should always be enough for SIMD and thread reduced local volume Peter Boyle 2020-05-12 10:35:49 -04:00
  • 162e4bb567 no automatic prefetching for now Christoph Lehner 2020-05-12 07:01:23 -04:00
  • 07c0c02f8c Speed up Cshift Peter Boyle 2020-05-11 17:02:01 -04:00
  • 8c31c065b5 Keep the Vector fixed to protect it from realloc Peter Boyle 2020-05-11 17:00:30 -04:00
  • b7c76ede29 Removed some assertions in Test_simd and removed exit() in Reduce nmeyer-ur 2020-05-11 22:43:00 +02:00
  • 05edf803bd corrected typo nmeyer-ur 2020-05-12 03:59:59 +09:00
  • b1c86900b2 Merge pull request #4 from paboyle/develop Christoph Lehner 2020-05-11 20:59:29 +02:00
  • 78b8e40f83 switched to gcc's internal data types nmeyer-ur 2020-05-11 18:11:23 +02:00
  • fc2e9850d3 temporarily enable TOFU by default when using A64FX or A64FXFIXEDSIZE nmeyer-ur 2020-05-11 13:25:02 +02:00
  • ffaaed679e MPI_THREAD_SINGLE hack for Fugaku, enabled by -DTOFU nmeyer-ur 2020-05-11 13:21:39 +02:00
  • bbbee5660d First compile on HiP Peter Boyle 2020-05-10 05:28:09 -04:00
  • ea08f193e7 Allocator cache split into large/small pools Peter Boyle 2020-05-10 05:24:26 -04:00
  • 2bb2c68e15 Separate pools for small and large allocations cache Peter Boyle 2020-05-09 22:57:21 -04:00
  • efe5bc6a3c Split allocator cache into two pools of different sizes Peter Boyle 2020-05-09 22:27:56 -04:00
  • b2fd8b993a fixed-size clean up nmeyer-ur 2020-05-09 22:53:42 +02:00
  • 291ee8c3d0 updated fixed-size implementation; only Exch1 and prefetches missing nmeyer-ur 2020-05-09 22:18:02 +02:00
  • e1a5b3ea49 unions for tables eliminate explicit loads, gcc does not complain nmeyer-ur 2020-05-09 21:21:57 +02:00
  • 55a55660cb reverted changes nmeyer-ur 2020-05-09 12:48:42 +02:00
  • 384da487bd Merge branch 'develop' of https://github.com/paboyle/Grid into develop Peter Boyle 2020-05-08 18:55:11 -04:00
  • ee1de82a53 Working ITT benchmark again Peter Boyle 2020-05-08 18:54:50 -04:00
  • 2b576fc185 Comment dead code remove Peter Boyle 2020-05-08 18:54:29 -04:00
  • 52081acfa5 NVCC compile fixes Peter Boyle 2020-05-08 13:14:12 -04:00
  • b01b7f761a Merge pull request #283 from DanielRichtmann/feature/minor-fixes Peter Boyle 2020-05-08 10:52:03 -04:00
  • c83471bfd0 Fix missing checkerboards for adj and conjugate Daniel Richtmann 2020-04-23 10:54:19 +02:00
  • ab0c5d77fb Correct NonHermitianSchurOperatorBase Daniel Richtmann 2020-04-22 19:50:30 +02:00
  • 779e3c7442 Const-correctness for retrieval routines of GridStopWatch Daniel Richtmann 2020-04-21 13:30:08 +02:00
  • 0c570824f2 Add missing declaration of GridCmdOptionInt Daniel Richtmann 2020-04-21 13:26:43 +02:00
  • f8b8e00090 Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc Aim to reduce the amount of cuda and other code variations floating around all over the place. Peter Boyle 2020-05-08 06:23:55 -07:00
  • 0dd1bdfa94 Merge branch 'develop' of https://github.com/paboyle/Grid into develop Peter Boyle 2020-05-08 09:21:43 -04:00
  • 1d65e2f62c Slightly faster Chebyshev; ifdef'ed out the fastest until tested numerics Lifted from HDCR setup Peter Boyle 2020-05-08 09:20:54 -04:00
  • 93920c4811 Remove verbose Peter Boyle 2020-05-08 09:19:54 -04:00
  • 6859a3e1d4 Schur operator Peter Boyle 2020-05-08 09:19:12 -04:00
  • 21ca182c36 Comments remove Peter Boyle 2020-05-08 09:18:24 -04:00
  • ceb8b374da API change v3 nmeyer-ur 2020-05-08 15:04:44 +02:00
  • 4bc2ad2894 API change v2 nmeyer-ur 2020-05-08 15:00:25 +02:00
  • 798af3e68f retry changing StoD API nmeyer-ur 2020-05-08 14:34:59 +02:00
  • b0ef2367f3 testing alternate call to PrecisionChange nmeyer-ur 2020-05-08 14:22:44 +02:00
  • 71a7350a85 changed 2nd argument in Reduce to native vector type nmeyer-ur 2020-05-08 12:26:51 +02:00
  • 6f79369955 trying to get rid of macro definition error nmeyer-ur 2020-05-08 12:19:24 +02:00
  • f9cb6b979f corrected more typos nmeyer-ur 2020-05-08 12:11:01 +02:00
  • ed4d9d17f8 corrected type nmeyer-ur 2020-05-08 12:09:22 +02:00
  • fbed02690d some changes in breaking out A64FX: use -DA64FXFIXEDSIZE for fixed size, but also define GEN nmeyer-ur 2020-05-08 12:05:31 +02:00
  • 39f3ae5b1d corrected more types nmeyer-ur 2020-05-08 11:07:14 +02:00
  • e64bec8c8e pulled SVE typedefs out of Optimization nmeyer-ur 2020-05-08 11:04:21 +02:00
  • 0893b4e552 fixed typos in PrecisionChange nmeyer-ur 2020-05-08 10:59:07 +02:00
  • 92f0f29670 fixed double overloading vecf in Div, corrected typos nmeyer-ur 2020-05-08 10:57:23 +02:00