mirror of https://github.com/paboyle/Grid.git synced 2025-08-11 16:57:06 +01:00

Commit Graph

  • 006268f556 DWF Slow version Peter Boyle 2022-11-02 20:24:51 -04:00
  • 78acae9b50 Simple DWF for easy check Peter Boyle 2022-11-02 20:24:17 -04:00
  • a3927a8a27 Dirichlet Peter Boyle 2022-11-02 20:22:27 -04:00
  • d9dd9a5b5f LLVM update Peter Boyle 2022-11-02 19:51:50 -04:00
  • eae1c02111 Bounds check Peter Boyle 2022-11-02 19:50:32 -04:00
  • 132d841b05 Compile fix Peter Boyle 2022-11-02 19:33:22 -04:00
  • 62e52de06d Merge pull request #414 from fjosw/feat/eCloverGPU Peter Boyle 2022-11-01 09:15:44 -04:00
  • 184adeedb8 feat: renamed open_boundaries to fixedBoundaries Fabian Joswig 2022-10-26 12:53:46 +01:00
  • 5fa6a8b96d docs: CompactClover debug info generalized. Fabian Joswig 2022-10-26 12:40:28 +01:00
  • a2a879b668 docs: CompactClover Debug Info improved. Fabian Joswig 2022-10-25 17:20:42 +01:00
  • 9317d893b2 docs: details about inversion of CompactClover term added. Fabian Joswig 2022-10-25 17:10:06 +01:00
  • 86075fdd45 feat: MassTerm and ExponentiateClover merged into InstantiateClover Fabian Joswig 2022-10-25 17:05:34 +01:00
  • b36442e263 feat: CloverHelpers::InvertClover implemented which handles the inversion of the Clover term depending on clover type and the boundary conditions. Fabian Joswig 2022-10-25 16:57:01 +01:00
  • 513d797ea6 fix: signature of CompactWilsonCloverHelpers::Exponentiate fixed. Fabian Joswig 2022-10-25 16:17:22 +01:00
  • 9e4835a3e3 feat: changed CompactWilsonExpClover exponentiation to Taylor expansion with Horner scheme. Fabian Joswig 2022-10-25 15:19:43 +01:00
  • 2e8c3b0ddb Slow implementation of Shamir DWF Peter Boyle 2022-10-18 18:10:01 -04:00
  • 991667ba5e Revert Peter Boyle 2022-10-13 18:50:35 -04:00
  • 8a07b52009 Dirichlet Peter Boyle 2022-10-13 18:44:47 -04:00
  • 2bcff94b52 Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet Peter Boyle 2022-10-13 18:42:04 -04:00
  • d089739e2f Hack for lattice sites Peter Boyle 2022-10-13 17:55:50 -04:00
  • 204c283e16 Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet Peter Boyle 2022-10-11 14:59:07 -04:00
  • 551a5f8dc8 RRII gpu option Peter Boyle 2022-10-11 14:44:55 -04:00
  • c82b164f6b Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet Peter Boyle 2022-10-04 17:41:48 -04:00
  • 584a3ee45c Merge pull request #412 from giltirn/patch/adaptive-wflow Peter Boyle 2022-10-04 17:23:19 -04:00
  • eec0c9eb7d Merge pull request #411 from giltirn/patch/dirichlet-fixes Peter Boyle 2022-10-04 17:22:01 -04:00
  • 477ebf24f4 Merge branch 'develop' of https://github.com/paboyle/Grid into develop Peter Boyle 2022-10-04 11:19:43 -07:00
  • 0d5639f707 Run script update Peter Boyle 2022-10-04 11:13:41 -07:00
  • 413312f9a9 Benchmark the halo construction. The byte counts are out and should be doubled for SIMD directions Peter Boyle 2022-10-04 11:12:59 -07:00
  • 03508448f8 Remove verbose Peter Boyle 2022-10-04 11:12:15 -07:00
  • e1e5c75023 Stencil gather improvements - SVM was running slowly and was used for a pointer array that did not need to be in SVM Peter Boyle 2022-10-04 11:11:10 -07:00
  • 9296299b61 Better commenting Peter Boyle 2022-10-04 11:10:34 -07:00
  • 66d001ec9e Refactored Wilson flow class; previously the class implemented both iterative and adaptive smearing, but only the iterative method was accessible through the Smearing base class. The implementation of Smearing also forced a clunky need to pass iterative smearing parameters through the constructor but adaptive smearing parameters through the function call. Now there is a WilsonFlowBase class that implements common functionality, and separate WilsonFlow (iterative) and WilsonFlowAdaptive (adaptive) classes, both of which implement Smearing virtual functions. Christopher Kelly 2022-10-03 10:59:38 -04:00
  • fad2f969d9 Summit up to date Peter Boyle 2022-09-27 10:58:43 -04:00
  • 48165c1dc1 Ticked off a few items Peter Boyle 2022-09-27 10:58:00 -04:00
  • 25df2d2c3b Various precision options Peter Boyle 2022-09-27 10:57:12 -04:00
  • af9ecb8b41 Current tests compiling Peter Boyle 2022-09-27 10:56:55 -04:00
  • 234324599e Double2 Peter Boyle 2022-09-27 10:56:10 -04:00
  • 97448a93dc Double2 compiles and dslash runs Peter Boyle 2022-09-27 10:55:25 -04:00
  • 70c83ec3be More instantiations Peter Boyle 2022-09-27 10:54:23 -04:00
  • 8f4e2ee545 Double2 Peter Boyle 2022-09-27 10:53:46 -04:00
  • e8bfbf2f7c D2 operators Peter Boyle 2022-09-27 10:37:45 -04:00
  • 9e81b42981 D2 fields Peter Boyle 2022-09-27 10:37:19 -04:00
  • 6c9eef9726 D2 fields Peter Boyle 2022-09-27 10:36:54 -04:00
  • 7ffbc3e98e Double2 improved. Really don't like 'convertType' - localise to a GPT header Peter Boyle 2022-09-27 10:35:31 -04:00
  • 68e4d833dd Run through wrapper script Peter Boyle 2022-09-23 16:49:29 -04:00
  • a2cefaa53a Faster Peter Boyle 2022-09-23 16:49:14 -04:00
  • a0d682687e Better logging of Fdt for force gradient Peter Boyle 2022-09-23 16:22:53 -04:00
  • eb552c3ecd dt info Peter Boyle 2022-09-23 16:22:28 -04:00
  • 97cce103d7 Tolerances control Peter Boyle 2022-09-23 16:21:49 -04:00
  • 87ac7104f8 Prettier Peter Boyle 2022-09-23 16:20:46 -04:00
  • e4c117aabf Compile fix, multishift mixed prec support Peter Boyle 2022-09-23 16:19:27 -04:00
  • 5b128a6f9f MixedPrec Multishift with better precision scheme for GPU Peter Boyle 2022-09-23 16:18:47 -04:00
  • 19da647e3c Added support for non-periodic gauge field implementations in the random gauge shift performed at the start of the HMC trajectory. (The above required exposing the gauge implementation to the HMC class through the Integrator class.) Made the random shift optional (default on) through a parameter in HMCparameters. Modified ConjugateBC::CshiftLink such that it supports any shift in -L < shift < L rather than just +-1. Added a tester for the BC-respecting Cshift. Fixed a missing system header include in SSE4 intrinsics wrapper. Fixed sumD_cpu for single-prec types performing an incorrect conversion to a single-prec data type at the end, which fails to compile on some systems. Christopher Kelly 2022-09-09 12:47:09 -04:00
  • 1713de35c0 Improved config flags Peter Boyle 2022-09-05 21:50:02 -04:00
  • 1177b8f661 Merge branch 'develop' into feature/dirichlet Peter Boyle 2022-08-31 19:05:57 -04:00
  • 442bfb3d42 Merge branch 'develop' into feature/dirichlet Peter Boyle 2022-08-31 19:04:19 -04:00
  • e7d9b75fdd Warning fixes Peter Boyle 2022-08-31 19:01:14 -04:00
  • 3d0e3ec363 Tracing Peter Boyle 2022-08-31 18:31:46 -04:00
  • 3c1c51f9aa Merge branch 'feature/dirichlet-gparity' into feature/dirichlet Peter Boyle 2022-08-31 18:25:34 -04:00
  • 8cc3c522c3 Merge pull request #409 from giltirn/feature/dirichlet-gparity-stage feature/dirichlet-gparity Peter Boyle 2022-08-31 18:22:50 -04:00
  • 913fbca74a Merge pull request #410 from gkanwar/photon_and_sha_patches Peter Boyle 2022-08-31 18:01:45 -04:00
  • 5c87342108 Used in g-2 sign off Peter Boyle 2022-08-31 17:35:32 -04:00
  • 66177bfbe2 Used in g-2 sign off Peter Boyle 2022-08-31 17:35:07 -04:00
  • 5205e68963 RocTX, NVTX, text based self profiling Peter Boyle 2022-08-31 17:34:09 -04:00
  • cd5cf6d614 Tracing replaces self timing hooks Peter Boyle 2022-08-31 17:33:41 -04:00
  • 5abb19eab0 Remove self timing Peter Boyle 2022-08-31 17:32:49 -04:00
  • 06d7b88c78 Force reporting improved Peter Boyle 2022-08-31 17:32:21 -04:00
  • cf72799735 Better action naming Peter Boyle 2022-08-31 17:24:11 -04:00
  • cdb8fcc269 Width=4 support. This is too broad; hit it on physical point run. Need to change strategy, I think. Peter Boyle 2022-08-31 17:21:33 -04:00
  • b4f4130901 Defer SMP node links until after interior. Allows for DMA overlapping compute Peter Boyle 2022-08-31 17:20:21 -04:00
  • bb049847d5 Tracing replaces self timing Peter Boyle 2022-08-31 17:19:02 -04:00
  • fd33c835dd Feynman rule fix and tracing replaces self timing Peter Boyle 2022-08-31 17:18:17 -04:00
  • 21371a7e5b Tracing replaces self timing Peter Boyle 2022-08-31 17:16:05 -04:00
  • abfaa00d3e Tracing replaces self timing Peter Boyle 2022-08-31 17:15:24 -04:00
  • efee33c55d Tracing replaces self timing Peter Boyle 2022-08-31 17:14:57 -04:00
  • db0fe6ddbb Tracing replaces self timing Peter Boyle 2022-08-31 17:14:14 -04:00
  • 8a9e647120 Tracing replaces self timing Peter Boyle 2022-08-31 17:13:44 -04:00
  • e6dcb821ad Tracing replaces self timing Peter Boyle 2022-08-31 17:12:31 -04:00
  • 9bff188f02 Tracing replaces self timing Peter Boyle 2022-08-31 17:12:05 -04:00
  • 111b30ca1d Tracing replaces self timing Peter Boyle 2022-08-31 17:11:48 -04:00
  • 24182ca8bf HIP allows conserved currents. Tracing replaces self timing Peter Boyle 2022-08-31 17:11:18 -04:00
  • ee2d7369b3 Tracing replaces self timing Peter Boyle 2022-08-31 17:10:45 -04:00
  • 7c686d29c9 Tracing replaces self timing Peter Boyle 2022-08-31 17:10:17 -04:00
  • e8a0a1e75d Tracing replaces self timing hooks Peter Boyle 2022-08-31 17:09:47 -04:00
  • 730be89abf Remove timing hooks as tracing replaces Peter Boyle 2022-08-31 17:08:44 -04:00
  • f991ad7d5c Remove timing hooks as tracing replaces Peter Boyle 2022-08-31 17:08:18 -04:00
  • b3f33f82f7 Decrease self timing hooks, use nvtx / roctx type tracing hooks instead Peter Boyle 2022-08-31 17:06:47 -04:00
  • a34a6e059f Logging improvement. Sinitial will be used to improve RHMC terms Peter Boyle 2022-08-31 17:06:08 -04:00
  • 1333319941 Tracing Peter Boyle 2022-08-31 17:00:25 -04:00
  • 9295ed8d20 Print full memory range Peter Boyle 2022-08-31 16:59:51 -04:00
  • 19cc7653fb Tracing Peter Boyle 2022-08-31 16:57:51 -04:00
  • 5752538661 Tracing Peter Boyle 2022-08-31 16:57:32 -04:00
  • ca40a1b00b Tracing Peter Boyle 2022-08-31 16:54:55 -04:00
  • 659fac9dfb Tracing hook Peter Boyle 2022-08-31 16:54:25 -04:00
  • 4dc3d6fce0 Buy into Nvidia/Rocm etc... tracing. Peter Boyle 2022-08-31 16:53:19 -04:00
  • 60dfb49afa Remove FP16 tests when FP16 is disabled Gurtej Kanwar 2022-08-21 17:29:55 +02:00
  • 554c238359 Update OpenSSL digest to use high-level methods Gurtej Kanwar 2022-08-21 17:28:57 +02:00
  • f922adf05e Fix Photon ComplexField type Gurtej Kanwar 2022-08-21 16:16:18 +02:00
  • 95b640cb6b 10TF/s on 32^3 x 64 on single node Peter Boyle 2022-08-04 15:43:52 -04:00
  • 2cb5bedc15 Copy stream HIP improvements Peter Boyle 2022-08-04 15:24:03 -04:00