mirror of https://github.com/paboyle/Grid.git synced 2025-08-15 02:31:53 +01:00

Commit Graph

  • 03e54722c1 Merge branch 'develop' into bugfix/LatTransfer Michael Marshall 2021-03-03 16:13:23 +00:00
  • 442336bd96 Hand unrolled to use optimised code paths on GPU for coalesced reads in Wilson case. Other cases to do. This now includes comms code path. Peter Boyle 2021-03-02 14:50:51 +01:00
  • 9c9566b9c9 Merge pull request #23 from paboyle/develop Christoph Lehner 2021-03-01 12:33:51 +01:00
  • 1059a81a3c Merge branch 'develop' into bugfix/LatTransfer Michael Marshall 2021-02-27 00:21:36 +00:00
  • 2e61556389 Merge branch 'develop' of https://github.com/paboyle/Grid into develop Peter Boyle 2021-02-26 17:52:20 +01:00
  • f9b1f240f6 Better SIMD usage/coalescence Peter Boyle 2021-02-26 17:51:41 +01:00
  • 69f41469dd Merge branch 'develop' into bugfix/LatTransfer Michael Marshall 2021-02-25 09:19:17 +00:00
  • d620b303ff Merge branch 'develop' into feature/mres_schur Michael Marshall 2021-02-24 18:07:27 +00:00
  • 157fd1428d Merge pull request #342 from paboyle/feature/link-update-mask Peter Boyle 2021-02-24 11:29:52 -05:00
  • c791cb2214 Merge branch 'develop' into feature/link-update-mask Christopher Kelly 2021-02-23 11:51:54 -05:00
  • d5ab571a89 Added the ability to apply a custom "filter" to the conjugate momentum in the Integrator classes, applied both after refresh and after applying the forces. Added a conjugate momentum "filter" that applies a phase to each site. With sites set to 1.0 or 0.0 this acts as a mask and enables, for example, the freezing of inactive gauge links in DDHMC. Added tests/forces/Test_momentum_filter demonstrating the use of the filter to freeze boundary links. Christopher Kelly 2021-02-23 11:49:56 -05:00
  • 0ed800f6e4 merge develop Felix Erben 2021-02-23 14:54:46 +00:00
  • 0a32183825 Merge pull request #335 from felixerben/gpu/baryons Peter Boyle 2021-02-23 09:30:16 -05:00
  • 2cacfbde2a Merge pull request #341 from DanielRichtmann/fix/minor-things Peter Boyle 2021-02-22 09:28:50 -05:00
  • c073e62e0b Correct misleading ac help string Daniel Richtmann 2021-02-22 15:17:07 +01:00
  • e3d019bc2f Enable performance counting in WilsonFermion like in others Daniel Richtmann 2021-02-22 14:56:52 +01:00
  • 7ae030f585 changed back A2AUtils warning Felix Erben 2021-02-18 13:24:50 +00:00
  • 86b58d5aff changed if and accelerator_for - no runtime errors any more Felix Erben 2021-02-18 12:04:32 +00:00
  • 26e8b9f4a5 Merge pull request #340 from mmphys/bugfix/config Peter Boyle 2021-02-17 11:56:21 -05:00
  • 35114c9e62 Mac OS (Darwin) sed -i flag for in-place editing differs from posix / gnu Michael Marshall 2021-02-17 13:24:15 +00:00
  • e0f6a146d8 To DWF+I G-parity evolution code, added ability to specify number of MD steps in params and an optional usage mode that reads the config and checks the plaq/checksum agree then exits Christopher Kelly 2021-02-16 10:41:52 -05:00
  • dfd28a85c9 Merge pull request #339 from mmphys/bugfix/config Peter Boyle 2021-02-15 13:53:26 -05:00
  • a503332924 Seems the intention with the AutoConf-produced Grid/Config.h was to use sed to translate standard PACKAGE_ #defines into GRID_; however, due to the missing '' after -i, this hasn't been working. Perhaps it is too late to fix this, since we don't know who/what is relying on this downstream? ... but if they are, and AutoConf is being used, then likely these #defines have just been redefined anyway. Seems reasonable to redefine PACKAGE and VERSION as well, as none of these macros are used throughout Grid or Hadrons. Michael Marshall 2021-02-14 21:27:54 +00:00
  • 9295eeadfe Option to use GpuComplex in Wilson kernel u61464 2021-02-10 06:51:23 -08:00
  • 36f471e333 Unrolled loops u61464 2021-02-09 16:09:23 -08:00
  • ca4eadd4ab Sycl kernels Peter Boyle 2021-02-09 14:36:22 -05:00
  • daa095c519 Fixed an obscure but reproducible hang in the RHMC caused by the bounds check being activated by a random number that wasn't synchronized over the nodes. HMC now also reports the "L-infinity norm" of the impulse, aka the largest site norm. Christopher Kelly 2021-02-09 12:55:46 -05:00
  • d954595922 SyCL optimised hand unrolled kernels and const functor patches. Peter Boyle 2021-02-09 11:39:39 -05:00
  • c2676853ca Merge branch 'bugfix/maxnorm2' into feature/gparity_HMC Christopher Kelly 2021-02-08 12:17:33 -05:00
  • 1ac13ec3a7 Merge pull request #338 from paboyle/bugfix/maxnorm2 Peter Boyle 2021-02-08 12:08:11 -05:00
  • 55de69a569 Fixed compile issues with maxLocalNorm2 for non-scalar lattices. maxLocalNorm2 test now reuses the random field. Christopher Kelly 2021-02-08 12:03:16 -05:00
  • eda9ab487b MADWF 5d source option for hadrons - look at Grid of source. Abort on GPU error. Peter Boyle 2021-02-08 10:47:22 -05:00
  • 6a824033f8 Merge branch 'develop' into feature/gparity_HMC Christopher Kelly 2021-02-08 09:31:49 -05:00
  • cee6a37639 Added a logging tag for HMC. As the integrator logger is active by default, the cmdline option to activate had no effect. Changed option to *de*activate on request ("NoIntegrator"). Cleaned up generating rational approxs in the general RHMC code. As the tolerance of the rational approx is not related to the CG tolerance, regenerating approxs for MD and MC if they differ only by the CG tolerance is not necessary; this has been fixed. In DWF+I Gparity evolution code, added cmdline options to check the rational approximations and compute the lowest/highest eigenvalues of M^dagM for RHMC tuning. In the above, changed the integrator layout to a much simpler one that completes much faster; may need additional tuning. Christopher Kelly 2021-02-08 09:30:35 -05:00
  • cd99edcc5f maxLocalNorm2() Peter Boyle 2021-02-04 18:25:49 -05:00
  • 4705aa541d Allow user to configure ShmDims via environment variables Christoph Lehner 2021-02-04 14:25:55 +01:00
  • 3215d88a91 Simplify syntax with Grid::EnableIf post code review. Updated EnableIf so that ReturnType defaults to void in the same way as std::enable_if; see https://en.cppreference.com/w/cpp/types/enable_if Michael Marshall 2021-02-03 15:17:03 +00:00
  • 9b9a53f870 ... Felix Erben 2021-02-02 13:06:43 +00:00
  • 019ffe17d4 Allow for GPU vector width beyond 64 Christoph Lehner 2021-02-02 11:32:23 +01:00
  • 6cc3ad110c Improved logging output for RHMC bounds checks. In GenericHMCRunner, exposed functionality for initializing gauge fields and RNG for external use. Christopher Kelly 2021-01-29 12:35:00 -05:00
  • bc496dd844 change back benchmark_ITT Felix Erben 2021-01-28 14:29:56 +00:00
  • a673b6a54d prettify Felix Erben 2021-01-28 14:15:09 +00:00
  • 1bf2e4d187 Merge branch 'develop' into gpu/baryons Felix Erben 2021-01-27 21:17:37 +00:00
  • 96dd7a8fbd Flop count matches DiRAC-ITT-2020 Peter Boyle 2020-11-16 17:15:34 +01:00
  • 7905afa9f5 revert changes Felix Erben 2021-01-19 12:32:48 +00:00
  • 712bb40650 merge develop Felix Erben 2020-12-15 16:33:29 +00:00
  • 81d88d9f4d fixes Felix Erben 2021-01-27 21:09:51 +00:00
  • e6c6f82c52 Gparity DWF+I HMC main program now has option to specify parameter file Christopher Kelly 2021-01-27 11:18:41 -05:00
  • d10d0c4e7f Merge branch 'develop' into feature/gparity_HMC Christopher Kelly 2021-01-25 15:13:29 -05:00
  • 9c106d625a Added HMC main program designed to reproduce the 16^3x32x16 DWF+I ensembles with beta=2.13 and Gparity BCs Christopher Kelly 2021-01-25 15:07:44 -05:00
  • 6795bbca31 Generalized GeneralEvenOddRatioRationalPseudoFermionAction such that the multi-shift CG algorithm can be overridden by derived classes. Added a mixed-precision variant of GeneralEvenOddRatioRationalPseudoFermionAction and a verification test against double prec class. Fixed non-const reference used in passing RHMC approx to multishift classes. Christopher Kelly 2021-01-25 14:22:31 -05:00
  • 77063418da Fix issue for GPU by ensuring accelerator_inline version of convertType is available for Grid::complex<T>. This removes many warnings in Hadrons. Simplify the SFINAE syntax and correct convertType for iScalar. Michael Marshall 2021-01-25 15:09:36 +00:00
  • 2983b6fdf6 Optional (superficial) changes to make comparison with Hadrons WardIdentity module easier: use Schur solver; example of Hadrons random gauge init; logging updates; only solve reverse propagator if provided Michael Marshall 2021-01-23 12:41:48 +00:00
  • 69f1f04f74 Merge branch 'develop' of https://github.com/paboyle/Grid into develop Peter Boyle 2021-01-21 21:39:59 -05:00
  • 11a5fd09d6 Hot config Peter Boyle 2021-01-21 21:39:41 -05:00
  • ff1fa98808 Fix for GPU conserved current Peter Boyle 2021-01-21 21:38:23 -05:00
  • d161c2dc35 Improved formatting of timing output in mixed-prec multishift. In test of mixed-prec multishift, added comparison against full double precision multishift both for timing and to cross-check the results. Christopher Kelly 2021-01-20 15:42:06 -05:00
  • 7a06826cf1 Added option to NerscIO to disable exit on failing plaquette check, allowing for circumvention of factor of 2 error in CPS-generated G-parity config headers. Adapted mixed-prec multi-shift test to new way to pass gauge BC directions and added cmdline option to perform the G-parity plaquette comparison with the corrected plaquette when loading config. Christopher Kelly 2021-01-20 13:31:50 -05:00
  • c3712b8e06 Merge branch 'develop' into feature/gparity_HMC Christopher Kelly 2021-01-20 11:48:52 -05:00
  • 901ee77b84 Mixed precision multishift test can now be performed with/without G-parity using cmdline check and can load a pregenerated configuration Christopher Kelly 2021-01-20 11:45:44 -05:00
  • df16202865 weird bug in 2pt function... Felix Erben 2021-01-19 19:25:27 +00:00
  • 3ff7c2c02a Merge branch 'develop' into gpu/baryons Felix Erben 2021-01-19 12:34:13 +00:00
  • fc6d07897f revert changes Felix Erben 2021-01-19 12:32:48 +00:00
  • f9c8e5c8ef Merge branch 'develop' of github.com:paboyle/Grid into develop Felix Erben 2021-01-19 12:30:29 +00:00
  • 8bfa0e74f8 final version, tested on CPU and GPU Felix Erben 2021-01-19 12:27:57 +00:00
  • 9b73a937e7 bugfix Felix Erben 2021-01-18 18:57:05 +00:00
  • b0339bc5a4 Merge branch 'feature/conjugate-bc-dirs' into develop Peter Boyle 2021-01-15 09:28:39 -05:00
  • 3c23a947cc Fixed test for very much non-unit det feature/conjugate-bc-dirs Peter Boyle 2021-01-15 09:16:02 -05:00
  • 56111bb823 Merge branch 'develop' into feature/conjugate-bc-dirs Peter Boyle 2021-01-14 21:01:22 -05:00
  • 99445673f6 Gparity fix, and plaquette IO Peter Boyle 2021-01-14 21:00:36 -05:00
  • 97a59643f7 Red black coarse space Peter Boyle 2021-01-14 20:49:13 -05:00
  • 579595f547 Red black on coarse space Peter Boyle 2021-01-14 20:48:35 -05:00
  • 281ac5fc12 Red black support on coarse Peter Boyle 2021-01-14 20:48:08 -05:00
  • d8fa903b02 G5 on coarse spaces Peter Boyle 2021-01-14 20:47:28 -05:00
  • eaff0f3aeb Gamma5 on coarse spaces Peter Boyle 2021-01-14 20:46:58 -05:00
  • e8e20c01b2 Coarsened vector test Peter Boyle 2021-01-14 20:46:21 -05:00
  • a4afc3ea2a Red black coarse space Peter Boyle 2021-01-14 20:44:16 -05:00
  • fa12b9a329 bugfix Felix Erben 2021-01-13 10:04:17 +00:00
  • 45fc7ded3a test for sum Felix Erben 2021-01-12 09:10:37 +00:00
  • 74de2d9742 whitespace changes Felix Erben 2021-01-08 18:28:36 +00:00
  • e759367d42 tested and working Felix Erben 2021-01-08 18:04:50 +00:00
  • 1b84f59273 Added a mixed precision multishift algorithm for which the matrix multiplies are performed in single precision but the search directions are accumulated in double precision. A reliable update step is performed at a tunable frequency to correct the residual. A final mixed-prec single-shift solve is performed on each pole to perform cleanup if necessary. A test is provided to demonstrate the algorithm. Christopher Kelly 2021-01-06 12:21:30 -05:00
  • 1fb41a4300 Added copyLane function to Tensor_extract_merge.h which copies one lane of data from an input tensor object to a different lane of an output tensor object of potentially different precision. precisionChange lattice function now uses copyLane to remove need for temporary scalar objects, reducing register footprint and significantly improving performance. Christopher Kelly 2021-01-06 11:50:56 -05:00
  • 287bac946f ConjugateGradientMixedPrec now stores final true residual and uses the precisionChange workspaces for improved efficiency Christopher Kelly 2021-01-06 09:50:41 -05:00
  • 80c14be65e Added core test to check precision change Christopher Kelly 2021-01-06 09:34:44 -05:00
  • d7a2a4852d Reimplemented precisionChange to run on GPUs. A workspace containing the mapping table can be optionally precomputed and reused for improved performance. Christopher Kelly 2021-01-06 09:30:49 -05:00
  • d185f2eaa7 OneFlavourEvenOddRatioRationalPseudoFermionAction now derives from GeneralEvenOddRatioRationalPseudoFermionAction, simply performs transcription of parameters Christopher Kelly 2020-12-23 16:26:10 -05:00
  • 813d4cd900 Added test program that ensures the generic checkerboarded RHMC (with parameters set appropriately) gives the same answer as the existing 1f code Christopher Kelly 2020-12-23 16:01:42 -05:00
  • 75c6c6b173 General RHMC pseudofermion action now allows for different rational approximations to be used in the MD and action evaluation Christopher Kelly 2020-12-23 11:19:26 -05:00
  • 299d0de066 Merge pull request #21 from paboyle/develop Christoph Lehner 2020-12-22 20:59:15 +01:00
  • 220ad5e3ee Added more verbose log output to GeneralEvenOddRatioRationalPseudoFermionAction. In GeneralEvenOddRatioRationalPseudoFermionAction, setting the bounds check frequency to 0 now disables the check. Christopher Kelly 2020-12-22 11:08:22 -05:00
  • ba5dc670a5 Reimplemented GparityWilsonImpl::InsertForce5D to run efficiently on GPUs. Swapped order of templated tensor code and c-number specializations in Tensor_outer.h to fix compile issue with type deduction on Summit. Christopher Kelly 2020-12-22 10:10:07 -05:00
  • 3fe75bc7cb Merge pull request #329 from nmeyer-ur/feature/a64fx-3 Peter Boyle 2020-12-20 08:17:15 -05:00
  • 45d49d8648 clean up Nils Meyer 2020-12-19 03:35:18 +01:00
  • 6013183361 removed Asm impls Nils Meyer 2020-12-19 03:25:01 +01:00
  • 4b882e8056 fixed lost bracket Nils Meyer 2020-12-19 03:09:20 +01:00
  • 3f9ae6e7e7 Merge branch 'develop' into feature/a64fx-3 Nils Meyer 2020-12-19 02:37:11 +01:00
  • 909acd55cd vnum variant for prefetches Nils Meyer 2020-12-19 02:00:22 +01:00
  • 4dd9e39e0d up to +36% performance gain for dslash/dwf on QPACE 4 using GCC 10.1.1 Nils Meyer 2020-12-19 00:54:31 +01:00
  • b4c1317ab4 Merge pull request #22 from DanielRichtmann/feature/clover-access-specifier Christoph Lehner 2020-12-18 16:20:19 +01:00