mirror of https://github.com/paboyle/Grid.git synced 2026-04-01 17:56:08 +01:00

Commit Graph

  • 21371a7e5b Tracing replaces self timing Peter Boyle 2022-08-31 17:16:05 -04:00
  • abfaa00d3e Tracing replaces self timing Peter Boyle 2022-08-31 17:15:24 -04:00
  • efee33c55d Tracing replaces self timing Peter Boyle 2022-08-31 17:14:57 -04:00
  • db0fe6ddbb Tracing replaces self timing Peter Boyle 2022-08-31 17:14:14 -04:00
  • 8a9e647120 Tracing replaces self timing Peter Boyle 2022-08-31 17:13:44 -04:00
  • e6dcb821ad Tracing replaces self timing Peter Boyle 2022-08-31 17:12:31 -04:00
  • 9bff188f02 Tracing replaces self timing Peter Boyle 2022-08-31 17:12:05 -04:00
  • 111b30ca1d Tracing replaces self timing Peter Boyle 2022-08-31 17:11:48 -04:00
  • 24182ca8bf HIP allows conserved currents. Tracing replaces self timing Peter Boyle 2022-08-31 17:11:18 -04:00
  • ee2d7369b3 Tracing replaces self timing Peter Boyle 2022-08-31 17:10:45 -04:00
  • 7c686d29c9 Tracing replaces self timing Peter Boyle 2022-08-31 17:10:17 -04:00
  • e8a0a1e75d Tracing replaces self timing hooks Peter Boyle 2022-08-31 17:09:47 -04:00
  • 730be89abf Remove timing hooks as tracing replaces Peter Boyle 2022-08-31 17:08:44 -04:00
  • f991ad7d5c Remove timing hooks as tracing replaces Peter Boyle 2022-08-31 17:08:18 -04:00
  • b3f33f82f7 Decrease self timing hooks, use nvtx / roctx type tracing hooks instead Peter Boyle 2022-08-31 17:06:47 -04:00
  • a34a6e059f Logging improvement. Sinitial will be used to improve RHMC terms Peter Boyle 2022-08-31 17:06:08 -04:00
  • 1333319941 Tracing Peter Boyle 2022-08-31 17:00:25 -04:00
  • 9295ed8d20 Print full memory range Peter Boyle 2022-08-31 16:59:51 -04:00
  • 19cc7653fb Tracing Peter Boyle 2022-08-31 16:57:51 -04:00
  • 5752538661 Tracing Peter Boyle 2022-08-31 16:57:32 -04:00
  • ca40a1b00b Tracing Peter Boyle 2022-08-31 16:54:55 -04:00
  • 659fac9dfb Tracing hook Peter Boyle 2022-08-31 16:54:25 -04:00
  • 4dc3d6fce0 Buy into Nvidia/Rocm etc... tracing. Peter Boyle 2022-08-31 16:53:19 -04:00
  • 60dfb49afa Remove FP16 tests when FP16 is disabled Gurtej Kanwar 2022-08-21 17:29:55 +02:00
  • 554c238359 Update OpenSSL digest to use high-level methods Gurtej Kanwar 2022-08-21 17:28:57 +02:00
  • f922adf05e Fix Photon ComplexField type Gurtej Kanwar 2022-08-21 16:16:18 +02:00
  • 95b640cb6b 10TF/s on 32^3 x 64 on single node Peter Boyle 2022-08-04 15:43:52 -04:00
  • 2cb5bedc15 Copy stream HIP improvements Peter Boyle 2022-08-04 15:24:03 -04:00
  • 806b02bddf Simplify dead code Peter Boyle 2022-08-04 15:23:13 -04:00
  • de40395773 More timing. Think I should start to use nvtx and rocmtx ?? Peter Boyle 2022-08-04 13:37:16 -04:00
  • 7ba4788715 Fix Peter Boyle 2022-08-04 13:36:44 -04:00
  • 06d9ce1a02 Synch ranks on node here for GPU - GPU memcopy Peter Boyle 2022-08-04 13:35:56 -04:00
  • 75bb6b2b40 Move barrier into the StencilSend begin routine Peter Boyle 2022-08-04 13:35:26 -04:00
  • 74f10c2dc0 Move barrier into Stencil Send Peter Boyle 2022-08-04 13:34:11 -04:00
  • 188d2c7a4d PVC default, ignore ATS Peter Boyle 2022-08-02 08:38:53 -07:00
  • 17d7177105 Files for SYCL Peter Boyle 2022-08-02 08:33:39 -07:00
  • bb0a0da47a Non blocking caution due to SYCL Peter Boyle 2022-08-02 08:09:43 -07:00
  • 84110166e4 Fix the fence Peter Boyle 2022-08-02 08:00:43 -07:00
  • d32b923b6c Fencing on a stream in SYCL is needed. Didn't know that ... gulp Peter Boyle 2022-08-02 07:58:04 -07:00
  • a93d5459d4 Better mpi request completion Peter Boyle 2022-07-28 12:18:35 -04:00
  • 9c21add0c6 High res timer replaces gettimeofday Peter Boyle 2022-07-28 12:14:03 -04:00
  • 639aab6563 High res timer instead of gettimeofday Peter Boyle 2022-07-28 12:13:35 -04:00
  • 8137cc7049 Always concurrent comms Peter Boyle 2022-07-28 12:01:51 -04:00
  • 60e63dca1d Add memory logging channel Peter Boyle 2022-07-28 11:39:15 -04:00
  • 486409574e Expanded cache to avoid any allocs in HMC Peter Boyle 2022-07-28 11:38:34 -04:00
  • a913b8be12 Dslash self timing. Might want to not have this Peter Boyle 2022-07-28 11:37:55 -04:00
  • 2239751850 Better logging Peter Boyle 2022-07-28 11:37:36 -04:00
  • 9b20f1449c Better timing Peter Boyle 2022-07-28 11:37:12 -04:00
  • b99453083d Updated timing Peter Boyle 2022-07-28 11:37:02 -04:00
  • 2ab1af5754 Ensure no synchronize and not option dependent Peter Boyle 2022-07-19 09:51:06 -07:00
  • 5f8892bf03 Mistake pointed out by Camilo Peter Boyle 2022-07-19 09:31:51 -07:00
  • f14e7e51e7 Grid accelerator Peter Boyle 2022-07-12 10:56:22 -07:00
  • 943fbb914d Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet Peter Boyle 2022-07-11 13:48:42 -04:00
  • ca4603580d Verbose Peter Boyle 2022-07-11 13:48:35 -04:00
  • f73db8f1f3 Synch clocks Peter Boyle 2022-07-11 13:47:39 -04:00
  • f7217d12d2 World barrier for clock synch Peter Boyle 2022-07-11 13:45:31 -04:00
  • fab50c57d9 More logging Peter Boyle 2022-07-11 18:42:27 +01:00
  • 3440534fbf MixedPrec support Peter Boyle 2022-07-10 21:35:18 +01:00
  • 177b1a7ec6 Mixed prec Peter Boyle 2022-07-10 21:34:10 +01:00
  • 58182fe345 Different approach to default dirichlet params Peter Boyle 2022-07-10 21:32:58 +01:00
  • 1f907d330d Different default params for dirichlet Peter Boyle 2022-07-10 21:31:48 +01:00
  • b0fe664e9d Better force log info Peter Boyle 2022-07-10 21:31:25 +01:00
  • c0f8482402 Remove SSC marks Peter Boyle 2022-07-07 17:49:36 +01:00
  • 3544965f54 Stream doesn't work Peter Boyle 2022-07-07 17:49:20 +01:00
  • 33e4a0caee Imported changes from feature/gparity_HMC branch:
      - Rework of WilsonFlow class
          - Fixed logic error in smear method where the step index was initialized to 1 rather than 0, resulting in the logged output value of tau being too large by epsilon
          - Previously smear_adaptive would maintain the current value of tau as a class member variable whereas smear would compute it separately; now both methods maintain the current value internally and it is updated by the evolve_step routines. Both evolve methods are now const.
          - smear_adaptive now also maintains the current value of epsilon internally, allowing it to be a const method and also allowing the same class instance to be reused without needing to be reset
          - Replaced the fixed evaluation of the plaquette energy density and plaquette topological charge during the smearing with a highly flexible general strategy where the user can add arbitrary measurements as functional objects that are evaluated at an arbitrary frequency
          - By default the same plaquette-based measurements are performed, but additional example functions are provided where the smearing is performed with different choices of measurement that are returned as an array for further processing
          - Added a method to compute the energy density using the Cloverleaf approach which has smaller discretization errors
      - Added a new tensor utility operation, copyLane, which allows for the copying of a single SIMD lane between two instances of the same tensor type but potentially different precisions
      - To LocalCoherenceLanczos, added the option to compute the high/low eval of the fine operator on every restart to aid in tuning the Chebyshev
      - Added Test_field_array_io which demonstrates and tests a single-file write of an arbitrary array of fields
      - Added Test_evec_compression which generates evecs using Lanczos and attempts to compress them using the local coherence technique
      - Added Test_compressed_lanczos_gparity which demonstrates the local coherence Lanczos for G-parity BCs
      - Added HMC main programs for the 40ID and 48ID G-parity lattices
    Christopher Kelly 2022-07-01 14:10:59 -04:00
  • 1f903d9296 Merge branch 'feature/dirichlet' into feature/dirichlet-gparity Peter Boyle 2022-07-01 12:12:50 -04:00
  • 4df1e0987f Merge branch 'feature/dirichlet-gparity' of https://github.com/paboyle/Grid into feature/dirichlet-gparity Peter Boyle 2022-07-01 09:55:43 -04:00
  • 588c2f3cb1 Faster axpy_norm and innerProduct Peter Boyle 2022-07-01 09:44:58 -04:00
  • bd99fd608c Introduce a non-default stream for compute operations Peter Boyle 2022-07-01 09:42:53 -04:00
  • 57b442d0de Log memory operations Peter Boyle 2022-07-01 09:42:17 -04:00
  • 751a4562d7 Timing improvement Peter Boyle 2022-07-01 09:41:43 -04:00
  • ca66301dee Remove debug Peter Boyle 2022-06-30 14:53:12 -04:00
  • 808bb59206 Mixed prec DD-RHMC Peter Boyle 2022-06-30 13:50:09 -04:00
  • 4b7f51d19d Create a new RNG file Peter Boyle 2022-06-30 13:49:50 -04:00
  • d03152fac4 New file under debug Peter Boyle 2022-06-30 13:49:35 -04:00
  • 137f190258 Dirichlet implementation Peter Boyle 2022-06-30 13:45:07 -04:00
  • 53d01312b3 Rough flop counting, need to add M5D, M5Ddag, MooeeInv flops Peter Boyle 2022-06-30 13:44:09 -04:00
  • 220050822a Speed up M5D and M5Ddag Peter Boyle 2022-06-30 13:43:27 -04:00
  • 87ad76d81b Initialise timeval Peter Boyle 2022-06-30 13:42:46 -04:00
  • 042ab1a052 Update GridStd.h Peter Boyle 2022-06-27 13:21:39 -04:00
  • 4ac1094856 Updated config commands Peter Boyle 2022-06-27 12:16:24 -04:00
  • d44a57b0af Allow frequency=0 to disable Peter Boyle 2022-06-27 12:15:55 -04:00
  • dc000d10ee Spelling correction Peter Boyle 2022-06-27 12:14:57 -04:00
  • 3685f391cf More verbose CG Peter Boyle 2022-06-27 12:11:08 -04:00
  • efd7338a00 Allow dirichlet at round the world link Peter Boyle 2022-06-27 12:10:27 -04:00
  • e1e7b1e224 RNG fix Peter Boyle 2022-06-27 12:09:52 -04:00
  • 7319d4e1ad Merge pull request #407 from giltirn/feature/dirichlet-gparity-stage Peter Boyle 2022-06-22 15:23:36 -04:00
  • fd933420c6 Imported changes from feature/gparity_HMC branch:
      - Added a bounds-check function for the RHMC with arbitrary power
      - Added a pseudofermion action for the rational ratio with an arbitrary power and a mixed-precision variant of the same. The existing one-flavor rational ratio class now uses the general class under the hood
      - To support testing of the two-flavor even-odd ratio pseudofermion, separated the functionality of generating the random field and performing the heatbath step, and added a method to obtain the pseudofermion field
      - Added a new HMC runner start type: CheckpointStartReseed, which reseeds the RNG from scratch, allowing for the creation of new evolution streams from an existing checkpoint.
      - Added log output of seeds used when the RNG is seeded.
      - EOFA changes:
          - To support mixed-precision inversion, generalized the class to maintain a separate solver for the L and R operators in the heatbath (separate solvers are already implemented for the other stages)
          - To support mixed-precision, the action of setting the operator shift coefficients is now maintained in a virtual function. A derived class for mixed-precision solvers ensures the coefficients are applied to both the double and single-prec operators
          - The ||^2 of the random source is now stored by the heatbath and compared to the initial action when it is computed. These should be equal but may differ if the rational bounds are not chosen correctly, hence serving as a useful and free test
          - Fixed calculation of M_eofa (previously incomplete and #if'd out)
          - Added functionality to compute M_eofa^-1 to complement the calculation of M_eofa (both are equally expensive!)
          - To support testing, separated the functionality of generating the random field and performing the heatbath step, and added a method to obtain the pseudofermion field
      - Added a test program which computes the G-parity force using the 1 and 2 flavor implementations and compares the result. Test supports DWF, EOFA and DSDR actions, chosen by a command line option.
      - The Mobius EOFA force test now also checks the rational approximation used for the heatbath
      - Added a test program for the mixed precision EOFA compared to the double-prec implementation
      - G-parity HMC test now applied GPBC in the y direction and not the t direction (GPBC in t are no longer supported) and checkpoints after every configuration
      - Added a test program which computes the two-flavor G-parity action (via RHMC) with both the 1 and 2 flavor implementations and checks they agree
      - Added a test program to check the implementation of M_eofa^{-1}
    Christopher Kelly 2022-06-22 10:27:48 -04:00
  • 8208a6214f Merge branch 'feature/dirichlet-gparity' into feature/dirichlet Peter Boyle 2022-06-15 19:23:48 -04:00
  • 3d8146b596 Merge branch 'feature/dirichlet-gparity' of https://github.com/paboyle/Grid into feature/dirichlet-gparity Peter Boyle 2022-06-15 19:20:27 -04:00
  • 31efa5c4da Script updates for current summit Peter Boyle 2022-06-15 19:19:44 -04:00
  • d10d30dda8 Script update Peter Boyle 2022-06-15 19:18:58 -04:00
  • 0e9666bc92 Test update Peter Boyle 2022-06-15 19:18:42 -04:00
  • 6efd80f104 Printing Peter Boyle 2022-06-15 18:23:46 -04:00
  • fdef7a1a8c Dirichlet fix Peter Boyle 2022-06-15 00:05:20 -04:00
  • 501bb117bf Const correct Peter Boyle 2022-06-15 00:04:09 -04:00
  • 05ca7dc252 Const correctness Peter Boyle 2022-06-14 23:41:05 -04:00
  • e9648a1635 Useful periodic print. CG convergence bound is remarkably accurate on low eigenvalue in numerical tests Peter Boyle 2022-06-14 23:40:04 -04:00
  • 2df98a99bc Merge pull request #406 from giordano/patch-1 Peter Boyle 2022-06-14 17:46:25 -04:00
  • 315ea18be2 Update default value of gen-simd-width in README Mosè Giordano 2022-06-14 22:41:05 +01:00