mirror of https://github.com/paboyle/Grid.git synced 2025-06-23 10:12:02 +01:00

Commit Graph

  • 345721220e resolved merge conflict nils meyer 2020-04-24 10:14:21 +02:00
  • 6db68d6ecb added SVE configure for armclang and gcc nils meyer 2020-04-24 10:10:47 +02:00
  • dae820aa96 Merge pull request #277 from mmphys/bugfix/grid-config Peter Boyle 2020-04-23 10:26:54 -04:00
  • 5daf176f4a Updated to expose GRID_CXXLD in addition to CXXLD. NB: CXXLD required as this is what drives linking behaviour. Michael Marshall 2020-04-23 15:25:53 +01:00
  • e96c86ec14 Make grid-config message more specific for --cxx and --cxxld Michael Marshall 2020-04-23 13:10:45 +01:00
  • 09f0963d1f changes in configure.ac ; to be verified nmeyer-ur 2020-04-23 11:27:03 +02:00
  • 6f44e3c192 reverted changes in configure.ac ; included SVE configure readme nils meyer 2020-04-23 11:18:50 +02:00
  • c2c3cad20d Merge branch 'develop' of https://github.com/paboyle/Grid into develop Peter Boyle 2020-04-23 04:35:42 -04:00
  • edec9ee2e2 Conserved current rewrite done. Zmobius working Peter Boyle 2020-04-23 04:34:01 -04:00
  • ed70cce542 Test for 5D DWF obserevables Peter Boyle 2020-04-23 04:29:45 -04:00
  • 4701201b5f grid-config: Expose CXXLD (for GPU build) and update help Michael Marshall 2020-04-22 18:42:30 +01:00
  • 5893888f87 removed default no-strict-aliasing for gcc-10.0.1 exclusively nils meyer 2020-04-22 19:29:55 +02:00
  • 39b448affb Merge remote-tracking branch 'origin/develop' into feature/a64fx-2 nmeyer-ur 2020-04-22 17:34:12 +02:00
  • e54a8f05a9 Exchange1 with generic version for now, should use svtbl2 in final version nils meyer 2020-04-20 22:45:27 +02:00
  • 0782b76ed4 Merge pull request #274 from paboyle/feature/zmobius_paramcompute Peter Boyle 2020-04-20 14:39:29 -04:00
  • 0896f2cead Added missing include guards in bigfloat_double.h Christopher Kelly 2020-04-20 10:30:38 -04:00
  • 181709bba4 Merge branch 'develop' into feature/zmobius_paramcompute Christopher Kelly 2020-04-20 09:12:34 -04:00
  • 64b72fc17f testing gcc 10.0.1: build errors in Exchange1 using -DA64FX and in Lattice_base.h building Dslash only nils meyer 2020-04-19 01:25:40 +02:00
  • 091d5c605e towards more precise blocking Christoph Lehner 2020-04-17 04:25:28 -04:00
  • 6fdce60492 revised BodyA64FX; 990 GiB/s Wilson, 687 GiB/s DW using intrinsics (armclang 20.0) nils meyer 2020-04-16 22:43:32 +02:00
  • 90229cfb0f Merge pull request #270 from milc-qcd/feature/CGinfo Peter Boyle 2020-04-16 11:46:08 -04:00
  • 0475c46ecb Merge pull request #256 from djm2131/feature/BiCGSTAB Peter Boyle 2020-04-16 11:45:15 -04:00
  • 3cca10e617 Merge pull request #276 from nils-asmussen/fix/regression_nt Peter Boyle 2020-04-16 11:42:39 -04:00
  • 327da332bb Merge branch 'develop' of https://github.com/paboyle/Grid into feature/gpt Christoph Lehner 2020-04-16 11:30:17 -04:00
  • 852db4626a re-introduced HOTFIX cause Grid binaries give wrong results otherwise; checked in good gridverter.py nils meyer 2020-04-15 18:22:19 +02:00
  • 43dc2814dd fix regression in core/Test_qed.cc Nils Asmussen 2020-04-15 16:10:15 +01:00
  • 6504a098cc 999 GiB/s Wilson; 694 GiB/s DW (DP) nils meyer 2020-04-15 15:06:52 +02:00
  • 79a385faca disabled armclang hotfix cause armclang 20.0 performance gets a little nils meyer 2020-04-15 11:46:55 +02:00
  • c12a67030a 980 GiB/s Wilson; 680 GiB/s DW (DP) nils meyer 2020-04-15 10:55:06 +02:00
  • 581392f2f2 now with pf, best results so far using intrinsics+pf nils meyer 2020-04-12 22:06:14 +02:00
  • 113f277b6a enable dslash asm using -DA64FXASM, additionaly -DDSLASHINTRIN for intrinsics impl nils meyer 2020-04-11 04:55:01 +02:00
  • f3a8d039a2 Merge branch 'feature/hdcr' into develop Peter Boyle 2020-04-10 22:01:52 -04:00
  • 974586bedc Dslash finally works; cleaned up; uses MOVPRFX in assembly nils meyer 2020-04-10 22:26:40 +02:00
  • 4e864e56c9 develop pull Antonin Portelli 2020-04-10 17:19:18 +01:00
  • 014dbfa464 Compile fix with OpDirAll Peter Boyle 2020-04-10 11:57:09 -04:00
  • 3b0e07882f Adding another form of polynomial Peter Boyle 2020-04-10 11:28:33 -04:00
  • 8e81a811d0 Merge branch 'feature/hdcr' into develop Peter Boyle 2020-04-10 11:14:49 -04:00
  • aa13118127 Missing conjugate already fixed in develop feature/hdcr Peter Boyle 2020-04-10 11:11:24 -04:00
  • 6cdb09c884 Faster copy region Peter Boyle 2020-04-10 11:10:52 -04:00
  • a65bc64f10 Accelerator peek poke Peter Boyle 2020-04-10 11:09:59 -04:00
  • 11dec4883c Don't throw assert Peter Boyle 2020-04-10 11:09:11 -04:00
  • afa458c812 Extra solvers Peter Boyle 2020-04-10 11:08:19 -04:00
  • dc50190b8f Faster GPU basis rotation May need to later include Regensburg optimised CPU variant Peter Boyle 2020-04-10 11:06:04 -04:00
  • 160f78c1e4 changed debug output to variable direct 3 nmeyer-ur 2020-04-10 12:23:07 +02:00
  • 7e4e1bbbc2 changed debug output to variable direct 2 nmeyer-ur 2020-04-10 12:22:04 +02:00
  • e699b7e9f9 changed debug output to variable direct nmeyer-ur 2020-04-10 12:18:30 +02:00
  • a28bc0de90 debug register address test in WilsonHand nmeyer-ur 2020-04-10 12:07:45 +02:00
  • 14d0fe4d6c added predication in WilsonHand nmeyer-ur 2020-04-10 12:04:00 +02:00
  • 0ad2e0815c debug output in WilsonHand nmeyer-ur 2020-04-10 11:56:29 +02:00
  • 1c8ca05e16 Merge branch 'feature/a64fx-2' of https://github.com/nmeyer-ur/Grid into feature/a64fx-2 nils meyer 2020-04-09 23:32:19 +02:00
  • dc9c8340bb switched to DSLASHINTRIN for A64FX Dslash intrinsics nils meyer 2020-04-09 23:30:23 +02:00
  • 19eef97503 specialized A64FX Dslash kernels nils meyer 2020-04-09 23:25:25 +02:00
  • 635246ce50 corrected typo nmeyer-ur 2020-04-09 21:42:50 +02:00
  • 5cdbb7e71e fixed A64FX Dslash; compiles, but does not specialize -> assertion nils meyer 2020-04-09 21:23:39 +02:00
  • 8123590a1b changes nmeyer-ur 2020-04-09 16:45:47 +02:00
  • 86c9c4da8b changes nmeyer-ur 2020-04-09 16:40:06 +02:00
  • cd1efee866 changes nmeyer-ur 2020-04-09 16:35:13 +02:00
  • bd310932f7 changes nmeyer-ur 2020-04-09 16:32:31 +02:00
  • 304762e7ac changes nmeyer-ur 2020-04-09 16:26:01 +02:00
  • d79ab03a6c changes nmeyer-ur 2020-04-09 16:19:25 +02:00
  • d5708e0eb2 more changes nmeyer-ur 2020-04-09 15:43:34 +02:00
  • 123f6b7a61 more changes nmeyer-ur 2020-04-09 15:17:19 +02:00
  • 2b6457dd9a added xp/xm recon accum nmeyer-ur 2020-04-09 15:13:19 +02:00
  • b367cbd422 defined ADD_RESULT nmeyer-ur 2020-04-09 15:08:45 +02:00
  • e252c1aca3 addressing nmeyer-ur 2020-04-09 15:03:12 +02:00
  • b140c6a4f9 addressing nmeyer-ur 2020-04-09 15:01:15 +02:00
  • 326de36467 revised sU addressing scheme nmeyer-ur 2020-04-09 14:44:25 +02:00
  • 9f224a1647 fixed typo in single nmeyer-ur 2020-04-09 14:30:21 +02:00
  • bb46ba9b5f fixed array size in single nmeyer-ur 2020-04-09 14:28:45 +02:00
  • dd5a22b36b revised declarations nmeyer-ur 2020-04-09 14:21:27 +02:00
  • 1ea85b9972 Disabled build message nmeyer-ur 2020-04-09 13:47:21 +02:00
  • 8fb63f1c25 added A64FX Wilson kernels single precision nmeyer-ur 2020-04-09 13:41:04 +02:00
  • 77fa586f6c introduced A64FX Wilson kernels nmeyer-ur 2020-04-09 13:30:06 +02:00
  • 96e8e44fd4 Merge pull request #2 from DanielRichtmann/feature/fused-innerproduct-norm2 Christoph Lehner 2020-04-06 13:16:58 +02:00
  • 5fc8a273e7 Fused innerProduct + norm2 on first argument operation Daniel Richtmann 2020-04-06 11:30:50 +02:00
  • d671a63e78 Update README.md Antonin Portelli 2020-04-03 19:52:15 +01:00
  • 15238e8d5e reduce acle works, clean up nmeyer-ur 2020-04-03 20:40:44 +02:00
  • b27e31957a reduce acle revised nmeyer-ur 2020-04-03 19:46:15 +02:00
  • 46927771e3 reduce acle still needs overhaul nmeyer-ur 2020-04-03 19:30:48 +02:00
  • d8cea77707 define simd width in header nmeyer-ur 2020-04-03 19:22:25 +02:00
  • 5f8a76d490 clean up, reduction in acle nmeyer-ur 2020-04-03 19:18:24 +02:00
  • 28d49a3b60 build problem resolved nmeyer-ur 2020-04-03 16:52:48 +02:00
  • b4c624ece6 added A64FX support nmeyer-ur 2020-04-03 15:43:23 +02:00
  • 2c22db841a Added momentum scaling to scalar HMC theories in order to follow UKQCD/CPS conventions Henrique B.R 2020-04-02 17:38:47 +01:00
  • b89b1280d5 use gemm twice to complete the Gram Schmidt Yong-Chull Jang 2020-03-31 05:39:31 -04:00
  • ac7090e6d3 block Lanczos cublas buffer is set at the inital step; buffer width is fixed to the block size then cublas Zgemm is called multiple times Yong-Chull Jang 2020-03-30 22:25:50 -04:00
  • 02edbe624f first working version of Gram Schmidt using cublas gemm; explicit data type and site vector size has to be removed Yong-Chull Jang 2020-03-30 18:36:21 -04:00
  • 856d168e41 global sum over vectors of uint64_t Christoph Lehner 2020-03-29 07:56:05 -04:00
  • 6235c7ba98 IPP path fix in configure Antonin Portelli 2020-03-27 17:23:29 +00:00
  • 7e13724882 removing Hadrons Antonin Portelli 2020-03-27 12:03:32 +00:00
  • b6cbdd2aa3 Merge pull request #1 from DanielRichtmann/feature/read-openqcd Christoph Lehner 2020-03-26 17:39:04 +01:00
  • a2188ea875 remove debugging printf from WilsonKernelsImplementation Christoph Lehner 2020-03-26 09:12:36 -04:00
  • 9266b89ad8 fix rngs issue; block Lanczos is working Yong-Chull Jang 2020-03-25 15:45:50 -04:00
  • 989af65807 Check in parallel reader for openqcd configs Daniel Richtmann 2020-03-23 17:33:18 +01:00
  • 2db7e6f8ab merge manually Block Lanczos files from Chulwoo's update (last state = commit 731a05 + untracked files) to develop branch; namespace QCD is removed; FIXME: multiple starting vectors result in nan after initial orthogonalization Yong-Chull Jang 2020-03-24 01:03:24 -04:00
  • 60db3133d3 make trace,adj,transpose unary operators Christoph Lehner 2020-03-16 17:59:56 -04:00
  • c9b737a4e7 make trace,adj,transpose unary operators Christoph Lehner 2020-03-16 17:58:30 -04:00
  • 037bb6ea73 Check in reader for openqcd configs Daniel Richtmann 2020-03-16 14:07:52 +01:00
  • 05ebc458e2 Merge pull request #260 from mmphys/feature/distil Antonin Portelli 2020-03-13 14:00:21 +00:00
  • 3753508957 Making change 1) as simple as possible 2) as much like MSink/Point.hpp as possible Michael Marshall 2020-03-12 13:47:51 +00:00