mirror of https://github.com/paboyle/Grid.git synced 2025-08-17 03:31:54 +01:00

Commit Graph

  • 77fa586f6c introduced A64FX Wilson kernels nmeyer-ur 2020-04-09 13:30:06 +02:00
  • 96e8e44fd4 Merge pull request #2 from DanielRichtmann/feature/fused-innerproduct-norm2 Christoph Lehner 2020-04-06 13:16:58 +02:00
  • 5fc8a273e7 Fused innerProduct + norm2 on first argument operation Daniel Richtmann 2020-04-06 11:30:50 +02:00
  • d671a63e78 Update README.md Antonin Portelli 2020-04-03 19:52:15 +01:00
  • 15238e8d5e reduce acle works, clean up nmeyer-ur 2020-04-03 20:40:44 +02:00
  • b27e31957a reduce acle revised nmeyer-ur 2020-04-03 19:46:15 +02:00
  • 46927771e3 reduce acle still needs overhaul nmeyer-ur 2020-04-03 19:30:48 +02:00
  • d8cea77707 define simd width in header nmeyer-ur 2020-04-03 19:22:25 +02:00
  • 5f8a76d490 clean up, reduction in acle nmeyer-ur 2020-04-03 19:18:24 +02:00
  • 28d49a3b60 build problem resolved nmeyer-ur 2020-04-03 16:52:48 +02:00
  • b4c624ece6 added A64FX support nmeyer-ur 2020-04-03 15:43:23 +02:00
  • 2c22db841a Added momentum scaling to scalar HMC theories in order to follow UKQCD/CPS conventions Henrique B.R 2020-04-02 17:38:47 +01:00
  • b89b1280d5 use gemm twice to complete the Gram Schmidt Yong-Chull Jang 2020-03-31 05:39:31 -04:00
  • ac7090e6d3 block Lanczos cublas buffer is set at the initial step; buffer width is fixed to the block size, then cublas Zgemm is called multiple times Yong-Chull Jang 2020-03-30 22:25:50 -04:00
  • 02edbe624f first working version of Gram Schmidt using cublas gemm; explicit data type and site vector size has to be removed Yong-Chull Jang 2020-03-30 18:36:21 -04:00
  • 856d168e41 global sum over vectors of uint64_t Christoph Lehner 2020-03-29 07:56:05 -04:00
  • 6235c7ba98 IPP path fix in configure Antonin Portelli 2020-03-27 17:23:29 +00:00
  • 7e13724882 removing Hadrons Antonin Portelli 2020-03-27 12:03:32 +00:00
  • b6cbdd2aa3 Merge pull request #1 from DanielRichtmann/feature/read-openqcd Christoph Lehner 2020-03-26 17:39:04 +01:00
  • a2188ea875 remove debugging printf from WilsonKernelsImplementation Christoph Lehner 2020-03-26 09:12:36 -04:00
  • 9266b89ad8 fix rngs issue; block Lanczos is working Yong-Chull Jang 2020-03-25 15:45:50 -04:00
  • 989af65807 Check in parallel reader for openqcd configs Daniel Richtmann 2020-03-23 17:33:18 +01:00
  • 2db7e6f8ab merge manually Block Lanczos files from Chulwoo's update (last state = commit 731a05 + untracked files) to develop branch; namespace QCD is removed; FIXME: multiple starting vectors result in nan after initial orthogonalization Yong-Chull Jang 2020-03-24 01:03:24 -04:00
  • 60db3133d3 make trace,adj,transpose unary operators Christoph Lehner 2020-03-16 17:59:56 -04:00
  • c9b737a4e7 make trace,adj,transpose unary operators Christoph Lehner 2020-03-16 17:58:30 -04:00
  • 037bb6ea73 Check in reader for openqcd configs Daniel Richtmann 2020-03-16 14:07:52 +01:00
  • 05ebc458e2 Merge pull request #260 from mmphys/feature/distil Antonin Portelli 2020-03-13 14:00:21 +00:00
  • 3753508957 Making change 1) as simple as possible 2) as much like MSink/Point.hpp as possible Michael Marshall 2020-03-12 13:47:51 +00:00
  • c1677fccf6 Merge branch 'develop' into feature/distil Michael Marshall 2020-03-12 12:45:18 +00:00
  • 35e8e31749 Merge pull request #272 from mmphys/feature/ZPeramb Antonin Portelli 2020-03-12 12:28:04 +00:00
  • 34813e9b04 Merge branch 'develop' into feature/ZPeramb Antonin Portelli 2020-03-12 12:27:56 +00:00
  • 373cf61abb bugfix ZPerambulator Felix Erben 2020-03-12 11:44:43 +00:00
  • 4e8fbc4b49 Merge pull request #271 from mmphys/feature/ZDistil Antonin Portelli 2020-03-12 10:54:07 +00:00
  • 516ac1d4d5 registered module supporting ZMobius action ferben 2020-03-12 10:52:27 +00:00
  • 318f63eb34 Merge pull request #268 from mmphys/a2a-error-log Antonin Portelli 2020-03-11 11:09:00 +00:00
  • 16503d7532 Merge pull request #267 from mmphys/feature/distil-bugfix Antonin Portelli 2020-03-11 11:08:23 +00:00
  • 0fa93383b7 changed to push_back according to request ferben 2020-03-11 09:05:01 +00:00
  • 0a827aa7bf Added Hadrons_Error in case blockSize is set too large ferben 2020-03-11 08:52:52 +00:00
  • 165c68e28e Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift Carleton DeTar 2020-02-29 17:51:51 -06:00
  • b32b1ca642 bugfix in perambulator module ferben 2020-02-26 12:06:45 +00:00
  • 9479bc8486 Make IterationsToComplete and TrueResidual externally accessible Carleton DeTar 2020-02-19 17:43:57 -06:00
  • 8a5c13d5fb Still fast moving in changes Peter Boyle 2020-02-06 17:57:26 -05:00
  • bdccb0c91f Working 2 types of decomposition Peter Boyle 2020-02-06 17:26:55 -05:00
  • 68b45f6444 Lower left/upper right region cut paste Peter Boyle 2020-02-06 15:50:26 -05:00
  • ef9b3e658a extra typedef Peter Boyle 2020-02-06 15:47:14 -05:00
  • b9ca40cc44 More precise power method at start Peter Boyle 2020-02-06 10:09:14 -05:00
  • 2f421a5db1 Comment fix Peter Boyle 2020-02-06 10:08:27 -05:00
  • 10192dfc71 Wall source momenta must be specified for spatial components only. So we don't break existing scripts, allow momentum in time direction as well, but only if zero. Fail early, so do the check in setup() Michael Marshall 2020-01-31 15:02:03 +00:00
  • c69a3b6ef6 When saving eigenvectors, LapEvec now saves eigenvalues for every timeslice as well. I.e. nT x nVec eigenvalues are saved in FileName.evals.conf.h5. A new named tensor, "TimesliceEvals" can be used to simplify restoring these from disk. NB: The changes in BaseIO add support so that Eigen tensors can be easily used in MPI operations, e.g. GlobalSum. See LapEvec.hpp for an example of how this is done. Michael Marshall 2020-01-29 21:20:20 +00:00
  • 852fc1b001 True Hierarchical multigrid for DWF Peter Boyle 2020-01-27 13:45:10 -05:00
  • 2b5de5bba5 MdagM operator without norm option Peter Boyle 2020-01-27 13:44:30 -05:00
  • 2e85cae74e Add Jacobi polynomials Peter Boyle 2020-01-27 13:43:49 -05:00
  • 76c823781e Much faster coarsening Peter Boyle 2020-01-27 13:43:19 -05:00
  • 114db3b99d Optional MdagM without norms Peter Boyle 2020-01-27 13:42:51 -05:00
  • 49e123dbda Use explicit linalg calls to get coalesce optimisations on GPU Peter Boyle 2020-01-27 12:44:51 -05:00
  • 8cec294ec9 Make CG a bit less verbose as it was getting annoying in nested algorithms. Can use Iterative logging if you want to see more Peter Boyle 2020-01-27 12:44:04 -05:00
  • eb5b720e94 Normal Equations can be used in HDCR now Peter Boyle 2020-01-27 12:43:29 -05:00
  • b2736ec80b Make PrecGCR recursive - it can precondition itself Peter Boyle 2020-01-27 12:42:48 -05:00
  • 086256a032 Less sloppy convergence test on PowerMethod Peter Boyle 2020-01-27 12:41:59 -05:00
  • afc7426f39 Much bigger pointer cache in case of Nvidia due to cost of setting up UVM allocations Peter Boyle 2020-01-27 12:41:16 -05:00
  • 7c061e20c9 All directions of dirac operator for fast coarsening Peter Boyle 2020-01-27 12:40:13 -05:00
  • e5d1c09665 Faster DhopDirAll for little dirac operator coarsening Peter Boyle 2020-01-27 12:38:54 -05:00
  • 8016a465ae Remove extraneous variable Peter Boyle 2020-01-27 12:35:37 -05:00
  • d8b9742092 DhopDirAll for faster matrix elements of little Dirac operator Peter Boyle 2020-01-27 12:34:54 -05:00
  • 1bd87c35d7 Read coalescing on Nvidia Peter Boyle 2020-01-27 12:29:56 -05:00
  • fa856c9669 Disable information message Peter Boyle 2020-01-27 12:28:46 -05:00
  • 48008e4d8b Thread coordinate creation loop Peter Boyle 2020-01-27 12:28:16 -05:00
  • 55cdb17691 Integer divide for blocking Peter Boyle 2020-01-27 12:27:45 -05:00
  • 2ed39ebb7a Perambulator won't even allocate memory for unsmeared sinks unless the filename is specified. Prior to this update, memory is allocated regardless of whether these are requested. Michael Marshall 2020-01-24 13:01:06 +00:00
  • 96671bbb24 Added ability to pass callback to MADWF that is called every inner iteration and allows user to, for example, adjust the inner solver tolerance depending on residual. Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximation with optional restriction to even/odd polynomials. Added implementation of computation of ZMobius parameters. Added Test_zMADWF_prec to test ZMobius in MADWF. Christopher Kelly 2020-01-17 12:45:30 -08:00
  • 554542b773 Merge branch 'feature/hdcr' of https://github.com/paboyle/Grid into feature/hdcr Peter Boyle 2020-01-06 11:47:56 -05:00
  • 03da4040e2 Make summit happy Peter Boyle 2020-01-06 11:47:48 -05:00
  • e583035614 Change to interface to minimise comms in evaluating coarse space operator Peter Boyle 2020-01-06 11:43:59 -05:00
  • 3c3d6a94f3 Optimising the force term a bit Peter Boyle 2020-01-04 03:16:23 -05:00
  • 205ea4bbb2 More verbose Lanczos Peter Boyle 2020-01-04 03:13:40 -05:00
  • 039eb7b2eb Make the force term and coarsening multigrid more optimised Peter Boyle 2020-01-04 03:12:17 -05:00
  • f7e4bd1f6d Getting more optimised Peter Boyle 2020-01-04 03:11:53 -05:00
  • 0afecfcae7 Nearing well optimised state Peter Boyle 2020-01-04 03:11:19 -05:00
  • ba40a3f763 Alternate low pass filter option Peter Boyle 2020-01-03 05:29:09 -05:00
  • aa920aa532 Improved DWF multigrid Peter Boyle 2019-12-28 10:32:35 -05:00
  • c0d8e4dce5 Improved Multigrid for DWF Peter Boyle 2019-12-28 10:32:15 -05:00
  • 0ca1992151 Remove warning in tensor layout comparison. Make default names and index names visible for PerambTensor and NoiseTensor Michael Marshall 2019-12-20 13:53:27 +00:00
  • df2b0c4e79 Merge branch 'develop' into feature/distil Michael Marshall 2019-12-20 13:24:59 +00:00
  • 9cfd64c604 Coarse grid on GPU, not fast enough yet. Need a 10x Peter Boyle 2019-12-17 05:24:45 -05:00
  • e478404291 Tuned up significantly on GPU, but another 10x in coarse space required Peter Boyle 2019-12-17 05:03:25 -05:00
  • 9aafd20468 Simple block project promote runs faster on GPU Peter Boyle 2019-12-17 05:01:39 -05:00
  • 5d834486c9 Merge pull request #259 from grid-test-organisation/feature/5d-improvement-fix Peter Boyle 2019-12-16 04:20:37 -05:00
  • f7373e97a4 Missing conjugate in MooeeInvDag gfilaci 2019-12-16 10:04:44 +01:00
  • 9e15474999 Accelerator loop attempt at speed up Peter Boyle 2019-12-14 05:28:16 -05:00
  • 152b525a4d Typo fix Peter Boyle 2019-12-13 22:44:42 -05:00
  • d18994eddc offload more of mgrid to GPU Peter Boyle 2019-12-13 22:08:11 -05:00
  • b8bd8cd2ae Merge branch 'develop' of https://github.com/paboyle/Grid into develop Peter Boyle 2019-12-13 21:32:10 -05:00
  • 736b19485e Faster set up and some dead code ifdef'ed out Peter Boyle 2019-12-13 21:30:48 -05:00
  • c7637a84ad Documentation tweak for peculiarities of OpenMPI --prefix Michael Marshall 2019-12-12 17:00:03 +00:00
  • a7772c827b Documentation tweak Michael Marshall 2019-12-12 16:05:22 +00:00
  • 8e83398861 Merge pull request #257 from AndrewYongZhenNing/develop Antonin Portelli 2019-12-11 21:36:59 +00:00
  • 843ca9350a Fix naming conventions to be consistent with Peter David Murphy 2019-12-11 11:46:18 -05:00
  • f47b2b6e13 Added NamedTensor.hpp Andrew Zhen Ning Yong 2019-12-11 15:56:46 +00:00
  • 5bfd1470ad Merge branch 'develop' into feature/hdcr Peter Boyle 2019-12-10 21:51:06 -05:00
  • 6957b0b58a Merge branch 'develop' of https://github.com/paboyle/Grid into develop Peter Boyle 2019-12-10 21:50:42 -05:00