mirror of https://github.com/paboyle/Grid.git synced 2025-06-22 17:52:02 +01:00
Commit Graph

149 Commits

Author SHA1 Message Date
4d2dc7ba03 Enable even-odd for CoarsenedMatrix 2020-09-11 20:32:02 +02:00
cf3535d16e Expose more functions in CMat 2020-08-27 14:06:48 +02:00
b2087f14c4 Fix CoarsenedMatrix regarding illegal memory accesses
Need a reference to geom since the lambda copies the this pointer which points to host memory, see
- https://docs.nvidia.com/cuda/cuda-c-programming-guide/#star-this-capture
- https://devblogs.nvidia.com/new-compiler-features-cuda-8/
2020-08-24 17:46:47 +02:00
dd1ba266b2 Fix mapping between dir + disp and point in CMat 2020-08-24 17:46:46 +02:00
1292d59563 Add a typedef + broaden interface of CMat 2020-08-24 17:46:45 +02:00
c48da35921 Memory Vector UVM and Lattice alignedAllocator separate 2020-06-22 20:21:53 -04:00
1a74816c25 Hopefully fixed 2020-06-19 17:50:52 -04:00
228fd450ce Typo fix (excuse - my keyboard is starting to break) 2020-06-19 17:36:05 -04:00
b949cf6b12 PeekLocal needs a view to keep thread safe.
ALLOCATION_CACHE reenable
2020-06-19 17:13:27 -04:00
b5e87e8d97 summit compile fixes 2020-06-12 18:16:12 -04:00
cdf0a04fc5 Merge branch 'develop' into sycl 2020-06-09 04:00:12 -04:00
1a4c8c3387 Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. 2020-06-05 18:52:35 -04:00
1c9f20b15e Views must be closed 2020-06-03 09:10:29 -04:00
7860a50f70 Make view specify where and drive data motion - first cut.
This is a compile-time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
82f71643a4 Remove the norm in MdagM 2020-05-12 17:55:53 -04:00
ea08f193e7 Allocator cache split into large/small pools 2020-05-10 05:24:26 -04:00
ab0c5d77fb Correct NonHermitianSchurOperatorBase 2020-05-08 16:44:02 +02:00
f8b8e00090 Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
Aim to reduce the amount of cuda and other code variations floating around all over the place.

Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
1d65e2f62c Slightly faster Chebyshev; ifdef'ed out the fastest until tested numerics
Lifted from HDCR setup
2020-05-08 09:20:54 -04:00
21ca182c36 Comments remove 2020-05-08 09:18:24 -04:00
3c6ffcb48c Merge branch 'develop' into feature/gpt 2020-05-06 15:03:35 +02:00
e9b295f967 Synchronize blocking infrastructure with GPT 2020-05-06 08:42:28 -04:00
9b2d2d0fc3 Basis rotate stack passing to GPU reduction 2020-04-30 12:31:07 -04:00
0896f2cead Added missing include guards in bigfloat_double.h 2020-04-20 10:30:38 -04:00
181709bba4 Merge branch 'develop' into feature/zmobius_paramcompute 2020-04-20 09:12:34 -04:00
90229cfb0f Merge pull request #270 from milc-qcd/feature/CGinfo
feature/CGinfo
2020-04-16 11:46:08 -04:00
0475c46ecb Merge pull request #256 from djm2131/feature/BiCGSTAB
Import BiCGSTAB solvers and tests
2020-04-16 11:45:15 -04:00
11dec4883c Don't throw assert 2020-04-10 11:09:11 -04:00
afa458c812 Extra solvers 2020-04-10 11:08:19 -04:00
dc50190b8f Faster GPU basis rotation
May need to later include Regensburg optimised CPU variant
2020-04-10 11:06:04 -04:00
165c68e28e Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift 2020-02-29 17:51:51 -06:00
9479bc8486 Make IterationsToComplete and TrueResidual externally accessible 2020-02-19 17:43:57 -06:00
8a5c13d5fb Still fast moving in changes 2020-02-06 17:57:26 -05:00
bdccb0c91f Working 2 types of decomposition 2020-02-06 17:26:55 -05:00
b9ca40cc44 More precise power method at start 2020-02-06 10:09:14 -05:00
2f421a5db1 Comment fix 2020-02-06 10:08:27 -05:00
2b5de5bba5 MdagM operator without norm option 2020-01-27 13:44:30 -05:00
2e85cae74e Add Jacobi polynomials 2020-01-27 13:43:49 -05:00
76c823781e Much faster coarsening 2020-01-27 13:43:19 -05:00
114db3b99d Optional MdagM without norms 2020-01-27 13:42:51 -05:00
49e123dbda Use explicit linalg calls to get coalesce optimisations on GPU 2020-01-27 12:44:51 -05:00
8cec294ec9 Make CG a bit less verbose as it is getting annoying in nested algorithms.
Can use Iterative logging if you want to see more
2020-01-27 12:44:04 -05:00
eb5b720e94 Normal Equations can be used in HDCR now 2020-01-27 12:43:29 -05:00
b2736ec80b Make PrecGCR recursive - it can precondition itself 2020-01-27 12:42:48 -05:00
086256a032 Less sloppy convergence test on PowerMethod 2020-01-27 12:41:59 -05:00
96671bbb24 Added ability to pass callback to MADWF that is called every inner iteration and allows user to, for example, adjust the inner solver tolerance depending on residual
Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximation with optional restriction to even/odd polynomials
Added implementation of computation of ZMobius parameters
Added Test_zMADWF_prec to test ZMobius in MADWF
2020-01-17 12:45:30 -08:00
e583035614 Change to interface to minimise comms in evaluating coarse space operator 2020-01-06 11:43:59 -05:00
205ea4bbb2 More verbose Lanczos 2020-01-04 03:13:40 -05:00
f7e4bd1f6d Getting more optimised 2020-01-04 03:11:53 -05:00
ba40a3f763 Alternate low pass filter option 2020-01-03 05:29:09 -05:00
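
Note on commit b2087f14c4: its message says the lambda copies the this pointer, which points to host memory, so a reference to geom is needed. The sketch below is a host-only C++ illustration of that capture behaviour and is not Grid's actual code. The names Geometry, CoarseOperator, kernelViaThis and kernelViaLocal are hypothetical, and std::function stands in for whatever kernel-launch mechanism is used. It relies only on a standard C++ rule: a lambda that uses a member captures the object's address, whereas a local reference captured by copy stores a copy of the object it refers to.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a geometry member; not Grid's actual type.
struct Geometry {
  int npoint;
};

// Hypothetical class owning a geometry, loosely modelled on the commit message.
struct CoarseOperator {
  Geometry geom{9};

  // Captures `this`: the closure stores only the object's address, so
  // geom.npoint is read through that pointer each time the lambda runs.
  // On a GPU that pointer would still refer to host memory.
  std::function<int()> kernelViaThis() {
    return [this]() { return geom.npoint; };
  }

  // Binds a local reference first, then captures it by copy. Capturing a
  // reference variable by copy copies the referred-to object, so the
  // closure carries its own Geometry and does not need `this`.
  std::function<int()> kernelViaLocal() {
    const Geometry &g = geom;
    return [g]() { return g.npoint; };
  }
};
```

Changing op.geom.npoint after building both closures is a quick way to tell them apart: the closure from kernelViaThis sees the new value because it reads through the pointer, while the one from kernelViaLocal keeps the value it copied at creation.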