4d2dc7ba03
Enable even-odd for CoarsenedMatrix
2020-09-11 20:32:02 +02:00
cf3535d16e
Expose more functions in CMat
2020-08-27 14:06:48 +02:00
b2087f14c4
Fix CoarsenedMatrix regarding illegal memory accesses
...
Need a reference to geom since the lambda copies the this pointer which points to host memory, see
- https://docs.nvidia.com/cuda/cuda-c-programming-guide/#star-this-capture
- https://devblogs.nvidia.com/new-compiler-features-cuda-8/
2020-08-24 17:46:47 +02:00
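The capture issue described in this commit can be illustrated on the host alone. The following is a minimal sketch, not the actual CoarsenedMatrix code: `Geometry`, `CoarseOp`, and both member functions are hypothetical stand-ins. It shows that a lambda capturing `this` reads `geom` through the pointer, whereas a local reference captured by value places a copy of the object inside the closure, which is what makes the lambda safe to ship to a device that cannot dereference host memory.

```cpp
#include <functional>

// Hypothetical stand-in for the geometry helper held by the operator.
struct Geometry {
  int npoint;
};

// Hypothetical operator class; not the real CoarsenedMatrix.
struct CoarseOp {
  Geometry geom{9};

  // Captures `this`: the closure stores only a pointer, and every use of
  // geom inside the lambda is really this->geom.
  std::function<int(int)> kernelViaThis() {
    return [this](int i) { return i * geom.npoint; };
  }

  // Takes a local reference first. Capturing a reference by value ([=])
  // copies the referred-to object into the closure, so `this` is never used.
  std::function<int(int)> kernelViaLocalRef() {
    const Geometry& g = geom;
    return [=](int i) { return i * g.npoint; };
  }
};
```

If `geom.npoint` is modified after both lambdas are created, the first lambda observes the new value (it still goes through `this`), while the second keeps the value it copied at creation time.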
dd1ba266b2
Fix mapping between dir + disp and point in CMat
2020-08-24 17:46:46 +02:00
1292d59563
Add a typedef + broaden interface of CMat
2020-08-24 17:46:45 +02:00
c48da35921
Memory Vector UVM and Lattice alignedAllocator separate
2020-06-22 20:21:53 -04:00
1a74816c25
Hopefully fixed
2020-06-19 17:50:52 -04:00
228fd450ce
Typo fix (excuse - my keyboard is starting to break)
2020-06-19 17:36:05 -04:00
b949cf6b12
PeekLocal needs a view to keep thread safe.
...
ALLOCATION_CACHE reenable
2020-06-19 17:13:27 -04:00
b5e87e8d97
summit compile fixes
2020-06-12 18:16:12 -04:00
cdf0a04fc5
Merge branch 'develop' into sycl
2020-06-09 04:00:12 -04:00
1a4c8c3387
Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.
2020-06-05 18:52:35 -04:00
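The scope-bound closing described in this commit is the usual RAII pattern. The sketch below assumes nothing about the real `autoView()` implementation: `Field`, `ScopedView`, and `scale` are hypothetical names chosen only to show a wrapper whose destructor closes the view when the enclosing scope ends.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical lattice-like container that tracks how many views are open.
struct Field {
  std::vector<double> data;
  int openViews = 0;
};

// Hypothetical RAII wrapper: opens a view on construction and closes it
// on destruction, so a view cannot be left open by accident.
class ScopedView {
 public:
  explicit ScopedView(Field& f) : field_(f) { ++field_.openViews; }
  ~ScopedView() { --field_.openViews; }

  // Non-copyable, so each open is matched by exactly one close.
  ScopedView(const ScopedView&) = delete;
  ScopedView& operator=(const ScopedView&) = delete;

  double& operator[](std::size_t i) { return field_.data[i]; }

 private:
  Field& field_;
};

// Example use: the view is closed automatically when `v` leaves scope.
inline void scale(Field& f, double a) {
  ScopedView v(f);
  for (std::size_t i = 0; i < f.data.size(); ++i) v[i] *= a;
}
```

With this shape, early returns and exceptions inside the scope still close the view, which is the property the manual open/close calls could not guarantee.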
1c9f20b15e
Views must be closed
2020-06-03 09:10:29 -04:00
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
82f71643a4
Remove the norm in MdagM
2020-05-12 17:55:53 -04:00
ea08f193e7
Allocator cache split into large/small pools
2020-05-10 05:24:26 -04:00
ab0c5d77fb
Correct NonHermitianSchurOperatorBase
2020-05-08 16:44:02 +02:00
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
1d65e2f62c
Slightly faster Chebyshev; ifdef'ed out the fastest until tested numerics
...
Lifted from HDCR setup
2020-05-08 09:20:54 -04:00
21ca182c36
Comments remove
2020-05-08 09:18:24 -04:00
3c6ffcb48c
Merge branch 'develop' into feature/gpt
2020-05-06 15:03:35 +02:00
e9b295f967
Synchronize blocking infrastructure with GPT
2020-05-06 08:42:28 -04:00
9b2d2d0fc3
Basis rotate stack passing to GPU reduction
2020-04-30 12:31:07 -04:00
0896f2cead
Added missing include guards in bigfloat_double.h
2020-04-20 10:30:38 -04:00
181709bba4
Merge branch 'develop' into feature/zmobius_paramcompute
2020-04-20 09:12:34 -04:00
90229cfb0f
Merge pull request #270 from milc-qcd/feature/CGinfo
...
feature/CGinfo
2020-04-16 11:46:08 -04:00
0475c46ecb
Merge pull request #256 from djm2131/feature/BiCGSTAB
...
Import BiCGSTAB solvers and tests
2020-04-16 11:45:15 -04:00
11dec4883c
Don't throw assert
2020-04-10 11:09:11 -04:00
afa458c812
Extra solvers
2020-04-10 11:08:19 -04:00
dc50190b8f
Faster GPU basis rotation
...
May need to later include Regensburg optimised CPU variant
2020-04-10 11:06:04 -04:00
165c68e28e
Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift
2020-02-29 17:51:51 -06:00
9479bc8486
Make IterationsToComplete and TrueResidual externally accessible
2020-02-19 17:43:57 -06:00
8a5c13d5fb
Still fast moving in changes
2020-02-06 17:57:26 -05:00
bdccb0c91f
Working 2 types of decomposition
2020-02-06 17:26:55 -05:00
b9ca40cc44
More precise power method at start
2020-02-06 10:09:14 -05:00
2f421a5db1
Comment fix
2020-02-06 10:08:27 -05:00
2b5de5bba5
MdagM operator without norm option
2020-01-27 13:44:30 -05:00
2e85cae74e
Add Jacobi polynomials
2020-01-27 13:43:49 -05:00
76c823781e
Much faster coarsening
2020-01-27 13:43:19 -05:00
114db3b99d
Optional MdagM without norms
2020-01-27 13:42:51 -05:00
49e123dbda
Use explicit linalg calls to get coalesce optimisations on GPU
2020-01-27 12:44:51 -05:00
8cec294ec9
Make CG a bit less verbose as getting annoying in nested algorithms.
...
Can use Iterative logging if you want to see more
2020-01-27 12:44:04 -05:00
eb5b720e94
Normal Equations can be used in HDCR now
2020-01-27 12:43:29 -05:00
b2736ec80b
Make PrecGCR recursive - it can precondition itself
2020-01-27 12:42:48 -05:00
086256a032
Less sloppy convergence test on PowerMethod
2020-01-27 12:41:59 -05:00
96671bbb24
Added ability to pass callback to MADWF that is called every inner iteration and allows user to, for example, adjust the inner solver tolerance depending on residual
...
Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximation with optional restriction to even/odd polynomials
Added implementation of computation of ZMobius parameters
Added Test_zMADWF_prec to test ZMobius in MADWF
2020-01-17 12:45:30 -08:00
e583035614
Change to interface to minimise comms in evaluating coarse space operator
2020-01-06 11:43:59 -05:00
205ea4bbb2
More verbose Lanczos
2020-01-04 03:13:40 -05:00
f7e4bd1f6d
Getting more optimised
2020-01-04 03:11:53 -05:00
ba40a3f763
Alternate low pass filter option
2020-01-03 05:29:09 -05:00