bb0a0da47a
Non-blocking caution due to SYCL
2022-08-02 08:09:43 -07:00
79e34b3eb4
Local Coherence batch deflation
2022-05-19 14:53:17 +01:00
b051e00de0
Additional Local Coherence Deflation operator()
2022-05-16 00:25:13 +01:00
a4ce6e42c7
Warning free compile on make all and make tests under nvcc
2021-10-27 00:27:03 +01:00
ba7e371b90
Warning free compile on Tursa.
...
Hopefully got all reqd virtual dtors
2021-10-21 19:56:52 +01:00
749b8022a4
Linear operator and SparseMatrix virtual destructors
2021-10-15 20:47:18 +01:00
6c66b8d997
deflated guesser can optionally be used with fewer vectors than provided
2021-09-30 19:25:12 +01:00
9523ad3d73
vector version of Schur solver uses vector guesser
2021-09-28 12:45:47 +01:00
73a95fa96f
LinearFunction loops over vectors by default, can be overloaded
2021-09-28 12:44:26 +01:00
9d2238148c
Merge branch 'develop' of https://www.github.com/paboyle/Grid into develop
2021-09-15 19:25:57 +01:00
c15493218d
Two extra routines to break out SchurRedBlack on many RHS into stages to allow efficient deflation & split grid
...
Split grid solver still to do.
2021-09-15 19:24:39 +01:00
c50f27e68b
Make FFT play nice with split grid
2021-06-20 11:34:38 +02:00
2bb374daea
hip-friendly
2021-03-19 11:33:23 +01:00
281ac5fc12
Red black support on coars
2021-01-14 20:48:08 -05:00
4d2dc7ba03
Enable even-odd for CoarsenedMatrix
2020-09-11 20:32:02 +02:00
cf3535d16e
Expose more functions in CMat
2020-08-27 14:06:48 +02:00
b2087f14c4
Fix CoarsenedMatrix regarding illegal memory accesses
...
Need a reference to geom since the lambda copies the this pointer which points to host memory, see
- https://docs.nvidia.com/cuda/cuda-c-programming-guide/#star-this-capture
- https://devblogs.nvidia.com/new-compiler-features-cuda-8/
2020-08-24 17:46:47 +02:00
dd1ba266b2
Fix mapping between dir + disp and point in CMat
2020-08-24 17:46:46 +02:00
1292d59563
Add a typedef + broaden interface of CMat
2020-08-24 17:46:45 +02:00
c48da35921
Memory Vector UVM and Lattice alignedAllocator separate
2020-06-22 20:21:53 -04:00
1a74816c25
Hopefully fixed
2020-06-19 17:50:52 -04:00
228fd450ce
Typo fix (excuse - my keyboard is starting to break)
2020-06-19 17:36:05 -04:00
b949cf6b12
PeekLocal needs a view to keep thread safe.
...
ALLOCATION_CACHE reenable
2020-06-19 17:13:27 -04:00
b5e87e8d97
summit compile fixes
2020-06-12 18:16:12 -04:00
cdf0a04fc5
Merge branch 'develop' into sycl
2020-06-09 04:00:12 -04:00
1a4c8c3387
Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.
2020-06-05 18:52:35 -04:00
1c9f20b15e
Views must be closed
2020-06-03 09:10:29 -04:00
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile-time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
82f71643a4
Remove the norm in MdagM
2020-05-12 17:55:53 -04:00
ea08f193e7
Allocator cache split into large/small pools
2020-05-10 05:24:26 -04:00
ab0c5d77fb
Correct NonHermitianSchurOperatorBase
2020-05-08 16:44:02 +02:00
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
1d65e2f62c
Slightly faster Chebyshev; ifdef'ed out the fastest until tested numerics
...
Lifted from HDCR setup
2020-05-08 09:20:54 -04:00
21ca182c36
Comments removed
2020-05-08 09:18:24 -04:00
3c6ffcb48c
Merge branch 'develop' into feature/gpt
2020-05-06 15:03:35 +02:00
e9b295f967
Synchronize blocking infrastructure with GPT
2020-05-06 08:42:28 -04:00
9b2d2d0fc3
Basis rotate stack passing to GPU reduction
2020-04-30 12:31:07 -04:00
0896f2cead
Added missing include guards in bigfloat_double.h
2020-04-20 10:30:38 -04:00
181709bba4
Merge branch 'develop' into feature/zmobius_paramcompute
2020-04-20 09:12:34 -04:00
90229cfb0f
Merge pull request #270 from milc-qcd/feature/CGinfo
...
feature/CGinfo
2020-04-16 11:46:08 -04:00
0475c46ecb
Merge pull request #256 from djm2131/feature/BiCGSTAB
...
Import BiCGSTAB solvers and tests
2020-04-16 11:45:15 -04:00
11dec4883c
Don't throw assert
2020-04-10 11:09:11 -04:00
afa458c812
Extra solvers
2020-04-10 11:08:19 -04:00
dc50190b8f
Faster GPU basis rotation
...
May need to later include Regensburg optimised CPU variant
2020-04-10 11:06:04 -04:00
165c68e28e
Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift
2020-02-29 17:51:51 -06:00
9479bc8486
Make IterationsToComplete and TrueResidual externally accessible
2020-02-19 17:43:57 -06:00
8a5c13d5fb
Still fast moving in changes
2020-02-06 17:57:26 -05:00
bdccb0c91f
Working 2 types of decomposition
2020-02-06 17:26:55 -05:00
b9ca40cc44
More precise power method at start
2020-02-06 10:09:14 -05:00
2f421a5db1
Comment fix
2020-02-06 10:08:27 -05:00