Peter Boyle
bb0a0da47a
non blocking caution due to SYCL
2022-08-02 08:09:43 -07:00
79e34b3eb4
Local Coherence batch deflation
2022-05-19 14:53:17 +01:00
b051e00de0
Additional Local Coherence Deflation operator()
2022-05-16 00:25:13 +01:00
Peter Boyle
a4ce6e42c7
Warning free compile on make all and make tests under nvcc
2021-10-27 00:27:03 +01:00
Peter Boyle
ba7e371b90
Warning free compile on Tursa.
...
Hopefully got all reqd virtual dtors
2021-10-21 19:56:52 +01:00
Peter Boyle
749b8022a4
Linear operator and SparseMatrix virtual destructors
2021-10-15 20:47:18 +01:00
6c66b8d997
deflated guesser can optionally be used with fewer vectors than provided
2021-09-30 19:25:12 +01:00
9523ad3d73
vector version of Schur solver use vector guesser
2021-09-28 12:45:47 +01:00
73a95fa96f
LinearFunction loops over vectors by default, can be overloaded
2021-09-28 12:44:26 +01:00
Peter Boyle
9d2238148c
Merge branch 'develop' of https://www.github.com/paboyle/Grid into develop
2021-09-15 19:25:57 +01:00
Peter Boyle
c15493218d
Two extra routines to break out SchurRedBlack on many RHS into stages to allow efficient deflation & split grid
...
Split grid solver still to do.
2021-09-15 19:24:39 +01:00
Christoph Lehner
c50f27e68b
Make FFT play nice with split grid
2021-06-20 11:34:38 +02:00
Christoph Lehner
2bb374daea
hip-friendly
2021-03-19 11:33:23 +01:00
Peter Boyle
281ac5fc12
Red black support on coars
2021-01-14 20:48:08 -05:00
Daniel Richtmann
4d2dc7ba03
Enable even-odd for CoarsenedMatrix
2020-09-11 20:32:02 +02:00
Daniel Richtmann
cf3535d16e
Expose more functions in CMat
2020-08-27 14:06:48 +02:00
Daniel Richtmann
b2087f14c4
Fix CoarsenedMatrix regarding illegal memory accesses
...
Need a reference to geom since the lambda copies the this pointer which points to host memory, see
- https://docs.nvidia.com/cuda/cuda-c-programming-guide/#star-this-capture
- https://devblogs.nvidia.com/new-compiler-features-cuda-8/
2020-08-24 17:46:47 +02:00
Daniel Richtmann
dd1ba266b2
Fix mapping between dir + disp and point in CMat
2020-08-24 17:46:46 +02:00
Daniel Richtmann
1292d59563
Add a typedef + broaden interface of CMat
2020-08-24 17:46:45 +02:00
Peter Boyle
c48da35921
Memory Vector UVM and Lattice alignedAllocator separate
2020-06-22 20:21:53 -04:00
Peter Boyle
1a74816c25
Hopefully fixed
2020-06-19 17:50:52 -04:00
Peter Boyle
228fd450ce
Typo fix (excuse - my keyboard is starting to break)
2020-06-19 17:36:05 -04:00
Peter Boyle
b949cf6b12
PeekLocal needs a view to keep thread safe.
...
ALLOCATION_CACHE reenable
2020-06-19 17:13:27 -04:00
Christoph Lehner
b5e87e8d97
summit compile fixes
2020-06-12 18:16:12 -04:00
Peter Boyle
cdf0a04fc5
Merge branch 'develop' into sycl
2020-06-09 04:00:12 -04:00
Peter Boyle
1a4c8c3387
Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.
2020-06-05 18:52:35 -04:00
Peter Boyle
1c9f20b15e
Views must be closed
2020-06-03 09:10:29 -04:00
Peter Boyle
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Peter Boyle
82f71643a4
Remove the norm in MdagM
2020-05-12 17:55:53 -04:00
Peter Boyle
ea08f193e7
Allocator cache split into large/small pools
2020-05-10 05:24:26 -04:00
Daniel Richtmann
ab0c5d77fb
Correct NonHermitianSchurOperatorBase
2020-05-08 16:44:02 +02:00
Peter Boyle
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
Peter Boyle
1d65e2f62c
Slightly faster Chebyshev; ifdef'ed out the fastest until tested numerics
...
Lifted from HDCR setup
2020-05-08 09:20:54 -04:00
Peter Boyle
21ca182c36
Comments remove
2020-05-08 09:18:24 -04:00
Christoph Lehner
3c6ffcb48c
Merge branch 'develop' into feature/gpt
2020-05-06 15:03:35 +02:00
Christoph Lehner
e9b295f967
Synchronize blocking infrastructure with GPT
2020-05-06 08:42:28 -04:00
Peter Boyle
9b2d2d0fc3
Basis rotate stack passing to GPU reduction
2020-04-30 12:31:07 -04:00
Christopher Kelly
0896f2cead
Added missing include guards in bigfloat_double.h
2020-04-20 10:30:38 -04:00
Christopher Kelly
181709bba4
Merge branch 'develop' into feature/zmobius_paramcompute
2020-04-20 09:12:34 -04:00
Peter Boyle
90229cfb0f
Merge pull request #270 from milc-qcd/feature/CGinfo
...
feature/CGinfo
2020-04-16 11:46:08 -04:00
Peter Boyle
0475c46ecb
Merge pull request #256 from djm2131/feature/BiCGSTAB
...
Import BiCGSTAB solvers and tests
2020-04-16 11:45:15 -04:00
Peter Boyle
11dec4883c
Don't throw assert
2020-04-10 11:09:11 -04:00
Peter Boyle
afa458c812
Extra solvers
2020-04-10 11:08:19 -04:00
Peter Boyle
dc50190b8f
Faster GPU basis rotation
...
May need to later include Regensburg optimised CPU variant
2020-04-10 11:06:04 -04:00
Carleton DeTar
165c68e28e
Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift
2020-02-29 17:51:51 -06:00
Carleton DeTar
9479bc8486
Make IterationsToComplete and TrueResidual externally accessible
2020-02-19 17:43:57 -06:00
Peter Boyle
8a5c13d5fb
Still fast moving in changes
2020-02-06 17:57:26 -05:00
Peter Boyle
bdccb0c91f
Working 2 types of decomposition
2020-02-06 17:26:55 -05:00
Peter Boyle
b9ca40cc44
More precise power method at start
2020-02-06 10:09:14 -05:00
Peter Boyle
2f421a5db1
Comment fix
2020-02-06 10:08:27 -05:00