1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-06-18 07:47:06 +01:00
Commit Graph

113 Commits

Author SHA1 Message Date
bb0a0da47a inon blocking caution due to SYCL 2022-08-02 08:09:43 -07:00
79e34b3eb4 Local Coherence batch deflation 2022-05-19 14:53:17 +01:00
b051e00de0 Additional Local Coherance Deflation operator() 2022-05-16 00:25:13 +01:00
a4ce6e42c7 Warning free compile on make all and make tests under nvcc 2021-10-27 00:27:03 +01:00
ba7e371b90 Warning free compile on Tursa.
Hopefully got all reqd virtual dtors
2021-10-21 19:56:52 +01:00
749b8022a4 Linear operator and SparseMatrix virtual destructors 2021-10-15 20:47:18 +01:00
6c66b8d997 deflated guesser can optionally be used with less vectors than provided 2021-09-30 19:25:12 +01:00
9523ad3d73 vector version of Schur solver use vector guesser 2021-09-28 12:45:47 +01:00
73a95fa96f LinearFunction loops over vectors by default, can be overloaded 2021-09-28 12:44:26 +01:00
9d2238148c Merge branch 'develop' of https://www.github.com/paboyle/Grid into develop 2021-09-15 19:25:57 +01:00
c15493218d Two extra routines to break out SchurRedBlack on many RHS into stages to allow efficient deflation & split grid
Split grid solver still to do.
2021-09-15 19:24:39 +01:00
c50f27e68b Make FFT play nice with split grid 2021-06-20 11:34:38 +02:00
2bb374daea hip-friendly 2021-03-19 11:33:23 +01:00
281ac5fc12 Red black support on coars 2021-01-14 20:48:08 -05:00
4d2dc7ba03 Enable even-odd for CoarsenedMatrix 2020-09-11 20:32:02 +02:00
cf3535d16e Expose more functions in CMat 2020-08-27 14:06:48 +02:00
b2087f14c4 Fix CoarsenedMatrix regarding illegal memory accesses
Need a reference to geom since the lambda copies the this pointer which points to host memory, see
- https://docs.nvidia.com/cuda/cuda-c-programming-guide/#star-this-capture
- https://devblogs.nvidia.com/new-compiler-features-cuda-8/
2020-08-24 17:46:47 +02:00
dd1ba266b2 Fix mapping between dir + disp and point in CMat 2020-08-24 17:46:46 +02:00
1292d59563 Add a typedef + broaden interface of CMat 2020-08-24 17:46:45 +02:00
c48da35921 Memory Vector UVM and Lattice alignedAllocator separate 2020-06-22 20:21:53 -04:00
1a74816c25 Hopeefully fixed 2020-06-19 17:50:52 -04:00
228fd450ce Typo fix (excusee - my keyboard is starting to break) 2020-06-19 17:36:05 -04:00
b949cf6b12 PeekLocal needs a view to keep thread safe.
ALLOCATION_CACHEE reenable
2020-06-19 17:13:27 -04:00
b5e87e8d97 summit compile fixes 2020-06-12 18:16:12 -04:00
cdf0a04fc5 Merge branch 'develop' into sycl 2020-06-09 04:00:12 -04:00
1a4c8c3387 Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. 2020-06-05 18:52:35 -04:00
1c9f20b15e Views must be closed 2020-06-03 09:10:29 -04:00
7860a50f70 Make view specify where and drive data motion - first cut.
This is a compile tiime option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
82f71643a4 Remove the norm in MdagM 2020-05-12 17:55:53 -04:00
ea08f193e7 Allocator cache spliit into large/small pools 2020-05-10 05:24:26 -04:00
ab0c5d77fb Correct NonHermitianSchurOperatorBase 2020-05-08 16:44:02 +02:00
f8b8e00090 Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
Aim to reduce the amount of cuda and other code variations floating around all over the place.

Will move GpuInit iinto Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
1d65e2f62c Slightly faster Chebyshev; ifdef'ed out the fastest until tested numerics
Lifteed from HDCR setup
2020-05-08 09:20:54 -04:00
21ca182c36 Comments remove 2020-05-08 09:18:24 -04:00
3c6ffcb48c Merge branch 'develop' into feature/gpt 2020-05-06 15:03:35 +02:00
e9b295f967 Synchronize blocking infrastructure with GPT 2020-05-06 08:42:28 -04:00
9b2d2d0fc3 Basis rotate stack passig to GPU reduction 2020-04-30 12:31:07 -04:00
0896f2cead Added missing include guards in bigfloat_double.h 2020-04-20 10:30:38 -04:00
181709bba4 Merge branch 'develop' into feature/zmobius_paramcompute 2020-04-20 09:12:34 -04:00
90229cfb0f Merge pull request #270 from milc-qcd/feature/CGinfo
feature/CGinfo
2020-04-16 11:46:08 -04:00
0475c46ecb Merge pull request #256 from djm2131/feature/BiCGSTAB
Import BiCGSTAB solvers and tests
2020-04-16 11:45:15 -04:00
11dec4883c Don't throw assert 2020-04-10 11:09:11 -04:00
afa458c812 Extra solvers 2020-04-10 11:08:19 -04:00
dc50190b8f Faster GPU basis rotation
May need to later include Regensburg optimised CPU variant
2020-04-10 11:06:04 -04:00
165c68e28e Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift 2020-02-29 17:51:51 -06:00
9479bc8486 Make IterationsToComplete and TrueResidual externally accessible 2020-02-19 17:43:57 -06:00
8a5c13d5fb Still fast moving in changes 2020-02-06 17:57:26 -05:00
bdccb0c91f Working 2 types of decomposition 2020-02-06 17:26:55 -05:00
b9ca40cc44 More precise power method at start 2020-02-06 10:09:14 -05:00
2f421a5db1 Commeent fix 2020-02-06 10:08:27 -05:00