Peter Boyle
49e123dbda
Use explicit linalg calls to get coalesce optimisations on GPU
2020-01-27 12:44:51 -05:00
Peter Boyle
8cec294ec9
Make CG a bit less verbose, as it was getting annoying in nested algorithms.
...
Can use Iterative logging if you want to see more
2020-01-27 12:44:04 -05:00
Peter Boyle
eb5b720e94
Normal Equations can be used in HDCR now
2020-01-27 12:43:29 -05:00
Peter Boyle
b2736ec80b
Make PrecGCR recursive - it can precondition itself
2020-01-27 12:42:48 -05:00
Peter Boyle
086256a032
Less sloppy convergence test on PowerMethod
2020-01-27 12:41:59 -05:00
Peter Boyle
afc7426f39
Much bigger pointer cache in case of Nvidia due to cost of setting up UVM allocations
2020-01-27 12:41:16 -05:00
Peter Boyle
7c061e20c9
All directions of Dirac operator for fast coarsening
2020-01-27 12:40:13 -05:00
Peter Boyle
e5d1c09665
Faster DhopDirAll for little Dirac operator coarsening
2020-01-27 12:38:54 -05:00
Peter Boyle
8016a465ae
Remove extraneous variable
2020-01-27 12:35:37 -05:00
Peter Boyle
d8b9742092
DhopDirAll for faster matrix elements of little Dirac operator
2020-01-27 12:34:54 -05:00
Peter Boyle
1bd87c35d7
Read coalescing on Nvidia
2020-01-27 12:29:56 -05:00
Peter Boyle
fa856c9669
Disable information message
2020-01-27 12:28:46 -05:00
Peter Boyle
48008e4d8b
Thread coordinate creation loop
2020-01-27 12:28:16 -05:00
Peter Boyle
55cdb17691
Integer divide for blocking
2020-01-27 12:27:45 -05:00
Michael Marshall
2ed39ebb7a
Perambulator won't even allocate memory for unsmeared sinks unless the filename is specified.
...
Prior to this update, memory is allocated regardless of whether these are requested.
2020-01-24 13:01:06 +00:00
Christopher Kelly
96671bbb24
Added ability to pass callback to MADWF that is called every inner iteration and allows user to, for example, adjust the inner solver tolerance depending on residual
...
Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximation with optional restriction to even/odd polynomials
Added implementation of computation of ZMobius parameters
Added Test_zMADWF_prec to test ZMobius in MADWF
2020-01-17 12:45:30 -08:00
Peter Boyle
554542b773
Merge branch 'feature/hdcr' of https://github.com/paboyle/Grid into feature/hdcr
2020-01-06 11:47:56 -05:00
Peter Boyle
03da4040e2
Make Summit happy
2020-01-06 11:47:48 -05:00
Peter Boyle
e583035614
Change to interface to minimise comms in evaluating coarse space operator
2020-01-06 11:43:59 -05:00
Peter Boyle
3c3d6a94f3
Optimising the force term a bit
2020-01-04 03:16:23 -05:00
Peter Boyle
205ea4bbb2
More verbose Lanczos
2020-01-04 03:13:40 -05:00
Peter Boyle
039eb7b2eb
Make the force term and coarsening multigrid more optimised
2020-01-04 03:12:17 -05:00
Peter Boyle
f7e4bd1f6d
Getting more optimised
2020-01-04 03:11:53 -05:00
Peter Boyle
0afecfcae7
Nearing well optimised state
2020-01-04 03:11:19 -05:00
Peter Boyle
ba40a3f763
Alternate low pass filter option
2020-01-03 05:29:09 -05:00
Peter Boyle
aa920aa532
Improved DWF multigrid
2019-12-28 10:32:35 -05:00
Peter Boyle
c0d8e4dce5
Improved Multigrid for DWF
2019-12-28 10:32:15 -05:00
Michael Marshall
0ca1992151
Remove warning in tensor layout comparison. Make default names and index names visible for PerambTensor and NoiseTensor
2019-12-20 13:53:27 +00:00
Michael Marshall
df2b0c4e79
Merge branch 'develop' into feature/distil
...
* develop:
Missing conjugate in MooeeInvDag
Allow subspace setup to not converge
fp16 mandatory. Use software if not available in hardware
2019-12-20 13:24:59 +00:00
Peter Boyle
9cfd64c604
Coarse grid on GPU, not fast enough yet. Need a 10x
2019-12-17 05:24:45 -05:00
Peter Boyle
e478404291
Tuned up significantly on GPU, but another 10x in coarse space required
2019-12-17 05:03:25 -05:00
Peter Boyle
9aafd20468
Simple block project promote runs faster on GPU
2019-12-17 05:01:39 -05:00
Peter Boyle
5d834486c9
Merge pull request #259 from grid-test-organisation/feature/5d-improvement-fix
...
Missing conjugate in MooeeInvDag
2019-12-16 04:20:37 -05:00
gfilaci
f7373e97a4
Missing conjugate in MooeeInvDag
2019-12-16 10:05:50 +01:00
Peter Boyle
9e15474999
Accelerator loop attempt at speed up
2019-12-14 05:28:16 -05:00
Peter Boyle
152b525a4d
Typo fix
2019-12-13 22:44:42 -05:00
Peter Boyle
d18994eddc
Offload more of mgrid to GPU
2019-12-13 22:08:11 -05:00
Peter Boyle
b8bd8cd2ae
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2019-12-13 21:32:10 -05:00
Peter Boyle
736b19485e
Faster set up and some dead code ifdef'ed out
2019-12-13 21:30:48 -05:00
Michael Marshall
c7637a84ad
Documentation tweak for peculiarities of OpenMPI --prefix
2019-12-12 17:00:03 +00:00
Michael Marshall
a7772c827b
Documentation tweak
2019-12-12 16:05:22 +00:00
8e83398861
Merge pull request #257 from AndrewYongZhenNing/develop
...
Added NamedTensor.hpp
2019-12-11 21:36:59 +00:00
David Murphy
843ca9350a
Fix naming conventions to be consistent with Peter
2019-12-11 11:46:18 -05:00
f47b2b6e13
Added NamedTensor.hpp
2019-12-11 15:56:46 +00:00
Peter Boyle
5bfd1470ad
Merge branch 'develop' into feature/hdcr
2019-12-10 21:51:06 -05:00
Peter Boyle
6957b0b58a
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2019-12-10 21:50:42 -05:00
Peter Boyle
d73f0b8618
Verbose for temporary debug
2019-12-10 21:50:06 -05:00
Peter Boyle
0b3a3562c3
Some MPI implementations (Summit) raise SIGUSR2, so trap that
2019-12-10 21:49:12 -05:00
Peter Boyle
710fee5d26
Subspace setup testing code
...
and timing verbose
2019-12-10 21:48:42 -05:00
Peter Boyle
bab0bf2e93
Merge branch 'develop' into feature/hdcr
2019-12-10 21:47:41 -05:00