2c22db841a
Added momentum scaling to scalar HMC theories in order to follow UKQCD/CPS conventions
2020-04-02 17:38:47 +01:00
856d168e41
global sum over vectors of uint64_t
2020-03-29 07:56:05 -04:00
b6cbdd2aa3
Merge pull request #1 from DanielRichtmann/feature/read-openqcd
Feature/read openqcd
2020-03-26 17:39:04 +01:00
a2188ea875
remove debugging printf from WilsonKernelsImplementation
2020-03-26 09:12:36 -04:00
989af65807
Check in parallel reader for openqcd configs
2020-03-24 11:20:54 +01:00
c9b737a4e7
make trace,adj,transpose unary operators
2020-03-16 17:58:30 -04:00
037bb6ea73
Check in reader for openqcd configs
This reader is suboptimal in the sense that it opens the entire config on every MPI rank.
2020-03-16 14:28:02 +01:00
165c68e28e
Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift
2020-02-29 17:51:51 -06:00
9479bc8486
Make IterationsToComplete and TrueResidual externally accessible
2020-02-19 17:43:57 -06:00
8a5c13d5fb
Still fast moving in changes
2020-02-06 17:57:26 -05:00
bdccb0c91f
Working 2 types of decomposition
2020-02-06 17:26:55 -05:00
68b45f6444
Lower left/upper right region cut paste
2020-02-06 15:50:26 -05:00
ef9b3e658a
extra typedef
2020-02-06 15:47:14 -05:00
b9ca40cc44
More precise power method at start
2020-02-06 10:09:14 -05:00
2f421a5db1
Comment fix
2020-02-06 10:08:27 -05:00
c69a3b6ef6
When saving eigenvectors, LapEvec now saves eigenvalues for every timeslice as well.
I.e. nT x nVec eigenvalues are saved in FileName.evals.conf.h5.
A new named tensor, "TimesliceEvals" can be used to simplify restoring these from disk.
NB: The changes in BaseIO add support so that Eigen tensors can be easily used in MPI operations, e.g. GlobalSum.
See LapEvec.hpp for an example of how this is done.
2020-01-29 21:20:20 +00:00
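A minimal sketch of the idea behind the entry above, assuming each rank owns a subset of timeslices and fills only its own rows of an nT x nVec buffer before a global sum. This is not the LapEvec code: `global_sum` and `gather_timeslice_evals` are hypothetical names, and the "ranks" are simulated in a single process instead of using MPI.

```python
def global_sum(rank_buffers):
    """Element-wise sum over all ranks (stand-in for an MPI all-reduce)."""
    n_t = len(rank_buffers[0])
    n_vec = len(rank_buffers[0][0])
    return [
        [sum(buf[t][v] for buf in rank_buffers) for v in range(n_vec)]
        for t in range(n_t)
    ]


def gather_timeslice_evals(local_evals, n_t, n_vec):
    """Assemble the full nT x nVec eigenvalue table.

    local_evals holds one dict per simulated rank, mapping a timeslice
    index to the list of n_vec eigenvalues computed on that rank.
    """
    buffers = []
    for owned in local_evals:
        # Rows for timeslices this rank does not own stay zero, so the
        # global sum reproduces each row exactly once.
        buf = [[0.0] * n_vec for _ in range(n_t)]
        for t, evals in owned.items():
            buf[t] = list(evals)
        buffers.append(buf)
    return global_sum(buffers)


# Two simulated ranks, each owning two of the four timeslices.
ranks = [
    {0: [1.0, 2.0], 1: [3.0, 4.0]},
    {2: [5.0, 6.0], 3: [7.0, 8.0]},
]
table = gather_timeslice_evals(ranks, n_t=4, n_vec=2)
```

Zero-filling the rows a rank does not own is what lets a plain sum act as a gather here; it relies on every timeslice being owned by exactly one rank.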
2b5de5bba5
MdagM operator without norm option
2020-01-27 13:44:30 -05:00
2e85cae74e
Add Jacobi polynomials
2020-01-27 13:43:49 -05:00
76c823781e
Much faster coarsening
2020-01-27 13:43:19 -05:00
114db3b99d
Optional MdagM without norms
2020-01-27 13:42:51 -05:00
49e123dbda
Use explicit linalg calls to get coalesce optimisations on GPU
2020-01-27 12:44:51 -05:00
8cec294ec9
Make CG a bit less verbose, as it was getting annoying in nested algorithms.
Can use Iterative logging if you want to see more
2020-01-27 12:44:04 -05:00
eb5b720e94
Normal Equations can be used in HDCR now
2020-01-27 12:43:29 -05:00
b2736ec80b
Make PrecGCR recursive - it can precondition itself
2020-01-27 12:42:48 -05:00
086256a032
Less sloppy convergence test on PowerMethod
2020-01-27 12:41:59 -05:00
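The entry above does not say which convergence test was adopted. As an illustration only, the sketch below shows one common choice for a power method: stop when the relative change of the Rayleigh-quotient estimate falls below a tolerance. The function name, parameters, and the test itself are assumptions, not the PowerMethod implementation.

```python
import math


def power_method(matvec, v0, tol=1e-10, max_iter=1000):
    """Estimate the dominant eigenvalue of a symmetric operator.

    Returns (eigenvalue estimate, iterations used). The stopping rule,
    relative change of the estimate below tol, is an assumed example.
    """
    norm = math.sqrt(sum(x * x for x in v0))
    v = [x / norm for x in v0]
    lam_old = 0.0
    lam = 0.0
    for it in range(1, max_iter + 1):
        w = matvec(v)
        # Rayleigh quotient <v, A v>; v has unit norm at this point.
        lam = sum(a * b for a, b in zip(v, w))
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        if abs(lam - lam_old) <= tol * abs(lam):
            return lam, it
        lam_old = lam
    return lam, max_iter


def diag_matvec(v):
    """Apply the 2x2 test matrix diag(2, 1)."""
    return [2.0 * v[0], 1.0 * v[1]]


lam, iters = power_method(diag_matvec, [1.0, 1.0])
```

A looser tolerance ends the iteration earlier but leaves a larger error in the estimate, which matters when the result is used to set spectral bounds.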
afc7426f39
Much bigger pointer cache in case of Nvidia due to cost of setting up UVM allocations
2020-01-27 12:41:16 -05:00
7c061e20c9
All directions of Dirac operator for fast coarsening
2020-01-27 12:40:13 -05:00
e5d1c09665
Faster DhopDirAll for little dirac operator coarsening
2020-01-27 12:38:54 -05:00
8016a465ae
Remove extraneous variable
2020-01-27 12:35:37 -05:00
d8b9742092
DhopDirAll for faster matrix elements of little Dirac operator
2020-01-27 12:34:54 -05:00
1bd87c35d7
Read coalescing on Nvidia
2020-01-27 12:29:56 -05:00
fa856c9669
Disable information message
2020-01-27 12:28:46 -05:00
48008e4d8b
Thread coordinate creation loop
2020-01-27 12:28:16 -05:00
55cdb17691
Integer divide for blocking
2020-01-27 12:27:45 -05:00
96671bbb24
Added ability to pass callback to MADWF that is called every inner iteration and allows user to, for example, adjust the inner solver tolerance depending on residual
Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximation with optional restriction to even/odd polynomials
Added implementation of computation of ZMobius parameters
Added Test_zMADWF_prec to test ZMobius in MADWF
2020-01-17 12:45:30 -08:00
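A minimal sketch of the callback pattern described in the entry above, assuming the callback receives the iteration number, the current relative residual, and the current inner tolerance, and returns the tolerance to use next. This signature is hypothetical and is not the MADWF interface; the scalar "operator" and inexact inner solver exist only to make the example runnable.

```python
def outer_solve(apply_op, inner_solve, b, outer_tol, inner_tol,
                callback=None, max_iter=50):
    """Defect-correction loop with a per-iteration user callback.

    Returns the solution and the inner tolerances actually used.
    """
    x = 0.0
    used_tols = []
    for it in range(max_iter):
        r = b - apply_op(x)
        res = abs(r) / abs(b)
        if res < outer_tol:
            break
        if callback is not None:
            # The user may retune the inner solver from the residual.
            inner_tol = callback(it, res, inner_tol)
        used_tols.append(inner_tol)
        x += inner_solve(r, inner_tol)
    return x, used_tols


def apply_op(x):
    """Toy scalar operator A = 4."""
    return 4.0 * x


def inexact_inner_solve(r, tol):
    """Solve A e = r with a relative error equal to tol."""
    return (r / 4.0) * (1.0 - tol)


def tighten_with_residual(it, res, tol):
    """Example policy: never looser than 0.1, tighter as res drops."""
    return min(0.1, res)


x, tols = outer_solve(apply_op, inexact_inner_solve, b=8.0,
                      outer_tol=1e-8, inner_tol=0.1,
                      callback=tighten_with_residual)
```

Passing the policy in as a callable keeps the solver loop unchanged while the tolerance schedule is varied from outside.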
e583035614
Change to interface to minimise comms in evaluating coarse space operator
2020-01-06 11:43:59 -05:00
3c3d6a94f3
Optimising the force term a bit
2020-01-04 03:16:23 -05:00
205ea4bbb2
More verbose Lanczos
2020-01-04 03:13:40 -05:00
039eb7b2eb
Make the force term and coarsening multigrid more optimised
2020-01-04 03:12:17 -05:00
f7e4bd1f6d
Getting more optimised
2020-01-04 03:11:53 -05:00
ba40a3f763
Alternate low pass filter option
2020-01-03 05:29:09 -05:00
c0d8e4dce5
Improved Multigrid for DWF
2019-12-28 10:32:15 -05:00
0ca1992151
Remove warning in tensor layout comparison. Make default names and index names visible for PerambTensor and NoiseTensor
2019-12-20 13:53:27 +00:00
9cfd64c604
Coarse grid on GPU, not fast enough yet. Need a 10x
2019-12-17 05:24:45 -05:00
9aafd20468
Simple block project promote runs faster on GPU
2019-12-17 05:01:39 -05:00
f7373e97a4
Missing conjugate in MooeeInvDag
2019-12-16 10:05:50 +01:00
9e15474999
Accelerator loop attempt at speed up
2019-12-14 05:28:16 -05:00
152b525a4d
Typo fix
2019-12-13 22:44:42 -05:00
d18994eddc
offload more of mgrid to GPU
2019-12-13 22:08:11 -05:00
736b19485e
Faster set up and some dead code ifdef'ed out
2019-12-13 21:30:48 -05:00