nmeyer-ur
d8cea77707
define simd width in header
2020-04-03 19:22:25 +02:00
nmeyer-ur
5f8a76d490
clean up, reduction in acle
2020-04-03 19:18:24 +02:00
nmeyer-ur
28d49a3b60
build problem resolved
2020-04-03 16:52:48 +02:00
nmeyer-ur
b4c624ece6
added A64FX support
2020-04-03 15:43:23 +02:00
2c22db841a
Added momentum scaling to scalar HMC theories in order to follow UKQCD/CPS conventions
2020-04-02 17:38:47 +01:00
Christoph Lehner
856d168e41
global sum over vectors of uint64_t
2020-03-29 07:56:05 -04:00
Christoph Lehner
b6cbdd2aa3
Merge pull request #1 from DanielRichtmann/feature/read-openqcd
...
Feature/read openqcd
2020-03-26 17:39:04 +01:00
Christoph Lehner
a2188ea875
remove debugging printf from WilsonKernelsImplementation
2020-03-26 09:12:36 -04:00
Daniel Richtmann
989af65807
Check in parallel reader for openqcd configs
2020-03-24 11:20:54 +01:00
Christoph Lehner
c9b737a4e7
make trace,adj,transpose unary operators
2020-03-16 17:58:30 -04:00
Daniel Richtmann
037bb6ea73
Check in reader for openqcd configs
...
This reader is suboptimal in the sense that it opens the entire config on every MPI rank.
2020-03-16 14:28:02 +01:00
Carleton DeTar
165c68e28e
Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift
2020-02-29 17:51:51 -06:00
Carleton DeTar
9479bc8486
Make IterationsToComplete and TrueResidual externally accessible
2020-02-19 17:43:57 -06:00
Peter Boyle
8a5c13d5fb
Still fast moving in changes
2020-02-06 17:57:26 -05:00
Peter Boyle
bdccb0c91f
Working 2 types of decomposition
2020-02-06 17:26:55 -05:00
Peter Boyle
68b45f6444
Lower left/upper right region cut paste
2020-02-06 15:50:26 -05:00
Peter Boyle
ef9b3e658a
extra typedef
2020-02-06 15:47:14 -05:00
Peter Boyle
b9ca40cc44
More precise power method at start
2020-02-06 10:09:14 -05:00
Peter Boyle
2f421a5db1
Comment fix
2020-02-06 10:08:27 -05:00
Michael Marshall
c69a3b6ef6
When saving eigenvectors, LapEvec now saves eigenvalues for every timeslice as well.
...
I.e. nT x nVec eigenvalues are saved in FileName.evals.conf.h5.
A new named tensor, "TimesliceEvals", can be used to simplify restoring these from disk.
NB: The changes in BaseIO add support so that Eigen tensors can be easily used in MPI operations, e.g. GlobalSum.
See LapEvec.hpp for an example of how this is done.
2020-01-29 21:20:20 +00:00
Peter Boyle
2b5de5bba5
MdagM operator without norm option
2020-01-27 13:44:30 -05:00
Peter Boyle
2e85cae74e
Add Jacobi polynomials
2020-01-27 13:43:49 -05:00
Peter Boyle
76c823781e
Much faster coarsening
2020-01-27 13:43:19 -05:00
Peter Boyle
114db3b99d
Optional MdagM without norms
2020-01-27 13:42:51 -05:00
Peter Boyle
49e123dbda
Use explicit linalg calls to get coalesce optimisations on GPU
2020-01-27 12:44:51 -05:00
Peter Boyle
8cec294ec9
Make CG a bit less verbose, as it was getting annoying in nested algorithms.
...
Can use Iterative logging if you want to see more
2020-01-27 12:44:04 -05:00
Peter Boyle
eb5b720e94
Normal Equations can be used in HDCR now
2020-01-27 12:43:29 -05:00
Peter Boyle
b2736ec80b
Make PrecGCR recursive - it can precondition itself
2020-01-27 12:42:48 -05:00
Peter Boyle
086256a032
Less sloppy convergence test on PowerMethod
2020-01-27 12:41:59 -05:00
Peter Boyle
afc7426f39
Much bigger pointer cache in case of Nvidia due to cost of setting up UVM allocations
2020-01-27 12:41:16 -05:00
Peter Boyle
7c061e20c9
All directions of Dirac operator for fast coarsening
2020-01-27 12:40:13 -05:00
Peter Boyle
e5d1c09665
Faster DhopDirAll for little dirac operator coarsening
2020-01-27 12:38:54 -05:00
Peter Boyle
8016a465ae
Remove extraneous variable
2020-01-27 12:35:37 -05:00
Peter Boyle
d8b9742092
DhopDirAll for faster matrix elements of little Dirac operator
2020-01-27 12:34:54 -05:00
Peter Boyle
1bd87c35d7
Read coalescing on Nvidia
2020-01-27 12:29:56 -05:00
Peter Boyle
fa856c9669
Disable information message
2020-01-27 12:28:46 -05:00
Peter Boyle
48008e4d8b
Thread coordinate creation loop
2020-01-27 12:28:16 -05:00
Peter Boyle
55cdb17691
Integer divide for blocking
2020-01-27 12:27:45 -05:00
Christopher Kelly
96671bbb24
Added ability to pass a callback to MADWF that is called every inner iteration and allows the user to, for example, adjust the inner solver tolerance depending on the residual
...
Added a general implementation of the Remez algorithm for producing arbitrary rational polynomial approximations, with optional restriction to even/odd polynomials
Added implementation of computation of ZMobius parameters
Added Test_zMADWF_prec to test ZMobius in MADWF
2020-01-17 12:45:30 -08:00
Peter Boyle
e583035614
Change to interface to minimise comms in evaluating coarse space operator
2020-01-06 11:43:59 -05:00
Peter Boyle
3c3d6a94f3
Optimising the force term a bit
2020-01-04 03:16:23 -05:00
Peter Boyle
205ea4bbb2
More verbose Lanczos
2020-01-04 03:13:40 -05:00
Peter Boyle
039eb7b2eb
Make the force term and coarsening multigrid more optimised
2020-01-04 03:12:17 -05:00
Peter Boyle
f7e4bd1f6d
Getting more optimised
2020-01-04 03:11:53 -05:00
Peter Boyle
ba40a3f763
Alternate low pass filter option
2020-01-03 05:29:09 -05:00
Peter Boyle
c0d8e4dce5
Improved Multigrid for DWF
2019-12-28 10:32:15 -05:00
Michael Marshall
0ca1992151
Remove warning in tensor layout comparison. Make default names and index names visible for PerambTensor and NoiseTensor
2019-12-20 13:53:27 +00:00
Peter Boyle
9cfd64c604
Coarse grid on GPU, not fast enough yet. Need a 10x
2019-12-17 05:24:45 -05:00
Peter Boyle
9aafd20468
Simple block project promote runs faster on GPU
2019-12-17 05:01:39 -05:00
gfilaci
f7373e97a4
Missing conjugate in MooeeInvDag
2019-12-16 10:05:50 +01:00