portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2025-06-17 07:17:06 +01:00

Author	SHA1	Message	Date
Peter Boyle	7db8dd7a95	Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet	2023-03-21 16:04:27 -04:00
Christopher Kelly	e82cf1d311	Further prec-change improvements Mixed prec CG algorithm has been modified to precompute precision change workspaces As the original Test_dwf_mixedcg_prec has been coopted to do a performance stability and reproducibility test, requiring the single-prec CG to be run 200 times, I have created a new version of Test_dwf_mixedcg_prec in the solver subdirectory that just does the mixed vs double CG test	2023-02-23 09:45:29 -05:00
Christopher Kelly	1db58a8acc	Precision change improvements Added a new, much faster implementation of precision change that uses (optionally) a precomputed workspace containing pointer offsets that is device resident, such that all lattice copying occurs only on the device and no host<->device transfer is required, other than the pointer table. It also avoids the need to unpack and repack the fields using explicit lane copying. When this new precisionChange is called without a workspace, one will be computed on-the-fly; however it is still considerably faster than the original implementation. In the special case of using double2 and when the Grids are the same, calls to the new precisionChange will automatically use precisionChangeFast, such that there is a single API call for all precision changes. Reliable update and mixed-prec multishift have been modified to precompute precision change workspaces Renamed the original precisionChange as precisionChangeOrig Fixed incorrect pointer offset bug in copyLane Added a test and a benchmark for precisionChange Added a test for reliable update CG	2023-02-21 10:52:42 -05:00
Raoul Hodgson	920a51438d	Added batched Mixed precision CG	2023-02-14 17:04:13 +00:00
Chulwoo Jung	dc6a38f177	Minor cleanup	2022-11-30 17:13:12 -05:00
Chulwoo Jung	82c1ecf60f	Block lanczos added	2022-11-30 16:08:40 -05:00
Peter Boyle	c82b164f6b	Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet	2022-10-04 17:41:48 -04:00
Peter Boyle	5b128a6f9f	MixedPrec Multishift with better precision scheme for GPU	2022-09-23 16:18:47 -04:00
Peter Boyle	3c1c51f9aa	Merge branch 'feature/dirichlet-gparity' into feature/dirichlet	2022-08-31 18:25:34 -04:00
Peter Boyle	8cc3c522c3	Merge pull request #409 from giltirn/feature/dirichlet-gparity-stage Import round 5	2022-08-31 18:22:50 -04:00
Peter Boyle	19cc7653fb	Tracing	2022-08-31 16:57:51 -04:00
Peter Boyle	5752538661	Tracing	2022-08-31 16:57:32 -04:00
Peter Boyle	ca40a1b00b	Tracing	2022-08-31 16:54:55 -04:00
Peter Boyle	659fac9dfb	Tracing hook	2022-08-31 16:54:25 -04:00
Peter Boyle	b99453083d	Updated timing	2022-07-28 11:37:02 -04:00
Christopher Kelly	33e4a0caee	Imported changes from feature/gparity_HMC branch: Rework of WilsonFlow class Fixed logic error in smear method where the step index was initialized to 1 rather than 0, resulting in the logged output value of tau being too large by epsilon Previously smear_adaptive would maintain the current value of tau as a class member variable whereas smear would compute it separately; now both methods maintain the current value internally and it is updated by the evolve_step routines. Both evolve methods are now const. smear_adaptive now also maintains the current value of epsilon internally, allowing it to be a const method and also allowing the same class instance to be reused without needing to be reset Replaced the fixed evaluation of the plaquette energy density and plaquette topological charge during the smearing with a highly flexible general strategy where the user can add arbitrary measurements as functional objects that are evaluated at an arbitrary frequency By default the same plaquette-based measurements are performed, but additional example functions are provided where the smearing is performed with different choices of measurement that are returned as an array for further processing Added a method to compute the energy density using the Cloverleaf approach which has smaller discretization errors Added a new tensor utility operation, copyLane, which allows for the copying of a single SIMD lane between two instances of the same tensor type but potentially different precisions To LocalCoherenceLanczos, added the option to compute the high/low eval of the fine operator on every restart to aid in tuning the Chebyshev Added Test_field_array_io which demonstrates and tests a single-file write of an arbitrary array of fields Added Test_evec_compression which generates evecs using Lanczos and attempts to compress them using the local coherence technique Added Test_compressed_lanczos_gparity which demonstrates the local coherence Lanczos for G-parity BCs Added HMC main programs for the 40ID and 48ID G-parity lattices	2022-07-01 14:12:12 -04:00
Peter Boyle	751a4562d7	Timing improvement	2022-07-01 09:41:43 -04:00
Peter Boyle	dc000d10ee	Spelling correction	2022-06-27 12:14:57 -04:00
Peter Boyle	3685f391cf	More verbose CG	2022-06-27 12:11:08 -04:00
Peter Boyle	8208a6214f	Merge branch 'feature/dirichlet-gparity' into feature/dirichlet	2022-06-15 19:23:48 -04:00
Peter Boyle	e9648a1635	Useful periodic print. CG convergence bound is remarkably accurate on low eigenvalue in numerical tests	2022-06-14 23:40:04 -04:00
JPRichings	79e34b3eb4	Local Coherence batch deflation	2022-05-19 14:53:17 +01:00
James Richings	b051e00de0	Additional Local Coherence Deflation operator()	2022-05-16 00:25:13 +01:00
Christopher Kelly	6121397587	Imported changes from feature/gparity_HMC branch: Added storage of final true residual in mixed-prec CG and enhanced log output Fixed const correctness of multi-shift constructor Added a mixed precision variant of the multi-shift algorithm that uses a single precision operator and applies periodic reliable update to the residual Added tests/solver/Test_dwf_multishift_mixedprec to test the above Fixed local coherence lanczos using the (large!) max approx to the chebyshev eval as the scale from which to judge the quality of convergence, resulting a test that always passes Added a method to local coherence lanczos class that returns the fine eval/evec pair Added iterative log output to power method Added optional disabling of the plaquette check in Nerscio to support loading old G-parity configs which have a factor of 2 error in the plaquette G-parity Dirac op no longer allows GPBC in the time direction; instead we toggle between periodic and antiperiodic Replaced thread_for G-parity 5D force insertion implementation with accelerator_for version capable of running on GPUs Generalized tests/lanczos/Test_dwf_lanczos to support regular DWF as well as Gparity, with the action chosen by a command line option Modified tests/forces/Test_dwf_gpforce,Test_gpdwf_force,Test_gpwilson_force to use GPBC a spatial direction rather than the t-direction, and antiperiodic BCs for time direction tests/core/Test_gparity now supports using APBC in time direction using command line toggle	2022-05-09 16:27:57 -04:00
Peter Boyle	a4ce6e42c7	Warning free compile on make all and make tests under nvcc	2021-10-27 00:27:03 +01:00
Peter Boyle	ba7e371b90	Warning free compile on Tursa. Hopefully got all reqd virtual dtors	2021-10-21 19:56:52 +01:00
Antonin Portelli	6c66b8d997	deflated guesser can optionally be used with less vectors than provided	2021-09-30 19:25:12 +01:00
Antonin Portelli	9523ad3d73	vector version of Schur solver use vector guesser	2021-09-28 12:45:47 +01:00
Peter Boyle	c15493218d	Two extra routines to break out SchurRedBlack on many RHS into stages to allow efficient deflation & split grid Split grid solver still to do.	2021-09-15 19:24:39 +01:00
Peter Boyle	c48da35921	Memory Vector UVM and Lattice alignedAllocator separate	2020-06-22 20:21:53 -04:00
Peter Boyle	cdf0a04fc5	Merge branch 'develop' into sycl	2020-06-09 04:00:12 -04:00
Peter Boyle	1a4c8c3387	Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.	2020-06-05 18:52:35 -04:00
Peter Boyle	1c9f20b15e	Views must be closed	2020-06-03 09:10:29 -04:00
Peter Boyle	7860a50f70	Make view specify where and drive data motion - first cut. This is a compile time option --enable-unified=yes/no	2020-05-21 16:13:16 -04:00
Peter Boyle	f8b8e00090	Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc Aim to reduce the amount of cuda and other code variations floating around all over the place. Will move GpuInit into Accelerator.cc from Init.cc Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows	2020-05-08 06:23:55 -07:00
Christoph Lehner	3c6ffcb48c	Merge branch 'develop' into feature/gpt	2020-05-06 15:03:35 +02:00
Christoph Lehner	e9b295f967	Synchronize blocking infrastructure with GPT	2020-05-06 08:42:28 -04:00
Peter Boyle	9b2d2d0fc3	Basis rotate stack passing to GPU reduction	2020-04-30 12:31:07 -04:00
Peter Boyle	90229cfb0f	Merge pull request #270 from milc-qcd/feature/CGinfo feature/CGinfo	2020-04-16 11:46:08 -04:00
Peter Boyle	0475c46ecb	Merge pull request #256 from djm2131/feature/BiCGSTAB Import BiCGSTAB solvers and tests	2020-04-16 11:45:15 -04:00
Peter Boyle	11dec4883c	Don't throw assert	2020-04-10 11:09:11 -04:00
Peter Boyle	afa458c812	Extra solvers	2020-04-10 11:08:19 -04:00
Peter Boyle	dc50190b8f	Faster GPU basis rotation May need to later include Regensburg optimised CPU variant	2020-04-10 11:06:04 -04:00
Carleton DeTar	165c68e28e	Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift	2020-02-29 17:51:51 -06:00
Carleton DeTar	9479bc8486	Make IterationsToComplete and TrueResidual externally accessible	2020-02-19 17:43:57 -06:00
Peter Boyle	b9ca40cc44	More precise power method at start	2020-02-06 10:09:14 -05:00
Peter Boyle	8cec294ec9	Make CG a bit less verbose as getting annoying in nested algorithms. Can use Iterative logging if you want to see more	2020-01-27 12:44:04 -05:00
Peter Boyle	eb5b720e94	Normal Equations can be used in HDCR now	2020-01-27 12:43:29 -05:00
Peter Boyle	b2736ec80b	Make PrecGCR recursive - it can precondition itself	2020-01-27 12:42:48 -05:00
Peter Boyle	086256a032	Less sloppy convergence test on PowerMethod	2020-01-27 12:41:59 -05:00

133 Commits