portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-03-12 07:26:12 +00:00

Author	SHA1	Message	Date
Peter Boyle	3c1c51f9aa	Merge branch 'feature/dirichlet-gparity' into feature/dirichlet	2022-08-31 18:25:34 -04:00
Peter Boyle	8cc3c522c3	Merge pull request #409 from giltirn/feature/dirichlet-gparity-stage Import round 5	2022-08-31 18:22:50 -04:00
Peter Boyle	19cc7653fb	Tracing	2022-08-31 16:57:51 -04:00
Peter Boyle	5752538661	Tracing	2022-08-31 16:57:32 -04:00
Peter Boyle	ca40a1b00b	Tracing	2022-08-31 16:54:55 -04:00
Peter Boyle	659fac9dfb	Tracing hook	2022-08-31 16:54:25 -04:00
Peter Boyle	b99453083d	Updated timing	2022-07-28 11:37:02 -04:00
Christopher Kelly	33e4a0caee	Imported changes from feature/gparity_HMC branch: Rework of WilsonFlow class. Fixed logic error in smear method where the step index was initialized to 1 rather than 0, resulting in the logged output value of tau being too large by epsilon. Previously smear_adaptive would maintain the current value of tau as a class member variable whereas smear would compute it separately; now both methods maintain the current value internally and it is updated by the evolve_step routines. Both evolve methods are now const. smear_adaptive now also maintains the current value of epsilon internally, allowing it to be a const method and also allowing the same class instance to be reused without needing to be reset. Replaced the fixed evaluation of the plaquette energy density and plaquette topological charge during the smearing with a highly flexible general strategy where the user can add arbitrary measurements as functional objects that are evaluated at an arbitrary frequency. By default the same plaquette-based measurements are performed, but additional example functions are provided where the smearing is performed with different choices of measurement that are returned as an array for further processing. Added a method to compute the energy density using the Cloverleaf approach which has smaller discretization errors. Added a new tensor utility operation, copyLane, which allows for the copying of a single SIMD lane between two instances of the same tensor type but potentially different precisions. To LocalCoherenceLanczos, added the option to compute the high/low eval of the fine operator on every restart to aid in tuning the Chebyshev. Added Test_field_array_io which demonstrates and tests a single-file write of an arbitrary array of fields. Added Test_evec_compression which generates evecs using Lanczos and attempts to compress them using the local coherence technique. Added Test_compressed_lanczos_gparity which demonstrates the local coherence Lanczos for G-parity BCs. Added HMC main programs for the 40ID and 48ID G-parity lattices	2022-07-01 14:12:12 -04:00
Peter Boyle	751a4562d7	Timing improvement	2022-07-01 09:41:43 -04:00
Peter Boyle	dc000d10ee	Spelling correction	2022-06-27 12:14:57 -04:00
Peter Boyle	3685f391cf	More verbose CG	2022-06-27 12:11:08 -04:00
Peter Boyle	8208a6214f	Merge branch 'feature/dirichlet-gparity' into feature/dirichlet	2022-06-15 19:23:48 -04:00
Peter Boyle	e9648a1635	Useful periodic print. CG convergence bound is remarkably accurate on low eigenvalue in numerical tests	2022-06-14 23:40:04 -04:00
JPRichings	79e34b3eb4	Local Coherence batch deflation	2022-05-19 14:53:17 +01:00
James Richings	b051e00de0	Additional Local Coherence Deflation operator()	2022-05-16 00:25:13 +01:00
Christopher Kelly	6121397587	Imported changes from feature/gparity_HMC branch: Added storage of final true residual in mixed-prec CG and enhanced log output. Fixed const correctness of multi-shift constructor. Added a mixed precision variant of the multi-shift algorithm that uses a single precision operator and applies periodic reliable update to the residual. Added tests/solver/Test_dwf_multishift_mixedprec to test the above. Fixed local coherence lanczos using the (large!) max approx to the chebyshev eval as the scale from which to judge the quality of convergence, resulting in a test that always passes. Added a method to local coherence lanczos class that returns the fine eval/evec pair. Added iterative log output to power method. Added optional disabling of the plaquette check in Nerscio to support loading old G-parity configs which have a factor of 2 error in the plaquette. G-parity Dirac op no longer allows GPBC in the time direction; instead we toggle between periodic and antiperiodic. Replaced thread_for G-parity 5D force insertion implementation with accelerator_for version capable of running on GPUs. Generalized tests/lanczos/Test_dwf_lanczos to support regular DWF as well as Gparity, with the action chosen by a command line option. Modified tests/forces/Test_dwf_gpforce,Test_gpdwf_force,Test_gpwilson_force to use GPBC in a spatial direction rather than the t-direction, and antiperiodic BCs for time direction. tests/core/Test_gparity now supports using APBC in time direction using command line toggle	2022-05-09 16:27:57 -04:00
Peter Boyle	a4ce6e42c7	Warning free compile on make all and make tests under nvcc	2021-10-27 00:27:03 +01:00
Peter Boyle	ba7e371b90	Warning free compile on Tursa. Hopefully got all reqd virtual dtors	2021-10-21 19:56:52 +01:00
Peter Boyle	749b8022a4	Linear operator and SparseMatrix virtual destructors	2021-10-15 20:47:18 +01:00
Antonin Portelli	6c66b8d997	deflated guesser can optionally be used with less vectors than provided	2021-09-30 19:25:12 +01:00
Antonin Portelli	9523ad3d73	vector version of Schur solver use vector guesser	2021-09-28 12:45:47 +01:00
Antonin Portelli	73a95fa96f	LinearFunction loops over vectors by default, can be overloaded	2021-09-28 12:44:26 +01:00
Peter Boyle	9d2238148c	Merge branch 'develop' of https://www.github.com/paboyle/Grid into develop	2021-09-15 19:25:57 +01:00
Peter Boyle	c15493218d	Two extra routines to break out SchurRedBlack on many RHS into stages to allow efficient deflation & split grid. Split grid solver still to do.	2021-09-15 19:24:39 +01:00
Christoph Lehner	c50f27e68b	Make FFT play nice with split grid	2021-06-20 11:34:38 +02:00
Christoph Lehner	2bb374daea	hip-friendly	2021-03-19 11:33:23 +01:00
Peter Boyle	281ac5fc12	Red black support on coars	2021-01-14 20:48:08 -05:00
Daniel Richtmann	4d2dc7ba03	Enable even-odd for CoarsenedMatrix	2020-09-11 20:32:02 +02:00
Daniel Richtmann	cf3535d16e	Expose more functions in CMat	2020-08-27 14:06:48 +02:00
Daniel Richtmann	b2087f14c4	Fix CoarsenedMatrix regarding illegal memory accesses. Need a reference to geom since the lambda copies the this pointer which points to host memory, see https://docs.nvidia.com/cuda/cuda-c-programming-guide/#star-this-capture and https://devblogs.nvidia.com/new-compiler-features-cuda-8/	2020-08-24 17:46:47 +02:00
Daniel Richtmann	dd1ba266b2	Fix mapping between dir + disp and point in CMat	2020-08-24 17:46:46 +02:00
Daniel Richtmann	1292d59563	Add a typedef + broaden interface of CMat	2020-08-24 17:46:45 +02:00
Peter Boyle	c48da35921	Memory Vector UVM and Lattice alignedAllocator separate	2020-06-22 20:21:53 -04:00
Peter Boyle	1a74816c25	Hopefully fixed	2020-06-19 17:50:52 -04:00
Peter Boyle	228fd450ce	Typo fix (excusee - my keyboard is starting to break)	2020-06-19 17:36:05 -04:00
Peter Boyle	b949cf6b12	PeekLocal needs a view to keep thread safe. ALLOCATION_CACHEE reenable	2020-06-19 17:13:27 -04:00
Christoph Lehner	b5e87e8d97	summit compile fixes	2020-06-12 18:16:12 -04:00
Peter Boyle	cdf0a04fc5	Merge branch 'develop' into sycl	2020-06-09 04:00:12 -04:00
Peter Boyle	1a4c8c3387	Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.	2020-06-05 18:52:35 -04:00
Peter Boyle	1c9f20b15e	Views must be closed	2020-06-03 09:10:29 -04:00
Peter Boyle	7860a50f70	Make view specify where and drive data motion - first cut. This is a compile time option --enable-unified=yes/no	2020-05-21 16:13:16 -04:00
Peter Boyle	82f71643a4	Remove the norm in MdagM	2020-05-12 17:55:53 -04:00
Peter Boyle	ea08f193e7	Allocator cache split into large/small pools	2020-05-10 05:24:26 -04:00
Daniel Richtmann	ab0c5d77fb	Correct NonHermitianSchurOperatorBase	2020-05-08 16:44:02 +02:00
Peter Boyle	f8b8e00090	Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc. Aim to reduce the amount of cuda and other code variations floating around all over the place. Will move GpuInit into Accelerator.cc from Init.cc. Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows	2020-05-08 06:23:55 -07:00
Peter Boyle	1d65e2f62c	Slightly faster Chebyshev; ifdef'ed out the fastest until tested numerics. Lifted from HDCR setup	2020-05-08 09:20:54 -04:00
Peter Boyle	21ca182c36	Comments remove	2020-05-08 09:18:24 -04:00
Christoph Lehner	3c6ffcb48c	Merge branch 'develop' into feature/gpt	2020-05-06 15:03:35 +02:00
Christoph Lehner	e9b295f967	Synchronize blocking infrastructure with GPT	2020-05-06 08:42:28 -04:00
Peter Boyle	9b2d2d0fc3	Basis rotate stack passing to GPU reduction	2020-04-30 12:31:07 -04:00

126 Commits