portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2025-08-16 19:21:53 +01:00

Author	SHA1	Message	Date
Peter Boyle	3668e81c5e	Extract slice working on checkerboard field for Block Lanczos	2024-08-27 11:31:30 -04:00
Peter Boyle	44b466e072	Make InsertSliceFast the default at some point in future. Should I do this now?	2024-02-21 14:51:24 -05:00
Peter Boyle	addc638856	Fast localCopyRegion, blockProjectFast	2024-01-22 17:40:38 -05:00
Peter Boyle	ca5ae8a2e6	Revert to working.	2024-01-17 16:32:05 -05:00
Peter Boyle	b7c7000d0d	Don't need the numerical rounding tolerance in multigrid	2023-12-22 18:10:23 -05:00
Peter Boyle	9feb801bb9	Much simpler GPU implementation	2023-12-21 15:24:06 -05:00
Peter Boyle	48d1f0df89	Optimised partially, working	2023-12-21 12:33:47 -05:00
Peter Boyle	2c54be651c	Further updates	2023-11-29 09:43:29 -05:00
Peter Boyle	0a3682ad0b	MultiRHS work	2023-11-28 07:43:37 -05:00
Peter Boyle	031f85247c	multRHS initial support -- needs optimisation for multi project/promote. Bug fix in freeing intermediate grids to stop double free	2023-11-23 18:18:35 -05:00
Peter Boyle	aa5047a9e4	Faster blockProject blockPromote	2023-10-24 10:49:55 -04:00
Peter Boyle	3bc2da5321	Merge branch 'feature/scidac-wp1' of https://github.com/paboyle/Grid into feature/scidac-wp1	2023-10-05 16:57:59 -04:00
Peter Boyle	59b9d0e030	coalesceRead the blockSum	2023-10-05 16:54:48 -04:00
Peter Boyle	6a87487544	Running on Frontier, fix RNG big volume y2k, affecting 5D RNG	2023-10-05 16:50:59 -04:00
Christopher Kelly	f44dce390f	Implemented accelerator-optimized versions of localCopyRegion and insertSliceLocal to speed up padding. Fixed const correctness on PaddedCell methods. Fixed compile issues on Crusher. Added timing breakdowns for PaddedCell::Expand and the padded implementations of the staples, visible under --log Performance. Optimized kernel for StaplePadded. Test_iwasaki_action_newstaple now repeats the calculation 10 times and reports average timings.	2023-06-27 14:58:10 -04:00
Peter Boyle	9c8750f261	Merge branch 'develop' of https://github.com/paboyle/Grid into develop	2023-05-11 12:29:09 -04:00
Peter Boyle	f534523ede	Debug	2023-05-11 12:23:11 -04:00
Peter Boyle	2376156fbc	Merge branch 'develop' into feature/dirichlet	2023-03-27 21:33:50 -07:00
Raoul Hodgson	a3e935c902	Batched block project/promote size checks	2023-02-27 11:38:16 +00:00
Christopher Kelly	1db58a8acc	Precision change improvements: Added a new, much faster implementation of precision change that uses (optionally) a precomputed workspace containing pointer offsets that is device resident, such that all lattice copying occurs only on the device and no host<->device transfer is required, other than the pointer table. It also avoids the need to unpack and repack the fields using explicit lane copying. When this new precisionChange is called without a workspace, one will be computed on-the-fly; however it is still considerably faster than the original implementation. In the special case of using double2 and when the Grids are the same, calls to the new precisionChange will automatically use precisionChangeFast, such that there is a single API call for all precision changes. Reliable update and mixed-prec multishift have been modified to precompute precision change workspaces. Renamed the original precisionChange as precisionChangeOrig. Fixed incorrect pointer offset bug in copyLane. Added a test and a benchmark for precisionChange. Added a test for reliable update CG.	2023-02-21 10:52:42 -05:00
Raoul Hodgson	be528b6d27	Add batched block project/promote functions	2023-02-14 14:37:10 +00:00
Peter Boyle	204c283e16	Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet	2022-10-11 14:59:07 -04:00
Peter Boyle	551a5f8dc8	RRII gpu option	2022-10-11 14:44:55 -04:00
Peter Boyle	c82b164f6b	Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet	2022-10-04 17:41:48 -04:00
Peter Boyle	7ffbc3e98e	Double2 improved. Really don't like 'convertType' - localise to a GPT header	2022-09-27 10:35:31 -04:00
Peter Boyle	e4c117aabf	Compile fix, multishift mixed prec support	2022-09-23 16:19:27 -04:00
Christopher Kelly	1ad54d049d	To PeriodicBC and ConjugateBC, added a new function "CshiftLink" which performs a boundary-aware C-shift of links or products of links. For the latter, the links crossing the global boundary are complex-conjugated. To the gauge implementations, added CshiftLink functions calling into the appropriate operation for the BC in a given direction. GaugeTransform, FourierAcceleratedGaugeFixer and WilsonLoops::FieldStrength no longer implicitly assume periodic boundary conditions; instead the shifted link is obtained using CshiftLink and is aware of the gauge implementation. Added an assert-check to ensure that the gauge fixing converges within the specified number of steps. Added functionality to compute the timeslice averaged plaquette. Added functionality to compute the 5LI topological charge and timeslice topological charge. Added a check of the properties of the charge conjugation matrix C=-gamma_2 gamma_4 to Test_gamma. Fixed const correctness for Replicate. Modified Test_fft_gfix to support either conjugate or periodic BCs, optionally disabling Fourier-accelerated gauge fixing, and tuning of alpha using cmdline options.	2022-06-02 15:30:41 -04:00
Henrique B.R	7e130076d6	Fixed line left behind	2021-09-24 17:26:31 +01:00
Henrique B.R	a822c48565	Added accelerated pick-set checkerboard functions	2021-09-24 17:13:25 +01:00
Christoph Lehner	e2abbf9520	Merge pull request #25 from paboyle/develop Sync	2021-09-15 10:02:43 +02:00
Christoph Lehner	2bb374daea	hip-friendly	2021-03-19 11:33:23 +01:00
Michael Marshall	3215d88a91	Simplify syntax with Grid::EnableIf post code review. Updated EnableIf so that ReturnType defaults to void in the same way as std::enable_if; see https://en.cppreference.com/w/cpp/types/enable_if	2021-02-03 15:17:03 +00:00
Michael Marshall	77063418da	Fix issue for GPU by ensuring accelerator_inline version of convertType is available for Grid::complex<T>. This removes many warnings in Hadrons. Simplify the SFINAE syntax and correct convertType for iScalar.	2021-01-25 15:09:36 +00:00
Christoph Lehner	f0dc0f3621	fix compile issue on Qpace3	2020-08-22 13:57:33 +02:00
Christoph Lehner	dbaa24ebf6	further GPU memory access fixes (with this GPT passes all single-rank tests on non-summit GPUs)	2020-08-13 16:14:15 +02:00
Peter Boyle	b949cf6b12	PeekLocal needs a view to keep thread safe. ALLOCATION_CACHE reenable	2020-06-19 17:13:27 -04:00
Christoph Lehner	b5e87e8d97	summit compile fixes	2020-06-12 18:16:12 -04:00
Peter Boyle	a7ffc61e82	acceleratorSIMTlane()	2020-06-10 19:58:33 -04:00
Peter Boyle	cdf0a04fc5	Merge branch 'develop' into sycl	2020-06-09 04:00:12 -04:00
Peter Boyle	1a4c8c3387	Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.	2020-06-05 18:52:35 -04:00
Peter Boyle	7860a50f70	Make view specify where and drive data motion - first cut. This is a compile-time option --enable-unified=yes/no	2020-05-21 16:13:16 -04:00
Christoph Lehner	e9b295f967	Synchronize blocking infrastructure with GPT	2020-05-06 08:42:28 -04:00
Peter Boyle	6cdb09c884	Faster copy region	2020-04-10 11:10:52 -04:00
Peter Boyle	68b45f6444	Lower left/upper right region cut paste	2020-02-06 15:50:26 -05:00
Peter Boyle	1bd87c35d7	Read coalescing on Nvidia	2020-01-27 12:29:56 -05:00
Peter Boyle	9aafd20468	Simple block project promote runs faster on GPU	2019-12-17 05:01:39 -05:00
Peter Boyle	9e15474999	Accelerator loop attempt at speed up	2019-12-14 05:28:16 -05:00
Peter Boyle	152b525a4d	Typo fix	2019-12-13 22:44:42 -05:00
Peter Boyle	d18994eddc	offload more of mgrid to GPU	2019-12-13 22:08:11 -05:00
Peter Boyle	6b692aa726	Thread loops	2019-06-15 08:02:26 +01:00

55 Commits