portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-07-18 08:03:27 +01:00

Author	SHA1	Message	Date
Peter Boyle	c9bb1bf8ea	Passing new BLAS based	2023-12-21 18:31:17 -05:00
Peter Boyle	9e489887cf	General coarse multiRHS move to BLAS implementation	2023-12-21 15:24:48 -05:00
Peter Boyle	abcd6b8cb6	Faster version	2023-12-19 15:17:46 -05:00
Peter Boyle	6835a7f208	Better logging, test on 81 point stencil	2023-11-29 19:20:47 -05:00
Peter Boyle	f59993b979	Nbasis	2023-11-29 09:47:36 -05:00
Peter Boyle	e859a199df	Reduce volume to interior for coarse stencil -- worth up to 4x gain	2023-11-28 10:23:16 -05:00
Peter Boyle	0a3682ad0b	MultiRHS work	2023-11-28 07:43:37 -05:00
Peter Boyle	59abaeb5cd	Time stamp	2023-11-24 12:56:45 -05:00
Peter Boyle	b302ad3d49	multiRHS test in place, passes Yay!	2023-11-23 18:20:15 -05:00
Peter Boyle	09946cf1ba	Improved, works on 48^3 moving to multiRHS optimisations	2023-11-15 18:03:05 -05:00
Peter Boyle	9c9c42d0df	Tests on Frontier with real speed up: 3.5x on 16^3 at mq=0.01	2023-10-20 19:27:13 -04:00
Peter Boyle	0ae4478cd9	Checkpoint the subspace and ldop	2023-10-20 19:27:13 -04:00
Peter Boyle	ae4e705e09	Use random vec as easier for debug	2023-10-20 19:27:13 -04:00
Peter Boyle	2111e7ab5f	Run at physical mass	2023-10-06 21:20:21 -04:00
Peter Boyle	a751c42cc5	Checkpoint restore the setup	2023-10-06 21:03:08 -04:00
Peter Boyle	b58fd80379	I/O for coarse op and reorganise multigrid headers	2023-10-06 13:43:46 -04:00
Peter Boyle	3bc2da5321	Merge branch 'feature/scidac-wp1' of https://github.com/paboyle/Grid into feature/scidac-wp1	2023-10-05 16:57:59 -04:00
Peter Boyle	2d710d6bfd	Optimised parameters for 16^3	2023-10-05 16:56:55 -04:00
Peter Boyle	6532b7f32b	Eliminate older inefficient coarsening implementation	2023-10-05 16:56:15 -04:00
Peter Boyle	fcf5023845	Running on Frontier	2023-10-05 16:50:59 -04:00
Peter Boyle	737d3ffb98	ADEF1 and 1 hop projection	2023-10-03 14:22:18 -04:00
Peter Boyle	8a70314f54	Merge branch 'develop' into feature/scidac-wp1	2023-10-02 17:24:55 -04:00
Peter Boyle	c5f1420dea	Merge remote-tracking branch 'LupoA/develop' into LupoA-develop	2023-10-02 16:22:35 -04:00
Peter Boyle and GitHub	018e6da872	Merge pull request #440 from giltirn/feature/paddedcellgauge Feature/paddedcellgauge	2023-10-02 10:00:42 -04:00
Peter Boyle	e187bcb85c	Updating	2023-09-29 17:10:17 -04:00
Peter Boyle	be18ffe3b4	Further tuning and lanczos	2023-09-27 16:21:58 -04:00
Peter Boyle	3a86cce8c1	Compile	2023-09-27 16:19:18 -04:00
Peter Boyle	37884d369f	Coarse space is expensive, but gives a speed up in fine matrix multiplies now. Down to optimisation	2023-09-25 17:24:19 -04:00
Peter Boyle	9246e653cd	Basic non-local coarsening of operator test	2023-09-25 17:20:58 -04:00
Peter Boyle	b9dcad89e8	Test cases for coarsening with non-local stencil	2023-09-07 10:53:22 -04:00
Peter Boyle	2b43308208	First cut non-local coarsening	2023-08-25 17:38:07 -04:00
Peter Boyle	b8a7004365	Partial fraction test	2023-08-14 15:17:03 -04:00
Julian Lenz	f7b79cdd45	Added test for ProjectSpn	2023-07-03 18:00:32 +01:00
Alessandro Lupo	b92428f05f	better test	2023-07-02 13:34:03 +01:00
Alessandro Lupo	34b11864b6	prettiest tests	2023-07-02 13:25:57 +01:00
Christopher Kelly	f44dce390f	Implemented accelerator-optimized versions of localCopyRegion and insertSliceLocal to speed up padding. Fixed const correctness on PaddedCell methods. Fixed compile issues on Crusher. Added timing breakdowns for PaddedCell::Expand and the padded implementations of the staples, visible under --log Performance. Optimized kernel for StaplePadded. Test_iwasaki_action_newstaple now repeats the calculation 10 times and reports average timings.	2023-06-27 14:58:10 -04:00
Christopher Kelly	6f6844ccf1	Added new StapleAll and RectStapleAll functions that return the staples for all mu as an array. Modified plaq+rectangle gauge actions to use the above. Added a test code to confirm the above changes.	2023-06-26 15:48:47 -04:00
Christopher Kelly	4c6613d72c	Modified RectStapleDouble and RectStapleOptimised to use Gauge-BC respecting CshiftLink. Added test code tests/debug/Test_optimized_staple_gaugebc demonstrating equivalence of above to RectStapleUnoptimised for cconj gauge BCs. Removed optimized staple only being used for periodic gauge BCs; it is now always used.	2023-06-26 10:20:23 -04:00
Alessandro Lupo	cff1f8d3b8	rm unused variables and formatting	2023-06-23 16:04:18 +01:00
Alessandro Lupo	f27d2083cd	adjustments in SUn and Sp2n impl	2023-06-23 15:34:08 +01:00
Alessandro Lupo	de30c4e22a	minor improvements	2023-06-23 10:49:41 +01:00
Christopher Kelly	4241c7d4a3	Imported coalescedReadGeneralPermute GPU implementation from Christoph. Fixed bug in padded staple code where extract was being called on the result before the GPU view was closed. Fixed compile issue with pointer cast in padded staple code. Added timing summaries of padded staple code and timing breakdown of staple implementation to Test_padded_cell_staple.	2023-06-21 16:01:01 -04:00
Christopher Kelly	7b11075102	The user can now specify the implementation of Cshift used by the PaddedCell class through a virtual base class API. Implementations for default (regular Cshift) and for gauge links (which respects the gauge BCs). Fixed const-correctness for PaddedCell and ConjugateGimpl::setDirections. Modified test code for padded-cell implementation of staple, rect-staple to use cconj BCs.	2023-06-20 17:09:56 -04:00
Christopher Kelly	abc658dca5	Added coalescedReadGeneralPermute CPU implementation based on Christoph's GPT code. In a test code, implemented a padded-cell version of the staple and rectangular-staple calculation.	2023-06-20 16:14:25 -04:00
Alessandro Lupo and GitHub	2372275b2c	Merge pull request #36 from LupoA/sp2n/gpu-bugfix Sp2n/gpu bugfix [close #30]	2023-06-20 13:46:00 +01:00
Julian Lenz	5e539e2d54	Forgot some follow-ups on changed signature	2023-06-18 12:37:51 +01:00
Julian Lenz	621e612c30	Fix non-zero ret on device bug	2023-06-16 16:27:49 +01:00
Julian Lenz	8c3792721b	ClangFormat	2023-06-16 15:58:23 +01:00
Alessandro Lupo	c797cbe737	deal with post-merge trauma	2023-06-16 14:20:37 +01:00
Alessandro Lupo	e09dfbf1c2	definitely the right merge upstream/develop	2023-06-16 14:19:46 +01:00
