Peter Boyle | c9bb1bf8ea | Passing new BLAS based | 2023-12-21 18:31:17 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 9e489887cf | General coarse multiRHS move to BLAS implementation | 2023-12-21 15:24:48 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 9feb801bb9 | Much simpler GPU implementation | 2023-12-21 15:24:06 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | c00b495933 | Multigrid | 2023-12-21 15:23:31 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | d22eebe553 | BLAS options | 2023-12-21 15:23:03 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 8bcbd82680 | BLAS based layout and implementation | 2023-12-21 15:21:24 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | dfa617c439 | Batched SGEMM/DGEMM/ZGEMM/CGEMM: HIP, CUDA version and vanilla CPU. One MKL stub in comments, to be tested as different. | 2023-12-21 14:01:18 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 48d1f0df89 | Optimised partially, working | 2023-12-21 12:33:47 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | b75cb7a12c | BLAS batched partial implementation on Frontier only for now | 2023-12-21 12:31:33 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 332563e037 | Debugged, reducing verbose | 2023-12-21 12:30:57 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 0cce97a4fe | verbosity only | 2023-12-20 21:30:10 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 95a8e4be64 | rocblas | 2023-12-20 21:27:59 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | abcd6b8cb6 | Faster version | 2023-12-19 15:17:46 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | e8f21c9b6d | Memory verbose control improvement | 2023-12-19 15:16:58 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | e054078b11 | Verbose | 2023-12-05 16:15:17 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 6835a7f208 | Better logging, test on 81 point stencil | 2023-11-29 19:20:47 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | f59993b979 | Nbasis | 2023-11-29 09:47:36 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 2290b8f680 | Verbose | 2023-11-29 09:47:04 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 2c54be651c | Further updates | 2023-11-29 09:43:29 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | e859a199df | Reduce volume to interior for coarse stencil -- worth up to 4x gain | 2023-11-28 10:23:16 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 0a3682ad0b | MultiRHS work | 2023-11-28 07:43:37 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 59abaeb5cd | Time stamp | 2023-11-24 12:56:45 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 3e448435d3 | Restrict to interior | 2023-11-23 18:23:29 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | a294bc3c5b | Relax constraints for multiRHS | 2023-11-23 18:20:42 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | b302ad3d49 | multiRHS test in place, passes Yay! | 2023-11-23 18:20:15 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 82fc4b1e94 | Finalise | 2023-11-23 18:19:41 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | b4f1740380 | Finalise message | 2023-11-23 18:19:16 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 031f85247c | multiRHS initial support -- needs optimisation for multi project/promote. Bug fix in freeing intermediate grids to stop double free. | 2023-11-23 18:18:35 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 639cc6f73a | better support for multiRHS coarse space. Still to add restriction of domain of last loop to interior of padded cell (expect about 4.5x on test volume on Crusher). | 2023-11-23 18:16:26 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 09946cf1ba | Improved, works on 48^3 moving to multiRHS optimisations | 2023-11-15 18:03:05 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | f4fa95e7cb | Use 5.3.0 | 2023-11-15 18:01:38 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 100e29e35e | Allow expression as argument to norm2 | 2023-11-15 18:00:44 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 4cbe471a83 | devVector | 2023-11-15 18:00:07 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 8bece1f861 | Faster to transpose the matrix and apply with column major order | 2023-11-15 17:58:38 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | a3ca71ec01 | Lots more setup options, still working on them | 2023-11-15 17:58:04 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | e0543e8af5 | Implement flexible preconditioned CG | 2023-11-15 17:57:39 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | c1eb80d01a | Print which have converged | 2023-11-15 17:57:08 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | a26121d97b | Better printing | 2023-11-15 17:56:45 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 043031a757 | Report resid on failed convergence | 2023-11-15 17:56:22 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 807aeebe4c | Resize tol in constructor | 2023-11-15 17:55:57 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 8aa1a37aad | For Mirs preconditioner solver | 2023-11-15 17:55:32 -05:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 4efa042f50 | C++17 change | 2023-10-24 10:57:50 -04:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | c7cb37e970 | c++17 accepted | 2023-10-24 10:57:24 -04:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | d34b207eab | Avoid HIP warnings | 2023-10-24 10:57:04 -04:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 0e6fa6f6b8 | Don't need the Cshift for the period optimisation | 2023-10-24 10:56:31 -04:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 38b87de53f | This works around a stacksize limit on AMD GPU | 2023-10-24 10:56:07 -04:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | aa5047a9e4 | Faster blockProject blockPromote | 2023-10-24 10:49:55 -04:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 24b6ee0df9 | M4 file | 2023-10-24 10:36:48 -04:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | 1e79cc9cbe | Avoid compiler error | 2023-10-24 10:36:09 -04:00 |
				
			
				
					
						
							
							
								 
Peter Boyle | b3925df9c3 | Verbose on CPU-GPU xfer, remove performance by default | 2023-10-24 10:25:01 -04:00 |