Peter Boyle 
							
						 
					 
					
						
						
							
						
						3d014864e2 
					 
					
						
						
							
							Making LLVM happy  
						
						
						
						
					 
					
						2025-03-06 14:19:25 -05:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						0baaddbe98 
					 
					
						
						
							
							Pipeline mode commit on Aurora. 5+ TF/s on 16^3x32 per tile at 384 nodes.
							More concurrency/fine-grained scheduling is possible. 
						
						
					 
					
						2025-02-04 19:27:26 +00:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						c4fc972fec 
					 
					
						
						
							
							Merge branch 'feature/deprecate-uvm' into develop  
						
						
						
						
					 
					
						2025-01-31 16:32:36 +00:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						8cf809e231 
					 
					
						
						
							
							Best results on Aurora so far  
						
						
						
						
					 
					
						2025-01-31 16:14:45 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5ae77876a8 
					 
					
						
						
							
							Meson field and Aslash field on GPU; some compiler warnings removed  
						
						
						
						
					 
					
						2024-10-18 19:08:06 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						f7d4be8d96 
					 
					
						
						
							
							Calculate bytes correctly  
						
						
						
						
					 
					
						2024-09-26 14:04:44 -04:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						066544281f 
					 
					
						
						
							
							Deprecate UVM  
						
						
						
						
					 
					
						2024-09-17 13:34:27 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						53573d7d94 
					 
					
						
						
							
							Better benchmark  
						
						
						
						
					 
					
						2024-08-20 14:31:57 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						f8f408e7a9 
					 
					
						
						
							
							BLAS everywhere  
						
						
						
						
					 
					
						2024-07-25 18:09:02 +00:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						517822fdd2 
					 
					
						
						
							
							SPR HBM benchmarking right and also PVC batched GEMM  
						
						
						
						
					 
					
						2024-03-06 00:02:27 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c805f86343 
					 
					
						
						
							
							USQCD benchmark  
						
						
						
						
					 
					
						2024-03-01 00:05:04 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						04ca065281 
					 
					
						
						
							
							Only one rank opens  
						
						
						
						
					 
					
						2024-02-29 20:09:11 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						88d8fa43d7 
					 
					
						
						
							
							Benchmark development  
						
						
						
						
					 
					
						2024-02-29 20:01:44 -05:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						303b83cdb8 
					 
					
						
						
							
							Scaling benchmarks, verbosity and MPICH aware in acceleratorInit()  
						
						
						
						
						For some reason the Dirichlet benchmark fails on several nodes; need to debug this. 
						
						
					 
					
						2024-02-13 19:48:03 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						14643c0aab 
					 
					
						
						
							
							SDCC benchmarking scripts for A100 nodes and IceLake nodes (AVX512)  
						
						
						
						
					 
					
						2023-12-04 15:45:57 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						86dac5ff4f 
					 
					
						
						
							
							Better printing  
						
						
						
						
					 
					
						2023-04-04 07:42:19 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						900e01f49b 
					 
					
						
						
							
							Temporary  
						
						
						
						
					 
					
						2023-03-27 21:35:06 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						23298acb81 
					 
					
						
						
							
							Merge pull request  #424  from giltirn/feature/dirichlet-precchange  
						
						
						
						
						Precision change implementation 
						
						
					 
					
						2023-03-22 23:04:52 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b5b759df73 
					 
					
						
						
							
							Merge branch 'develop' into feature/dirichlet  
						
						
						
						
					 
					
						2023-03-21 16:05:46 -04:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						1db58a8acc 
					 
					
						
						
							
							Precision change improvements  
						
						
						
						
						Added a new, much faster implementation of precision change that uses (optionally) a precomputed workspace containing pointer offsets that is device resident, such that all lattice copying occurs only on the device and no host<->device transfer is required, other than the pointer table. It also avoids the need to unpack and repack the fields using explicit lane copying. When this new precisionChange is called without a workspace, one will be computed on-the-fly; however it is still considerably faster than the original implementation.
In the special case of using double2 and when the Grids are the same, calls to the new precisionChange will automatically use precisionChangeFast, such that there is a single API call for all precision changes.
Reliable update and mixed-prec multishift have been modified to precompute precision change workspaces
Renamed the original precisionChange as precisionChangeOrig
Fixed incorrect pointer offset bug in copyLane
Added a test and a benchmark for precisionChange
Added a test for reliable update CG 
						
						
					 
					
						2023-02-21 10:52:42 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						67f569354e 
					 
					
						
						
							
							Partial dirichlet changes  
						
						
						
						
					 
					
						2022-11-30 15:51:13 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						fe6e8f5ac6 
					 
					
						
						
							
							Benchmark_comms fix  
						
						
						
						
					 
					
						2022-11-15 17:00:49 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0ae0e5f436 
					 
					
						
						
							
							Partial Dirichlet test  
						
						
						
						
					 
					
						2022-11-15 16:40:38 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						653039695b 
					 
					
						
						
							
							Partial dirichlet changes  
						
						
						
						
					 
					
						2022-11-15 16:37:15 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c82b164f6b 
					 
					
						
						
							
							Merge branch 'feature/dirichlet' of  https://github.com/paboyle/Grid  into feature/dirichlet  
						
						
						
						
					 
					
						2022-10-04 17:41:48 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						413312f9a9 
					 
					
						
						
							
							Benchmark the halo construction.  
						
						
						
						
						The byte counts are out and should be doubled for SIMD directions 
						
						
					 
					
						2022-10-04 11:12:59 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						25df2d2c3b 
					 
					
						
						
							
							Various precision options  
						
						
						
						
					 
					
						2022-09-27 10:57:12 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						cd5cf6d614 
					 
					
						
						
							
							Tracing replaces self timing hooks  
						
						
						
						
					 
					
						2022-08-31 17:33:41 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c0f8482402 
					 
					
						
						
							
							Remove SSC marks  
						
						
						
						
					 
					
						2022-07-07 17:49:36 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						583f7c52f3 
					 
					
						
						
							
							SSC mark  
						
						
						
						
					 
					
						2022-06-01 19:27:29 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						58a86c9164 
					 
					
						
						
							
							SSC mark removal  
						
						
						
						
					 
					
						2022-06-01 19:27:06 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						18028f4309 
					 
					
						
						
							
							Merge branch 'develop' into feature/dirichlet  
						
						
						
						
					 
					
						2022-05-24 18:26:18 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						aa008cbe99 
					 
					
						
						
							
							Updated for new Dirichlet interface  
						
						
						
						
					 
					
						2022-05-19 16:44:39 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						4b1997e2f3 
					 
					
						
						
							
							wilson sweep test  
						
						
						
						
					 
					
						2022-05-16 15:58:33 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						8939d5dc73 
					 
					
						
						
							
							bugfix: eo operator called in correct location  
						
						
						
						
					 
					
						2022-05-16 00:28:28 +01:00 
						 
				 
			
				
					
						
							
							
								Christoph Lehner 
							
						 
					 
					
						
						
							
						
						e2fc3a0f04 
					 
					
						
						
							
							Merge pull request  #28  from paboyle/develop  
						
						
						
						
						Sync with Upstream 
						
						
					 
					
						2022-03-08 09:58:51 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5340e50427 
					 
					
						
						
							
							HMC running with new formulation  
						
						
						
						
					 
					
						2022-03-01 17:10:25 -05:00 
						 
				 
			
				
					
						
							
							
								Christoph Lehner 
							
						 
					 
					
						
						
							
						
						9616811c3d 
					 
					
						
						
							
							Merge branch 'feature/gpt' of  https://github.com/lehner/Grid  into feature/gpt  
						
						
						
						
					 
					
						2022-02-24 22:03:05 +01:00 
						 
				 
			
				
					
						
							
							
								Christoph Lehner 
							
						 
					 
					
						
						
							
						
						8a3002c03b 
					 
					
						
						
							
							separate left and right masses for CayleyFermion5D  
						
						
						
						
					 
					
						2022-02-24 22:02:56 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0f1c5b08a1 
					 
					
						
						
							
							Dirichlet filters running on AMD and now integrated in Fermion op  
						
						
						
						
					 
					
						2022-02-23 19:29:28 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						70988e43d2 
					 
					
						
						
							
							Passes multinode Dirichlet test with boundaries at the node boundary or at the single rank boundary 
						
						
					 
					
						2022-02-23 01:42:14 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						aab3bcb46f 
					 
					
						
						
							
							Dirichlet first cut - wrong answers on dagger multiply.  
						
						
						
						
						Struggling to get a compute node so changing systems 
						
						
					 
					
						2022-02-22 19:58:33 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						135808dcfa 
					 
					
						
						
							
							Less verbose  
						
						
						
						
					 
					
						2021-12-07 16:24:24 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2bf3b4d576 
					 
					
						
						
							
							Update to reduce memory footprint in benchmark test  
						
						
						
						
					 
					
						2021-12-07 09:02:02 -08:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						ba7e371b90 
					 
					
						
						
							
							Warning free compile on Tursa.  
						
						
						
						
						Hopefully got all reqd virtual dtors 
						
						
					 
					
						2021-10-21 19:56:52 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						8bd70ad8b5 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/paboyle/Grid  into develop  
						
						
						
						
					 
					
						2021-09-16 10:22:38 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b4690e6091 
					 
					
						
						
							
							Adding build basics for different systems  
						
						
						
						
					 
					
						2021-09-16 00:00:38 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c7baeb5bae 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/paboyle/Grid  into develop  
						
						
						
						
					 
					
						2021-09-14 08:31:11 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						361bb8a101 
					 
					
						
						
							
							Remove half prec comms  
						
						
						
						
					 
					
						2021-09-14 15:06:29 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7efdb3cd2b 
					 
					
						
						
							
							Remove half prec comms  
						
						
						
						
					 
					
						2021-09-14 15:06:06 +01:00