2b4399f8b1 
					 
					
						
						
							
							more HOST_NAME_MAX fix  
						
						
						
						
					 
					
						2024-03-07 15:26:01 +09:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						9b5f741e85 
					 
					
						
						
							
							Reproducing CG can be more useful now  
						
						
						
						
					 
					
						2024-03-06 00:03:16 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						436bf1d9d3 
					 
					
						
						
							
							Merge pull request  #455  from clarkedavida/hisq_fat_links  
						
						
						
						
						Hisq fat links 
						
						
					 
					
						2024-02-29 15:29:39 -05:00 
						 
				 
			
				
					
						
							
							
								Dennis Bollweg 
							
						 
					 
					
						
						
							
						
						b507fe209c 
					 
					
						
						
							
							Added SpinColourMatrix case to sliceSum Test  
						
						
						
						
					 
					
						2024-02-27 11:28:32 -05:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						94581e3c7a 
					 
					
						
						
							
							accelerator_for is broken  
						
						
						
						
					 
					
						2024-02-23 15:58:33 -07:00 
						 
				 
			
				
					
						
							
							
								Dennis Bollweg 
							
						 
					 
					
						
						
							
						
						15878f7613 
					 
					
						
						
							
							sliceSumReduction_cub_large now also faster than CPU on Frontier  
						
						
						
						
					 
					
						2024-02-16 13:55:21 -05:00 
						 
				 
			
				
					
						
							
							
								dbollweg 
							
						 
					 
					
						
						
							
						
						6f3455900e 
					 
					
						
						
							
							Adding sliceSumReduction_cub_small/large since hipcub cannot deal with arbitrarily large vobjs  
						
						
						
						
					 
					
						2024-02-16 13:15:02 -05:00 
						 
				 
			
				
					
						
							
							
								dbollweg 
							
						 
					 
					
						
						
							
						
						b5659d106e 
					 
					
						
						
							
							more test cases  
						
						
						
						
					 
					
						2024-02-09 13:37:14 -05:00 
						 
				 
			
				
					
						
							
							
								dbollweg 
							
						 
					 
					
						
						
							
						
						9514035b87 
					 
					
						
						
							
							refactor slicesum: slicesum uses GPU version by default now  
						
						
						
						
					 
					
						2024-02-09 13:02:28 -05:00 
						 
				 
			
				
					
						
							
							
								dbollweg 
							
						 
					 
					
						
						
							
						
						ab2de131bd 
					 
					
						
						
							
							work towards sliceSum for sycl backend  
						
						
						
						
					 
					
						2024-02-06 13:24:45 -05:00 
						 
				 
			
				
					
						
							
							
								Dennis Bollweg 
							
						 
					 
					
						
						
							
						
						b8b9dc952d 
					 
					
						
						
							
							Async memcpy's and cleanup  
						
						
						
						
					 
					
						2024-02-01 17:55:35 -05:00 
						 
				 
			
				
					
						
							
							
								Dennis Bollweg 
							
						 
					 
					
						
						
							
						
						79a6ed32d8 
					 
					
						
						
							
							Use accelerator_for2d and DeviceSegmentedReduce to avoid kernel launch latencies  
						
						
						
						
					 
					
						2024-02-01 16:41:03 -05:00 
						 
				 
			
				
					
						
							
							
								dbollweg 
							
						 
					 
					
						
						
							
						
						caa5f97723 
					 
					
						
						
							
							Add sliceSum gpu using cub/hipcub  
						
						
						
						
					 
					
						2024-01-31 16:50:06 -05:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						4924b3209e 
					 
					
						
						
							
							projectU3 yields a unitary matrix  
						
						
						
						
					 
					
						2024-01-23 14:43:58 -07:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						f5b3d582b0 
					 
					
						
						
							
							first attempt at U3 projection  
						
						
						
						
					 
					
						2024-01-22 02:49:40 -07:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						981c93d67a 
					 
					
						
						
							
							update Test_fatLinks to accept Naik  
						
						
						
						
					 
					
						2024-01-21 21:09:19 -07:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						9cd4128833 
					 
					
						
						
							
							fix naik bug  
						
						
						
						
					 
					
						2023-11-03 14:11:38 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						df9b958c40 
					 
					
						
						
							
							naik now returns separately  
						
						
						
						
					 
					
						2023-10-30 17:40:53 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						3d3376d1a3 
					 
					
						
						
							
							LePage works, trying Naik  
						
						
						
						
					 
					
						2023-10-27 16:26:31 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						21ed6ac0f4 
					 
					
						
						
							
							added floating-point support  
						
						
						
						
					 
					
						2023-10-20 13:54:26 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						7bb8ab7000 
					 
					
						
						
							
							improve smearing templating  
						
						
						
						
					 
					
						2023-10-20 08:41:02 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						391fd9cc6a 
					 
					
						
						
							
							try lepage term  
						
						
						
						
					 
					
						2023-10-17 14:57:15 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						36600899e2 
					 
					
						
						
							
							working 7-link; Grid_log; generalShift  
						
						
						
						
					 
					
						2023-10-12 11:11:39 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						b9c70d156b 
					 
					
						
						
							
							Merge branch 'develop' into hisq_fat_links  
						
						
						
						
					 
					
						2023-10-10 22:44:17 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						eb89579fe7 
					 
					
						
						
							
							Merge remote-tracking branch 'origin/develop' into develop  
						
						
						
						
					 
					
						2023-10-10 22:43:51 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						0cfd13d18b 
					 
					
						
						
							
							7-link working  
						
						
						
						
					 
					
						2023-10-10 22:41:52 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c5f1420dea 
					 
					
						
						
							
							Merge remote-tracking branch 'LupoA/develop' into LupoA-develop  
						
						
						
						
					 
					
						2023-10-02 16:22:35 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						018e6da872 
					 
					
						
						
							
							Merge pull request  #440  from giltirn/feature/paddedcellgauge  
						
						
						
						
						Feature/paddedcellgauge 
						
						
					 
					
						2023-10-02 10:00:42 -04:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						63d9b8e8a3 
					 
					
						
						
							
							Merge remote-tracking branch 'origin/develop' into hisq_fat_links  
						
						
						
						
					 
					
						2023-09-16 23:20:31 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						d247031c98 
					 
					
						
						
							
							try 7-link  
						
						
						
						
					 
					
						2023-09-16 23:18:16 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b8a7004365 
					 
					
						
						
							
							Partial fraction test  
						
						
						
						
					 
					
						2023-08-14 15:17:03 -04:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						99d879ea7f 
					 
					
						
						
							
							5-link first attempt  
						
						
						
						
					 
					
						2023-08-11 22:56:30 -06:00 
						 
				 
			
				
					
						
							
							
								Julian Lenz 
							
						 
					 
					
						
						
							
						
						f7b79cdd45 
					 
					
						
						
							
							Added test for ProjectSpn  
						
						
						
						
					 
					
						2023-07-03 18:00:32 +01:00 
						 
				 
			
				
					
						
							
							
								Alessandro Lupo 
							
						 
					 
					
						
						
							
						
						b92428f05f 
					 
					
						
						
							
							better test  
						
						
						
						
					 
					
						2023-07-02 13:34:03 +01:00 
						 
				 
			
				
					
						
							
							
								Alessandro Lupo 
							
						 
					 
					
						
						
							
						
						34b11864b6 
					 
					
						
						
							
							prettiest tests  
						
						
						
						
					 
					
						2023-07-02 13:25:57 +01:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						9d263d9a7d 
					 
					
						
						
							
							fix bug in HISQSmearing; move benchmark b/c i don't understand how makefiles work  
						
						
						
						
					 
					
						2023-06-28 10:05:34 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						9015c229dc 
					 
					
						
						
							
							add benchmark to see whether matrix multiplication is slower than read from object  
						
						
						
						
					 
					
						2023-06-27 21:28:26 -06:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						f44dce390f 
					 
					
						
						
							
							Implemented accelerator-optimized versions of localCopyRegion and insertSliceLocal to speed up padding  
						
						
						
						
						Fixed const correctness on PaddedCell methods
Fixed compile issues on Crusher
Added timing breakdowns for PaddedCell::Expand and the padded implementations of the staples, visible under --log Performance
Optimized kernel for StaplePadded
Test_iwasaki_action_newstaple now repeats the calculation 10 times and reports average timings 
						
						
					 
					
						2023-06-27 14:58:10 -04:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						a7eabaad56 
					 
					
						
						
							
							rudimentary appendShift convenience method, which allows the user to append an arbitrary shift in one line  
						
						
						
						
					 
					
						2023-06-26 23:59:28 -06:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						eeb4703b84 
					 
					
						
						
							
							develop wrappers to make the stencils easier to construct  
						
						
						
						
					 
					
						2023-06-26 17:45:35 -06:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						6f6844ccf1 
					 
					
						
						
							
							Added new StapleAll and RectStapleAll functions that return the staples for all mu as an array  
						
						
						
						
						Modified plaq+rectangle gauge actions to use the above
Added a test code to confirm the above changes 
						
						
					 
					
						2023-06-26 15:48:47 -04:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						4c6613d72c 
					 
					
						
						
							
							Modified RectStapleDouble and RectStapleOptimised to use Gauge-BC respecting CshiftLink  
						
						
						
						
						Added test code tests/debug/Test_optimized_staple_gaugebc demonstrating equivalence of above to RectStapleUnoptimised for cconj gauge BCs
Removed the restriction that the optimized staple is used only for periodic gauge BCs; it is now always used 
						
						
					 
					
						2023-06-26 10:20:23 -04:00 
						 
				 
			
				
					
						
							
							
								Alessandro Lupo 
							
						 
					 
					
						
						
							
						
						cff1f8d3b8 
					 
					
						
						
							
							rm unused variables and formatting  
						
						
						
						
					 
					
						2023-06-23 16:04:18 +01:00 
						 
				 
			
				
					
						
							
							
								Alessandro Lupo 
							
						 
					 
					
						
						
							
						
						f27d2083cd 
					 
					
						
						
							
							adjustments in SUn and Sp2n impl  
						
						
						
						
					 
					
						2023-06-23 15:34:08 +01:00 
						 
				 
			
				
					
						
							
							
								Alessandro Lupo 
							
						 
					 
					
						
						
							
						
						de30c4e22a 
					 
					
						
						
							
							minor improvements  
						
						
						
						
					 
					
						2023-06-23 10:49:41 +01:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						4241c7d4a3 
					 
					
						
						
							
							Imported coalescedReadGeneralPermute GPU implementation from Christoph  
						
						
						
						
						Fixed bug in padded staple code where extract was being called on the result before the GPU view was closed
Fixed compile issue with pointer cast in padded staple code
Added timing summaries of padded staple code and timing breakdown of staple implementation to Test_padded_cell_staple 
						
						
					 
					
						2023-06-21 16:01:01 -04:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						26b2caf570 
					 
					
						
						
							
							add template parameter to Smear_HISQ_fat for MILC interfacing  
						
						
						
						
					 
					
						2023-06-20 15:37:54 -06:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						7b11075102 
					 
					
						
						
							
							The user can now specify the implementation of Cshift used by the PaddedCell class through a virtual base class API. Implementations are provided for the default (regular Cshift) and for gauge links (which respects the gauge BCs)  
						
						
						
						
						Fixed const-correctness for PaddedCell and ConjugateGimpl::setDirections
Modified test code for padded-cell implementation of staple, rect-staple to use cconj BCs 
						
						
					 
					
						2023-06-20 17:09:56 -04:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						abc658dca5 
					 
					
						
						
							
							Added coalescedReadGeneralPermute CPU implementation based on Christoph's GPT code  
						
						
						
						
						In a test code, implemented a padded-cell version of the staple and rectangular-staple calculation 
						
						
					 
					
						2023-06-20 16:14:25 -04:00 
						 
				 
			
				
					
						
							
							
								david clarke 
							
						 
					 
					
						
						
							
						
						b61ba40023 
					 
					
						
						
							
							Merge remote-tracking branch 'origin/develop' into develop  
						
						
						
						
					 
					
						2023-06-20 13:04:53 -06:00