fionnoh 
							
						 
					 
					
						
						
							
						
						24128ff109 
					 
					
						
						
							
							Changes needed for MF benchmark to work with comms correctly  
						
						
						
						
					 
					
						2018-07-23 15:51:37 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						21a1710b43 
					 
					
						
						
							
							Verbose vector length  
						
						
						
						
					 
					
						2018-07-23 06:08:39 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ec9939c1ba 
					 
					
						
						
							
							Test for faster implementation of meson field inner loop  
						
						
						
						
						It should be possible to cache-block this at outer levels. The global sum across nodes is not performed here;
it is deferred to the caller, who can block all the sums into one big all-reduce.
Nc=3 and Fermion are hard-coded in an ugly way. We might benchmark whether
a product without the conjugate should be made available by Grid.
It is not clear whether the explicit unroll or performing the conjugate on the left once
was the real source of the speed-up.
Gives 70-80 GF/s on my laptop in single precision, half that in double, and 70 GB/s to cache.
This is competitive with dslash and a reasonable stopping point for the optimisation. If necessary we can revisit. 
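
A minimal sketch of the idea described in this commit message, not the code from the commit itself: the left vector is conjugated once, the inner loop then uses a plain product with the colour index unrolled for Nc=3, and only a node-local partial sum is returned so the caller can batch the cross-node reduction. All names here (`ColourVector`, `Field`, `conjugateField`, `localInnerNoConj`, `localMesonRow`) are hypothetical and are not Grid's API.

```cpp
#include <array>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;
constexpr int Nc = 3;                          // hard-coded colour count, as the message notes
using ColourVector = std::array<Complex, Nc>;  // hypothetical stand-in for a per-site object
using Field = std::vector<ColourVector>;       // one colour vector per node-local site

// Conjugate the left field once, so the inner loop needs no conjugation.
Field conjugateField(const Field& in) {
    Field out(in.size());
    for (std::size_t s = 0; s < in.size(); ++s)
        for (int c = 0; c < Nc; ++c) out[s][c] = std::conj(in[s][c]);
    return out;
}

// Node-local sum over sites of leftConj * right, colour loop unrolled by hand.
// No cross-node reduction happens here; that is left to the caller.
Complex localInnerNoConj(const Field& leftConj, const Field& right) {
    Complex acc(0.0, 0.0);
    for (std::size_t s = 0; s < right.size(); ++s) {
        acc += leftConj[s][0] * right[s][0]
             + leftConj[s][1] * right[s][1]
             + leftConj[s][2] * right[s][2];
    }
    return acc;
}

// Partial sums of one left vector against many right vectors.
// The conjugate is computed once and reused for every right vector.
std::vector<Complex> localMesonRow(const Field& left, const std::vector<Field>& rights) {
    const Field leftConj = conjugateField(left);
    std::vector<Complex> row;
    row.reserve(rights.size());
    for (const Field& r : rights) row.push_back(localInnerNoConj(leftConj, r));
    return row;
}
```

In a multi-node run the caller would collect many such rows and reduce them in a single all-reduce rather than one reduction per inner product; that step is omitted here because it depends on the communication layer in use.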
						
						
					 
					
						2018-07-10 12:38:51 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						4b04ae3611 
					 
					
						
						
							
							Printing improvement  
						
						
						
						
					 
					
						2018-07-05 06:59:38 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2f776d51c6 
					 
					
						
						
							
							GPU-specific benchmark saturates memory. Can enhance Grid to do this for expressions,  
						
						
						
						
						but a bit of (known) work. 
						
						
					 
					
						2018-07-05 06:58:37 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						25becc9324 
					 
					
						
						
							
							GPU tweaks for benchmarking; really necessary?  
						
						
						
						
					 
					
						2018-06-13 20:26:07 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						eb921041d0 
					 
					
						
						
							
							Perf count control  
						
						
						
						
					 
					
						2018-05-12 17:57:32 -04:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						bfbf2f1fa0 
					 
					
						
						
							
							no threaded stencil benchmark if OpenMP is not supported  
						
						
						
						
					 
					
						2018-05-03 16:20:01 +01:00 
						 
				 
			
				
					
						
							
							
								Dr Peter Boyle 
							
						 
					 
					
						
						
							
						
						1dddd17e3c 
					 
					
						
						
							
							Benchmark improvements from tesseract  
						
						
						
						
					 
					
						2018-04-27 11:44:46 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						fa0d8feff4 
					 
					
						
						
							
							Performance of CovariantCshift now non-embarrassing.  
						
						
						
						
					 
					
						2018-04-26 17:56:27 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						05b44aef6b 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/paboyle/Grid  into develop  
						
						
						
						
						Conflicts:
	benchmarks/Benchmark_su3.cc 
						
						
					 
					
						2018-04-26 15:38:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						91a0a3f820 
					 
					
						
						
							
							Improvement  
						
						
						
						
					 
					
						2018-04-26 14:48:35 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						8f44c799a6 
					 
					
						
						
							
							Saving the benchmarking tests for Cshift  
						
						
						
						
					 
					
						2018-04-26 14:48:03 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						43f5a0df50 
					 
					
						
						
							
							More timers in the integrator  
						
						
						
						
					 
					
						2018-04-26 12:01:56 +09:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						2baf193031 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/paboyle/Grid  into develop  
						
						
						
						
					 
					
						2018-04-25 00:14:03 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						362ba0443a 
					 
					
						
						
							
							Cshift updates  
						
						
						
						
					 
					
						2018-04-25 00:12:11 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						c5b9147b53 
					 
					
						
						
							
							Correction of a minor bug in the su3 benchmark  
						
						
						
						
					 
					
						2018-04-24 08:03:57 -07:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						a1be533329 
					 
					
						
						
							
							Corrected Flop count in Benchmark su3 and expanded the Wilson flow output  
						
						
						
						
					 
					
						2018-04-24 01:19:53 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b5510427f9 
					 
					
						
						
							
							physical fermion interface, cshift benchmark in SU3.  
						
						
						
						
					 
					
						2018-04-18 01:43:29 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						276f113f28 
					 
					
						
						
							
							IO uses master boss node for metadata.  
						
						
						
						
					 
					
						2018-03-30 16:17:05 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ab6afd18ac 
					 
					
						
						
							
							Still compile if no LIME  
						
						
						
						
					 
					
						2018-03-30 13:39:20 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						c5a885dcd6 
					 
					
						
						
							
							I/O benchmark  
						
						
						
						
					 
					
						2018-03-29 19:57:41 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6fe9b28a82 
					 
					
						
						
							
							Cosmetic  
						
						
						
						
					 
					
						2018-03-24 19:27:14 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b002587d7c 
					 
					
						
						
							
							Simplify  
						
						
						
						
					 
					
						2018-03-24 19:26:44 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6c08385782 
					 
					
						
						
							
							Simplify  
						
						
						
						
					 
					
						2018-03-24 19:26:19 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						a3690071b4 
					 
					
						
						
							
							Warm up GPU  
						
						
						
						
					 
					
						2018-03-22 18:05:20 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5ac96dbdc6 
					 
					
						
						
							
							Warm behaviour in SU3 benchmark  
						
						
						
						
					 
					
						2018-03-20 07:18:31 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						aead94e9a7 
					 
					
						
						
							
							View introduced  
						
						
						
						
					 
					
						2018-03-04 16:39:29 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						36ea5f6b77 
					 
					
						
						
							
							gpu friendly coordinates ; no std::vector on GPU  
						
						
						
						
					 
					
						2018-02-24 22:20:14 +00:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						fb24e3a7d2 
					 
					
						
						
							
							Adding utilities for perf profiling  
						
						
						
						
					 
					
						2018-01-29 11:11:45 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						604c05f4b8 
					 
					
						
						
							
							parallel_for elimination -> thread_loop  
						
						
						
						
					 
					
						2018-01-28 01:01:36 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ce4da83bc2 
					 
					
						
						
							
							Zero changes, literally  
						
						
						
						
					 
					
						2018-01-27 23:51:10 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c4f82e072b 
					 
					
						
						
							
							_grid becomes private; use Grid()  
						
						
						
						
					 
					
						2018-01-27 00:04:12 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						2a4a0e43c1 
					 
					
						
						
							
							Hide internals  
						
						
						
						
					 
					
						2018-01-26 23:08:27 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f4010023ca 
					 
					
						
						
							
							Warning fixes  
						
						
						
						
					 
					
						2018-01-25 23:46:47 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e7cba358c2 
					 
					
						
						
							
							Temporary update to reflect the new dropping of std::vector in Lattice  
						
						
						
						
						Will update again to hide the internals in an interface 
						
						
					 
					
						2018-01-25 23:31:41 +00:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						cff3bae155 
					 
					
						
						
							
							Adding support for general Nc in the benchmark outputs  
						
						
						
						
					 
					
						2018-01-25 13:46:31 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						918c105c57 
					 
					
						
						
							
							NVCC warning elimination  
						
						
						
						
					 
					
						2018-01-24 13:23:59 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						d74c21a386 
					 
					
						
						
							
							Global edit for QCD namespace removal & NAMESPACE macros  
						
						
						
						
					 
					
						2018-01-15 09:37:58 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						9b32d51cd1 
					 
					
						
						
							
							Simplify comms layer proliferation  
						
						
						
						
					 
					
						2018-01-08 11:27:14 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4f8b6f26b4 
					 
					
						
						
							
							Merge branch 'develop' into feature/dwf-multirhs  
						
						
						
						
					 
					
						2017-10-02 11:41:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						bfb68e6f02 
					 
					
						
						
							
							Merge pull request  #130  from giltirn/gparity-handunroll  
						
						
						
						
						Gparity handunroll 
						
						
					 
					
						2017-09-21 10:11:00 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						17c5b0f152 
					 
					
						
						
							
							Patching comparison point  
						
						
						
						
					 
					
						2017-09-16 18:18:07 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b331be9101 
					 
					
						
						
							
							Better reporting  
						
						
						
						
					 
					
						2017-08-31 11:32:57 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						49c20a9fa8 
					 
					
						
						
							
							Patch to reporting  
						
						
						
						
					 
					
						2017-08-31 11:32:21 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						7359df3501 
					 
					
						
						
							
							Full reporting for benchmark; save robustness factor  
						
						
						
						
					 
					
						2017-08-31 10:42:35 +01:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						d36d2fb40d 
					 
					
						
						
							
							Added ability to override default Ls in Benchmark_dwf  
						
						
						
						
					 
					
						2017-08-28 06:53:56 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5b9267e88d 
					 
					
						
						
							
							Cleaner comms benchmark treatment for one node runs  
						
						
						
						
					 
					
						2017-08-27 18:24:48 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						15fd4003ef 
					 
					
						
						
							
							Improving presentation of results  
						
						
						
						
					 
					
						2017-08-27 13:46:02 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ad89abb018 
					 
					
						
						
							
							Fix  
						
						
						
						
					 
					
						2017-08-25 20:43:37 +01:00