paboyle 
							
						 
					 
					
						
						
							
						
						ec9939c1ba 
					 
					
						
						
							
							Test for faster implementation of meson field inner loop  
						
						
						
						
						It should be possible to cache-block this at outer levels. The global sum across nodes is not performed here;
it is deferred to the caller, which can batch the sums into one large all-reduce.
Nc=3 and Fermion are hard-coded in an ugly way. We might benchmark whether
Grid should make available a product without the conjugate.
It is not clear whether the explicit unroll or performing the conjugate on the left once
was the real source of the speed-up.
Gives 70-80 GF/s on my laptop in single precision, half that in double, and 70 GB/s to cache.
This is competitive with dslash and a reasonable stopping point for the optimisation. If necessary we can revisit.
						
						
					 
					
						2018-07-10 12:38:51 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						bfbf2f1fa0 
					 
					
						
						
							
							no threaded stencil benchmark if OpenMP is not supported  
						
						
						
						
					 
					
						2018-05-03 16:20:01 +01:00 
						 
				 
			
				
					
						
							
							
								Dr Peter Boyle 
							
						 
					 
					
						
						
							
						
						1dddd17e3c 
					 
					
						
						
							
							Benchmark improvements from tesseract  
						
						
						
						
					 
					
						2018-04-27 11:44:46 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						fa0d8feff4 
					 
					
						
						
							
							Performance of CovariantCshift now non-embarrassing.  
						
						
						
						
					 
					
						2018-04-26 17:56:27 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						05b44aef6b 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/paboyle/Grid into develop
						
						
						
						
						Conflicts:
	benchmarks/Benchmark_su3.cc 
						
						
					 
					
						2018-04-26 15:38:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						91a0a3f820 
					 
					
						
						
							
							Improvement  
						
						
						
						
					 
					
						2018-04-26 14:48:35 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						8f44c799a6 
					 
					
						
						
							
							Saving the benchmarking tests for Cshift  
						
						
						
						
					 
					
						2018-04-26 14:48:03 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						43f5a0df50 
					 
					
						
						
							
							More timers in the integrator  
						
						
						
						
					 
					
						2018-04-26 12:01:56 +09:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						2baf193031 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/paboyle/Grid into develop
						
						
						
						
					 
					
						2018-04-25 00:14:03 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						362ba0443a 
					 
					
						
						
							
							Cshift updates  
						
						
						
						
					 
					
						2018-04-25 00:12:11 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						c5b9147b53 
					 
					
						
						
							
							Correction of a minor bug in the su3 benchmark  
						
						
						
						
					 
					
						2018-04-24 08:03:57 -07:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						a1be533329 
					 
					
						
						
							
							Corrected Flop count in Benchmark su3 and expanded the Wilson flow output  
						
						
						
						
					 
					
						2018-04-24 01:19:53 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b5510427f9 
					 
					
						
						
							
							physical fermion interface, cshift benchmark in SU3.  
						
						
						
						
					 
					
						2018-04-18 01:43:29 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						276f113f28 
					 
					
						
						
							
							IO uses master boss node for metadata.  
						
						
						
						
					 
					
						2018-03-30 16:17:05 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ab6afd18ac 
					 
					
						
						
							
							Still compile if no LIME  
						
						
						
						
					 
					
						2018-03-30 13:39:20 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						c5a885dcd6 
					 
					
						
						
							
							I/O benchmark  
						
						
						
						
					 
					
						2018-03-29 19:57:41 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						fb24e3a7d2 
					 
					
						
						
							
							Adding utilities for perf profiling  
						
						
						
						
					 
					
						2018-01-29 11:11:45 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						cff3bae155 
					 
					
						
						
							
							Adding support for general Nc in the benchmark outputs  
						
						
						
						
					 
					
						2018-01-25 13:46:31 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						9b32d51cd1 
					 
					
						
						
							
							Simplify comms layer proliferation
						
						
						
						
					 
					
						2018-01-08 11:27:14 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4f8b6f26b4 
					 
					
						
						
							
							Merge branch 'develop' into feature/dwf-multirhs  
						
						
						
						
					 
					
						2017-10-02 11:41:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						bfb68e6f02 
					 
					
						
						
							
							Merge pull request #130 from giltirn/gparity-handunroll
						
						
						
						
						Gparity handunroll 
						
						
					 
					
						2017-09-21 10:11:00 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						17c5b0f152 
					 
					
						
						
							
							Patching comparison point  
						
						
						
						
					 
					
						2017-09-16 18:18:07 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b331be9101 
					 
					
						
						
							
							Better reporting  
						
						
						
						
					 
					
						2017-08-31 11:32:57 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						49c20a9fa8 
					 
					
						
						
							
							Patch to reporting  
						
						
						
						
					 
					
						2017-08-31 11:32:21 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						7359df3501 
					 
					
						
						
							
							Full reporting for benchmark; save robustness factor  
						
						
						
						
					 
					
						2017-08-31 10:42:35 +01:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						d36d2fb40d 
					 
					
						
						
							
							Added ability to override default Ls in Benchmark_dwf  
						
						
						
						
					 
					
						2017-08-28 06:53:56 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5b9267e88d 
					 
					
						
						
							
							Cleaner comms benchmark treatment for one node runs  
						
						
						
						
					 
					
						2017-08-27 18:24:48 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						15fd4003ef 
					 
					
						
						
							
							Improving presentation of results  
						
						
						
						
					 
					
						2017-08-27 13:46:02 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ad89abb018 
					 
					
						
						
							
							Fix  
						
						
						
						
					 
					
						2017-08-25 20:43:37 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						80c5bce5bb 
					 
					
						
						
							
							Merge branch 'develop' into feature/multi-communicator  
						
						
						
						
					 
					
						2017-08-25 20:21:26 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d0f3d525d5 
					 
					
						
						
							
							Optimal block size for KNL  
						
						
						
						
					 
					
						2017-08-25 19:33:54 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						3a58217405 
					 
					
						
						
							
							Updated  
						
						
						
						
					 
					
						2017-08-25 14:29:53 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c289699d9a 
					 
					
						
						
							
							updated from cambridge mpi3 shakeout  
						
						
						
						
					 
					
						2017-08-25 11:41:01 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c3b1263e75 
					 
					
						
						
							
							Benchmark prep  
						
						
						
						
					 
					
						2017-08-25 09:25:54 +01:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						edabb3577f 
					 
					
						
						
							
							Imported Benchmark_gparity  
						
						
						
						
					 
					
						2017-08-23 16:54:06 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ae56e556c6 
					 
					
						
						
							
							finalise issue on new OPA revert  
						
						
						
						
					 
					
						2017-08-20 02:53:12 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						383ca7d392 
					 
					
						
						
							
							Switch off comms for now until feature/multi-communicator is merged  
						
						
						
						
					 
					
						2017-08-20 01:27:48 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a446d95c33 
					 
					
						
						
							
							Trying to pass TeamCity and Travis  
						
						
						
						
					 
					
						2017-08-20 01:10:50 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						be66e7dd95 
					 
					
						
						
							
							Merge branch 'develop' into feature/multi-communicator  
						
						
						
						
					 
					
						2017-08-19 23:12:38 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						bfef525ed2 
					 
					
						
						
							
							New benchmark prep  
						
						
						
						
					 
					
						2017-08-19 23:10:12 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7d88198387 
					 
					
						
						
							
							Merge branch 'develop' into feature/multi-communicator  
						
						
						
						
					 
					
						2017-08-19 13:03:35 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						9e658de238 
					 
					
						
						
							
							Use Vector  
						
						
						
						
					 
					
						2017-08-19 12:52:44 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						14d53e1c9e 
					 
					
						
						
							
							Threaded MPI calls patches  
						
						
						
						
					 
					
						2017-07-29 13:08:10 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						40e119c61c 
					 
					
						
						
							
							NUMA improvements worth preserving from AMD EPYC tests  
						
						
						
						
					 
					
						2017-07-08 22:27:11 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b73bd151bb 
					 
					
						
						
							
							Switch off counters by default  
						
						
						
						
					 
					
						2017-06-30 10:16:35 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						694b305cab 
					 
					
						
						
							
							Update to reporting  
						
						
						
						
					 
					
						2017-06-30 10:16:13 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						6f5a5cd9b3 
					 
					
						
						
							
							Improved threaded comms benchmark  
						
						
						
						
					 
					
						2017-06-28 23:27:02 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						08e04b9676 
					 
					
						
						
							
							Better benchmarks  
						
						
						
						
					 
					
						2017-06-28 15:30:06 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						54e94360ad 
					 
					
						
						
							
							Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit  
						
						
						
						
					 
					
						2017-06-24 23:10:24 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						6ebf9f15b7 
					 
					
						
						
							
							Splitting communicators first cut  
						
						
						
						
					 
					
						2017-06-22 08:14:34 +01:00