Peter Boyle 
							
						 
					 
					
						
						
							
						
						add4495a4a 
					 
					
						
						
							
							cout IO for all types  
						
						
						
						
					 
					
						2015-05-13 09:24:10 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						541d52ab97 
					 
					
						
						
							
							I have made the Cshift work successfully with OpenMP threading in every routine. Collapse(2) is now working under clang-omp++. 
						
						
					 
					
						2015-05-13 00:31:00 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						556befaaaa 
					 
					
						
						
							
							Enhanced SIMD interfacing  
						
						
						
						
					 
					
						2015-05-12 20:41:44 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c6baa3e657 
					 
					
						
						
							
							Threading support rework.  
						
						
						
						
						Placed parallel pragmas as macros; implemented deterministic thread reduction in style of BFM. 
						
						
					 
					
						2015-05-12 07:51:41 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6e6843ac69 
					 
					
						
						
							
							Moving some things around for tidiness  
						
						
						
						
					 
					
						2015-05-11 19:09:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c8dc8ff891 
					 
					
						
						
							
							Adding a better-controlled threading class, preparing to force in deterministic reduction. 
						
						
					 
					
						2015-05-11 18:59:03 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b613ed0bb8 
					 
					
						
						
							
							Got command line args working  
						
						
						
						
					 
					
						2015-05-11 14:36:48 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4eb08ac9de 
					 
					
						
						
							
							CML parse  
						
						
						
						
					 
					
						2015-05-11 12:56:27 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b42453d1fd 
					 
					
						
						
							
							Command line args and a general clean up  
						
						
						
						
					 
					
						2015-05-11 12:43:10 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						06dcbed6b1 
					 
					
						
						
							
							Updated to do list  
						
						
						
						
					 
					
						2015-05-11 09:44:50 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2203c6e597 
					 
					
						
						
							
							Lots of changes required to compile for MIC under ICPC  
						
						
						
						
					 
					
						2015-05-10 23:29:21 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						4da2c2ea00 
					 
					
						
						
							
							Merge branch 'master' of  https://github.com/paboyle/Grid  
						
						
						
						
						Conflicts:
	lib/qcd/Grid_qcd_wilson_dop.cc 
						
						
					 
					
						2015-05-10 15:37:47 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1ec1b4ee44 
					 
					
						
						
							
							Expression template hack  
						
						
						
						
					 
					
						2015-05-10 15:35:30 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1ab92563b9 
					 
					
						
						
							
							Expression template engine  
						
						
						
						
					 
					
						2015-05-10 15:34:20 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5a7751d9df 
					 
					
						
						
							
							Updated TODO list  
						
						
						
						
					 
					
						2015-05-10 15:32:56 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						79c51ac51f 
					 
					
						
						
							
							Hack; must bring norm2 into the unary operator list.  
						
						
						
						
						ETs are still incomplete. 
						
						
					 
					
						2015-05-10 15:30:29 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7119bce9f3 
					 
					
						
						
							
							Default to single node. Move to command line args.  
						
						
						
						
					 
					
						2015-05-10 15:27:38 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						cd90f55536 
					 
					
						
						
							
							Single node default. Should expose this as command line args, but haven't sorted out Grid_initialize to handle this. Should put this on the TODO list. 
						
						
					 
					
						2015-05-10 15:26:06 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						dc7132af71 
					 
					
						
						
							
							Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.  
						
						
						
						
						This is a short term hack while I benchmark. 
						
						
					 
					
						2015-05-10 15:25:23 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						961fbb2718 
					 
					
						
						
							
							Assertion should never hit, but did due to a bug  
						
						
						
						
					 
					
						2015-05-10 15:24:37 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						4a8fd55f52 
					 
					
						
						
							
							Moving operator stuff into a separate file so that we can switch on/off replacement with expression templates 
						
						
					 
					
						2015-05-10 15:23:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						e02cbaa016 
					 
					
						
						
							
							Fixing breakage in the Comms non compile  
						
						
						
						
					 
					
						2015-05-10 15:23:09 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						463c31ae09 
					 
					
						
						
							
							Bringing in expression templates for faster vector loops  
						
						
						
						
					 
					
						2015-05-10 15:22:31 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						3657f2303d 
					 
					
						
						
							
							ET ready benchmark with bytes counted assuming loop interchange  
						
						
						
						
					 
					
						2015-05-10 15:18:04 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						9ed1fb45e1 
					 
					
						
						
							
							Updated todo list  
						
						
						
						
					 
					
						2015-05-10 15:13:50 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						52403d587c 
					 
					
						
						
							
							Wilson perf improvements with Gauge prefetching  
						
						
						
						
					 
					
						2015-05-06 06:37:21 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						cdd5cdeda2 
					 
					
						
						
							
							Cleaned up for Linux  
						
						
						
						
					 
					
						2015-05-05 22:09:22 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						cb4b82b09f 
					 
					
						
						
							
							streaming store cases  
						
						
						
						
					 
					
						2015-05-05 18:14:09 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						cd990ba13d 
					 
					
						
						
							
							Streaming store option  
						
						
						
						
					 
					
						2015-05-05 18:13:06 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						249165d1b2 
					 
					
						
						
							
							Added streaming stores  
						
						
						
						
					 
					
						2015-05-05 18:09:28 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b720222d98 
					 
					
						
						
							
							Updated bandwidth test  
						
						
						
						
					 
					
						2015-05-05 18:08:53 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0e8415de1b 
					 
					
						
						
							
							Added a makefile  
						
						
						
						
					 
					
						2015-05-05 17:56:42 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2b46ad38e2 
					 
					
						
						
							
							Back to vector for now; the cost of the init loop is clear in the a*x + y loop in the memory benchmark, and we must move to a better container class. 
						
						
					 
					
						2015-05-03 09:48:13 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						9d93d1e6d4 
					 
					
						
						
							
							Comms and memory benchmarks added  
						
						
						
						
					 
					
						2015-05-03 09:44:47 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						253362f978 
					 
					
						
						
							
							Added a comms benchmark  
						
						
						
						
					 
					
						2015-05-02 23:51:43 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						ea52562527 
					 
					
						
						
							
							Added a comms benchmark  
						
						
						
						
					 
					
						2015-05-02 23:42:30 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6a39089a43 
					 
					
						
						
							
							Starting a benchmarking sub dir  
						
						
						
						
					 
					
						2015-05-02 17:52:36 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						bdf18941a2 
					 
					
						
						
							
							Improving the byte swap support for portability  
						
						
						
						
					 
					
						2015-05-01 10:57:33 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d904e2b9ac 
					 
					
						
						
							
							Merge branch 'master' of  https://github.com/paboyle/Grid  
						
						
						
						
					 
					
						2015-04-30 16:40:13 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c0ead94791 
					 
					
						
						
							
							Integrated Lebesgue code and have been playing with alternate implementations of the wilson dop, without any particular success in increasing the performance. 
						
						
					 
					
						2015-04-30 16:39:06 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7ac997bd58 
					 
					
						
						
							
							Merge pull request  #1  from mspraggs/patch-1  
						
						
						
						
						Added <map> include to GridNerscIO.h 
						
						
					 
					
						2015-04-30 09:46:48 +01:00 
						 
				 
			
				
					
						
							
							
								mspraggs 
							
						 
					 
					
						
						
							
						
						24fc71b2e9 
					 
					
						
						
							
							Added <map> include to GridNerscIO.h  
						
						
						
						
						Adding this allows clang to compile Grid to completion. 
						
						
					 
					
						2015-04-29 23:44:03 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d8ffa09e3b 
					 
					
						
						
							
							Benchmark wilson dhop now; 14.6GF on one core, not as fast as SU(3)xSU(3) [23GF] but still not too shabby.  
						
						
						
						
						Disassembling output shows ugly sequences in the permute sector. Could comparatively benchmark with and without the if-else structure to see how much I'm losing. Drops to 9GF as it falls out of cache. Moving to Lebesgue ordering should help there. Substantive progress. 
						
						
					 
					
						2015-04-29 06:50:18 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						dcc23faa4a 
					 
					
						
						
							
							Fixed the stencil sector; Wilson now agrees between the stencil-based implementation and the cshift-based implementation. Managed to reduce the volume of code in this sector a little, but consolidation would be good, perhaps taking common logic out into simple helper functions. 
						
						
					 
					
						2015-04-29 06:23:56 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b0485894b3 
					 
					
						
						
							
							Shaken out stencil to the point where I think wilson dslash is correct.  
						
						
						
						
						Need to audit code carefully, consolidate between stencil and cshift, and then benchmark and optimise. 
						
						
					 
					
						2015-04-28 08:11:59 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0b7d389258 
					 
					
						
						
							
							Reworking CSHIFT and Stencil. While implementing Wilson, discovered that rework is required  
						
						
						
						
					 
					
						2015-04-27 13:45:07 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						35cfef2129 
					 
					
						
						
							
							Big updates with progress towards wilson matrix  
						
						
						
						
					 
					
						2015-04-26 15:51:09 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c678f2d255 
					 
					
						
						
							
							Starting the implementation of wilson; incomplete and committing non-functional code which is not yet included from elsewhere or linked to the build system. 
						
						
					 
					
						2015-04-25 14:33:02 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d5fd34b6e8 
					 
					
						
						
							
							Update to TODO list  
						
						
						
						
					 
					
						2015-04-25 13:04:26 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2d8cf9e456 
					 
					
						
						
							
							Added two-spinor functionality required to support the Wilson hopping term.  
						
						
						
						
					 
					
						2015-04-25 12:54:06 +01:00