Peter Boyle 
							
						 
					 
					
						
						
							
						
						58d32a4d0e 
					 
					
						
						
							
							Assertion should never hit, but did due to a bug  
						
						
						
						
					 
					
						2015-05-10 15:24:37 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6bb17502f9 
					 
					
						
						
							
							Moving operator stuff into a separate file so that we can switch on/off replacement with expression templates
						
						
					 
					
						2015-05-10 15:23:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						8299bc39ea 
					 
					
						
						
							
							Fixing breakage in the Comms non compile  
						
						
						
						
					 
					
						2015-05-10 15:23:09 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7f04b85368 
					 
					
						
						
							
							Bringing expression templates for faster vector loops  
						
						
						
						
					 
					
						2015-05-10 15:22:31 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						a115f3b086 
					 
					
						
						
							
							ET ready benchmark with bytes counted assuming loop interchange  
						
						
						
						
					 
					
						2015-05-10 15:18:04 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						27c2d13968 
					 
					
						
						
							
							Updated todo list  
						
						
						
						
					 
					
						2015-05-10 15:13:50 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5415180676 
					 
					
						
						
							
							Wilson perf improvements with Gauge prefetching  
						
						
						
						
					 
					
						2015-05-06 06:37:21 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7b0dd6c5d6 
					 
					
						
						
							
							Cleaned up for Linux  
						
						
						
						
					 
					
						2015-05-05 22:09:22 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						cb4b82b09f 
					 
					
						
						
							
							streaming store cases  
						
						
						
						
					 
					
						2015-05-05 18:14:09 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						cd990ba13d 
					 
					
						
						
							
							Streaming store option  
						
						
						
						
					 
					
						2015-05-05 18:13:06 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						249165d1b2 
					 
					
						
						
							
							Added streaming stores  
						
						
						
						
					 
					
						2015-05-05 18:09:28 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b720222d98 
					 
					
						
						
							
							Updated bandwidth test  
						
						
						
						
					 
					
						2015-05-05 18:08:53 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0e8415de1b 
					 
					
						
						
							
							Added a makefile  
						
						
						
						
					 
					
						2015-05-05 17:56:42 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2b46ad38e2 
					 
					
						
						
							
							Back to vector for now; the cost of the init loop is clear in the a*x + y loop in the memory benchmark, and we must move to a better container class.
						
						
					 
					
						2015-05-03 09:48:13 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						9d93d1e6d4 
					 
					
						
						
							
							Comms and memory benchmarks added  
						
						
						
						
					 
					
						2015-05-03 09:44:47 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						253362f978 
					 
					
						
						
							
							Added a comms benchmark  
						
						
						
						
					 
					
						2015-05-02 23:51:43 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						ea52562527 
					 
					
						
						
							
							Added a comms benchmark  
						
						
						
						
					 
					
						2015-05-02 23:42:30 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6a39089a43 
					 
					
						
						
							
							Starting a benchmarking sub dir  
						
						
						
						
					 
					
						2015-05-02 17:52:36 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						bdf18941a2 
					 
					
						
						
							
							Improving the byte swap support for portability  
						
						
						
						
					 
					
						2015-05-01 10:57:33 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d904e2b9ac 
					 
					
						
						
							
							Merge branch 'master' of  https://github.com/paboyle/Grid  
						
						
						
						
					 
					
						2015-04-30 16:40:13 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c0ead94791 
					 
					
						
						
							
							Integrated Lebesgue code and have been playing with alternate implementations of the wilson dop, without any particular success in increasing the performance.
						
						
					 
					
						2015-04-30 16:39:06 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7ac997bd58 
					 
					
						
						
							
							Merge pull request  #1  from mspraggs/patch-1  
						
						
						
						
						Added <map> include to GridNerscIO.h 
						
						
					 
					
						2015-04-30 09:46:48 +01:00 
						 
				 
			
				
					
						
							
							
								mspraggs 
							
						 
					 
					
						
						
							
						
						24fc71b2e9 
					 
					
						
						
							
							Added <map> include to GridNerscIO.h  
						
						
						
						
						Adding this allows clang to compile Grid to completion. 
						
						
					 
					
						2015-04-29 23:44:03 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d8ffa09e3b 
					 
					
						
						
							
							Benchmark wilson dhop now; 14.6GF on one core, not as fast as SU(3)xSU(3) [23GF] but still not too shabby.  
						
						
						
						
						Disassembling output shows ugly sequences in the permute sector. Could comparatively benchmark with and without
the if-else structure to see how much I'm losing.
Drops to 9GF as it falls out of cache. Moving to Lebesgue ordering should help there. Substantive progress. 
						
						
					 
					
						2015-04-29 06:50:18 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						dcc23faa4a 
					 
					
						
						
							
							Fixed the stencil sector; Wilson now agrees between the stencil-based implementation and the cshift-based implementation.
							Managed to reduce the volume of code in this sector a little, but consolidation would be good, perhaps taking common logic out into simple helper functions.
						
						
					 
					
						2015-04-29 06:23:56 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b0485894b3 
					 
					
						
						
							
							Shaken out stencil to the point where I think wilson dslash is correct.  
						
						
						
						
						Need to audit code carefully, consolidate between stencil and cshift,
and then benchmark and optimise. 
						
						
					 
					
						2015-04-28 08:11:59 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0b7d389258 
					 
					
						
						
							
							Reworking CSHIFT and Stencil. Implementing Wilson and discovered rework is required  
						
						
						
						
					 
					
						2015-04-27 13:45:07 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						35cfef2129 
					 
					
						
						
							
							Big updates with progress towards wilson matrix  
						
						
						
						
					 
					
						2015-04-26 15:51:09 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c678f2d255 
					 
					
						
						
							
							Starting the implementation of wilson; incomplete and committing non-functional code which is not yet included from elsewhere or linked to the build system.
						
						
					 
					
						2015-04-25 14:33:02 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d5fd34b6e8 
					 
					
						
						
							
							Update to TODO list  
						
						
						
						
					 
					
						2015-04-25 13:04:26 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2d8cf9e456 
					 
					
						
						
							
							Added two spinor functionality required to support the Wilson hopping term.  
						
						
						
						
					 
					
						2015-04-25 12:54:06 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						dc970c6442 
					 
					
						
						
							
							Dirac done ; remove from TODO  
						
						
						
						
					 
					
						2015-04-24 22:56:37 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						fc32450360 
					 
					
						
						
							
							Improved the gamma quite a bit.  
						
						
						
						
						Serial RNGs which are set on node zero and broadcast 
						
						
					 
					
						2015-04-24 20:21:40 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2a67214f9d 
					 
					
						
						
							
							static names and enum list  
						
						
						
						
					 
					
						2015-04-24 19:12:14 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						71d5927a66 
					 
					
						
						
							
							Vectors now too, and right multiplication of a matrix by gamma  
						
						
						
						
					 
					
						2015-04-24 19:08:29 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						f2ac20e7ab 
					 
					
						
						
							
							Removed summation  
						
						
						
						
					 
					
						2015-04-24 18:42:44 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						750dd5f5fd 
					 
					
						
						
							
							Cleared the code out from Grid_summation to lattice/Grid_lattice_transfer.h  
						
						
						
						
					 
					
						2015-04-24 18:41:34 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						74432432b6 
					 
					
						
						
							
							Moved code from summation into transfer and reduction  
						
						
						
						
					 
					
						2015-04-24 18:40:44 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b8eef54fa7 
					 
					
						
						
							
							First implementation of Dirac matrices as a Gamma class.  
						
						
						
						
					 
					
						2015-04-24 18:20:03 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						e2e3ea5742 
					 
					
						
						
							
							Reorganised the TODO. Really getting somewhere  
						
						
						
						
					 
					
						2015-04-23 20:42:30 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						4b4dcc4c13 
					 
					
						
						
							
							Rename Grid_QCD  
						
						
						
						
					 
					
						2015-04-23 20:42:09 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						afe6c4f64f 
					 
					
						
						
							
							move  
						
						
						
						
					 
					
						2015-04-23 20:41:22 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						62e8d2d127 
					 
					
						
						
							
							Slice summation working. May move this into lattice/Grid_lattice_reduction however  
						
						
						
						
					 
					
						2015-04-23 15:13:00 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b7416d79e3 
					 
					
						
						
							
							Beginnings of slice summation and subblocking  
						
						
						
						
					 
					
						2015-04-23 11:04:59 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2f8431ab03 
					 
					
						
						
							
							Consolidate index to coor in a single routine  
						
						
						
						
					 
					
						2015-04-23 11:04:19 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						a9e574dd27 
					 
					
						
						
							
							Snippets from Guido to optimise Reduce  
						
						
						
						
					 
					
						2015-04-23 08:31:40 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						73c0db82d5 
					 
					
						
						
							
							Better description of Intel's many ISA targets  
						
						
						
						
					 
					
						2015-04-23 08:02:51 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						eb58297a43 
					 
					
						
						
							
							Fixing endianness on Linux, I hope  
						
						
						
						
					 
					
						2015-04-23 07:51:15 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1851327d19 
					 
					
						
						
							
							Got the NERSC IO working and fixed a bug in cshift.  
						
						
						
						
					 
					
						2015-04-22 22:46:48 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						a5b0c492d7 
					 
					
						
						
							
							Rework of RNG to use C++11 random. Should work correctly, maintaining parallel RNG across a machine.
							If a "fixedSeed" is used, randoms should be reproducible across different machine decompositions, since the generators are physically indexed and assigned in lexico ordering.
						
						
					 
					
						2015-04-19 14:55:58 +01:00