paboyle 
							
						 
					 
					
						
						
							
						
						980ff18956 
					 
					
						
						
							
							Solving the instantiation no compile issue  
						
						
						
						
					 
					
						2016-07-15 17:19:44 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1a6c7204ac 
					 
					
						
						
							
							Disable instantiation; Use cache version instead  
						
						
						
						
					 
					
						2016-07-15 00:34:39 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						49310fbab3 
					 
					
						
						
							
							Done with red black change over  
						
						
						
						
					 
					
						2016-07-15 00:08:43 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						5c0c8efb9e 
					 
					
						
						
							
							Updated file list  
						
						
						
						
					 
					
						2016-07-15 00:02:11 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						dfd714e1ef 
					 
					
						
						
							
							Multiple implementations for the 5d hopping terms, depending on cache friendly
							ops and/or the 5th direction being vectorised.
							All use 4d redblack. 
						
						
					 
					
						2016-07-15 00:00:09 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						79a8ca1a62 
					 
					
						
						
							
							Rewrite for performance. Impl dependent instantiations give:
							- 4d linalg impls of the 5d hopping terms (and inverse)
							- Cache friendly loop orderings of the above
							- Dense matrix stored and applied to the above
							-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
							   and rotate/shift of the Mooee M5D routines. 
						
						
					 
					
						2016-07-14 23:58:15 +01:00 
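The "dense matrix approach" mentioned in this commit can be illustrated with a toy model. The sketch below is not Grid's MooeeInv: it assumes a hypothetical unit lower-bidiagonal operator (1 - a*S), where S shifts s to s-1, and the names `solve_serial`, `dense_inverse` and `apply_dense` are invented for illustration. It shows how a serial recurrence along s can be traded for an Ls x Ls dense matrix whose rows are independent of each other.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Serial forward substitution for (1 - a*S) x = b, where S shifts s -> s-1.
// Each x[s] depends on x[s-1], so the loop over s cannot be vectorised.
Vec solve_serial(double a, const Vec& b) {
    Vec x(b.size());
    for (std::size_t s = 0; s < b.size(); ++s)
        x[s] = b[s] + (s > 0 ? a * x[s - 1] : 0.0);
    return x;
}

// Precompute the dense inverse once: inv[s][t] = a^(s-t) for t <= s, else 0.
Mat dense_inverse(double a, std::size_t Ls) {
    Mat inv(Ls, Vec(Ls, 0.0));
    for (std::size_t s = 0; s < Ls; ++s) {
        double p = 1.0;
        for (std::size_t t = s + 1; t-- > 0;) {  // t = s, s-1, ..., 0
            inv[s][t] = p;
            p *= a;
        }
    }
    return inv;
}

// Apply the dense inverse: Ls^2 multiply-adds, but every row is independent.
Vec apply_dense(const Mat& inv, const Vec& b) {
    assert(inv.size() == b.size());
    Vec x(b.size(), 0.0);
    for (std::size_t s = 0; s < b.size(); ++s)
        for (std::size_t t = 0; t < b.size(); ++t)
            x[s] += inv[s][t] * b[t];
    return x;
}
```

Both routes give the same answer; the dense form does more arithmetic, but it has no dependence between different values of s, which is the property the commit message relies on.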
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						fb45eb2eb2 
					 
					
						
						
							
							5d ls vec rename of impl class  
						
						
						
						
					 
					
						2016-07-14 23:57:26 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a307274c96 
					 
					
						
						
							
							Fermion impl rename for ls vectorised 5d approaches  
						
						
						
						
					 
					
						2016-07-14 23:56:13 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3f2c44a5fe 
					 
					
						
						
							
							Updating the class to 5d selection based on impl type  
						
						
						
						
					 
					
						2016-07-14 23:55:26 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						48fb1cdc11 
					 
					
						
						
							
							Update domain 5d vectorised impl type, move the type over to 4d redblack with
							the dense OO inverse 
						
						
					 
					
						2016-07-14 23:54:35 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8a79e93cc2 
					 
					
						
						
							
							Rename the 5d domain wall fermion vectorised Ls impl class  
						
						
						
						
					 
					
						2016-07-14 23:53:00 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						dd62a61c5c 
					 
					
						
						
							
							Added broadcast and rotation of simd vectors  
						
						
						
						
					 
					
						2016-07-14 23:49:00 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8f47d0b5ab 
					 
					
						
						
							
							Rotation needed for hopping term in fifth dim with Ls vectorised fields  
						
						
						
						
					 
					
						2016-07-14 23:45:36 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						42af132dab 
					 
					
						
						
							
							Fix for Chris Kelly's request to peek/poke on checkerboarded fields  
						
						
						
						
					 
					
						2016-07-14 23:44:48 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						adbc7c1188 
					 
					
						
						
							
							Adding files for multiple implementations (cache opt) and Ls vectorisation
							of the 5D Cayley form chiral fermions for the 5d matrix. With Ls entirely
							in the vector direction, s-hopping terms involve rotations.
							The serial dependence of the LDU inversion for Mobius and 4d even odd
							checkerboarding is removed by simply applying Ls^2 operations (vectorised
							many ways) as a dense matrix operation.
							This should give similar throughput at a higher flop count (the extra flops
							are non-compulsory), while enabling use of the KNL cache friendly kernels
							throughout the code.
							Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
							with single precision. 
						
						
					 
					
						2016-07-14 22:59:21 +01:00 
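Why s-hopping turns into a rotation once Ls lives in the vector lanes can be seen in a small scalar model. This is only a sketch under stated assumptions, not Grid's implementation: `Lanes` is a plain array standing in for a SIMD register, `Nsimd` is fixed at 4 for illustration, the layout s = v + LLs * lane (with LLs = Ls / Nsimd) is assumed, and the hop wraps periodically with no domain wall boundary or mass term. The helper names `pack`, `unpack`, `rotate` and `hop_plus` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t Nsimd = 4;          // lanes per vector (illustrative value)
using Lanes = std::array<float, Nsimd>;   // stand-in for one SIMD register

// Cyclic lane rotation: out[l] = in[(l + r) % Nsimd].
Lanes rotate(const Lanes& in, std::size_t r) {
    Lanes out{};
    for (std::size_t l = 0; l < Nsimd; ++l) out[l] = in[(l + r) % Nsimd];
    return out;
}

// Pack a flat s-indexed array into LLs vectors, assuming s = v + LLs * lane.
std::vector<Lanes> pack(const std::vector<float>& flat) {
    assert(!flat.empty() && flat.size() % Nsimd == 0);  // Ls multiple of Nsimd
    const std::size_t LLs = flat.size() / Nsimd;
    std::vector<Lanes> out(LLs);
    for (std::size_t v = 0; v < LLs; ++v)
        for (std::size_t l = 0; l < Nsimd; ++l) out[v][l] = flat[v + LLs * l];
    return out;
}

// Inverse of pack.
std::vector<float> unpack(const std::vector<Lanes>& vecs) {
    const std::size_t LLs = vecs.size();
    std::vector<float> flat(LLs * Nsimd);
    for (std::size_t v = 0; v < LLs; ++v)
        for (std::size_t l = 0; l < Nsimd; ++l) flat[v + LLs * l] = vecs[v][l];
    return flat;
}

// chi(s) = psi(s + 1), periodic in s. Interior vectors are a plain copy;
// only the last inner index crosses a lane boundary and needs a rotation.
std::vector<Lanes> hop_plus(const std::vector<Lanes>& psi) {
    assert(!psi.empty());
    const std::size_t LLs = psi.size();
    std::vector<Lanes> chi(LLs);
    for (std::size_t v = 0; v + 1 < LLs; ++v) chi[v] = psi[v + 1];
    chi[LLs - 1] = rotate(psi[0], 1);
    return chi;
}
```

For Ls = 8 this gives LLs = 2: the hop is one whole-vector copy plus one rotation, with no per-element shuffling inside the loop over sites.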
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						62601bb649 
					 
					
						
						
							
							Bug fix  
						
						
						
						
					 
					
						2016-07-08 20:46:29 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ef97e32152 
					 
					
						
						
							
							Adding persistent communicators  
						
						
						
						
					 
					
						2016-07-08 17:16:08 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a0676beeb1 
					 
					
						
						
							
							Open up dependency on Eigen and FFTW  
						
						
						
						
					 
					
						2016-07-07 22:31:07 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						fc4a043663 
					 
					
						
						
							
							Colors and banner clean up  
						
						
						
						
					 
					
						2016-07-02 16:15:38 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						680645f849 
					 
					
						
						
							
							Merge branch 'release/v0.5.0'  
						
						
						
						
					 
					
						2016-06-30 15:15:03 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						712b9a3489 
					 
					
						
						
							
							Asm only for avx512  
						
						
						
						
					 
					
						2016-06-30 14:35:02 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						bdaa5b1767 
					 
					
						
						
							
							Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.  
						
						
						
						
					 
					
						2016-06-30 14:35:02 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8fcefc021a 
					 
					
						
						
							
							Improved the prefetching when using cache blocking codes  
						
						
						
						
					 
					
						2016-06-30 14:35:02 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1445189361 
					 
					
						
						
							
							Control the prefetch strategy  
						
						
						
						
					 
					
						2016-06-30 14:35:02 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						05c884a62a 
					 
					
						
						
							
							Prefetch change  
						
						
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a25bec87d9 
					 
					
						
						
							
							Prefetch during save  
						
						
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						2d8bb4c594 
					 
					
						
						
							
							Tweaks  
						
						
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						51cb2d4328 
					 
					
						
						
							
							update file lists  
						
						
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						6d58cb2a68 
					 
					
						
						
							
							Enable reordering of the loops in the assembler for cache friendliness.
							This gets in the way of L2 prefetching, however. Do next-next link in stencil
							prefetching. 
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						5e02392f9c 
					 
					
						
						
							
							Fixed compilation error for benchmark_dwf  
						
						
						
						
						Some parts were assuming floating point precision 
						
						
					 
					
						2016-06-20 12:30:51 +01:00 
						 
				 
			
				
					
						
							
							
								Richard Rollins 
							
						 
					 
					
						
						
							
						
						86187d7cca 
					 
					
						
						
							
							Removed write to stdout in constructor for MPI CartesianCommunicator  
						
						
						
						
					 
					
						2016-06-14 15:34:20 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						87418e7df1 
					 
					
						
						
							
							Slightly faster prefetching perf.  
						
						
						
						
					 
					
						2016-06-13 02:32:52 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						55f65b81b5 
					 
					
						
						
							
							Improvements to the assembler interface that let us move chunks of the
							site and s loop into the kernels. This will save on function call overhead and
							guarantee L2 prefetching strategy is right since OMP can't distribute the
							sub-chunks of work. 
						
						
					 
					
						2016-06-09 01:12:36 -07:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						d9408893b3 
					 
					
						
						
							
							Prefetching in the normal kernel implementation.  
						
						
						
						
					 
					
						2016-06-08 05:43:48 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8ac021de73 
					 
					
						
						
							
							Added a test and fixed it for red black precon Ls innermost vectorised DWF  
						
						
						
						
					 
					
						2016-06-07 13:16:56 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e503ef5590 
					 
					
						
						
							
							Cleaned up  
						
						
						
						
					 
					
						2016-06-07 00:11:36 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a7682b0060 
					 
					
						
						
							
							Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS  
						
						
						
						
					 
					
						2016-06-06 23:48:21 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						d4c9d71fc8 
					 
					
						
						
							
							Merge branch 'master' of  https://github.com/paboyle/Grid  
						
						
						
						
					 
					
						2016-06-06 07:06:54 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						786ca52c43 
					 
					
						
						
							
							Problems remain in the red black preconditioning of the Ls vectorisation  
						
						
						
						
					 
					
						2016-06-06 07:05:51 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						f78d89bcbe 
					 
					
						
						
							
							Update Lebesgue.cc  
						
						
						
						
						kill verbose 
						
						
					 
					
						2016-06-03 13:33:42 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						53d06046b0 
					 
					
						
						
							
							Compiling updates for KNL  
						
						
						
						
					 
					
						2016-06-03 03:47:54 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						139cc5f1ae 
					 
					
						
						
							
							Large change with KNL preparation  
						
						
						
						
					 
					
						2016-06-03 03:24:26 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						1c0e922585 
					 
					
						
						
							
							Merge pull request  #35  from aportelli/master  
						
						
						
						
						empty SIMD fix 
						
						
					 
					
						2016-05-27 16:49:13 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						9d5f693cbe 
					 
					
						
						
							
							empty SIMD fix  
						
						
						
						
					 
					
						2016-05-24 10:56:27 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5c90c3b457 
					 
					
						
						
							
							Merge pull request  #34  from aportelli/master  
						
						
						
						
						Polymorphic lattices & various small updates 
						
						
					 
					
						2016-05-24 10:50:04 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						91e04056f9 
					 
					
						
						
							
							fix of the empty SIMD  
						
						
						
						
					 
					
						2016-05-12 19:24:10 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						3789e3f31c 
					 
					
						
						
							
							additional fixes in slice functions  
						
						
						
						
					 
					
						2016-05-12 18:35:38 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						0c66719210 
					 
					
						
						
							
							const fix in slice functions  
						
						
						
						
					 
					
						2016-05-12 13:01:35 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3a5b5c8bec 
					 
					
						
						
							
							Save an old tar of tree  
						
						
						
						
					 
					
						2016-05-12 03:20:17 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						4bc21ec7cb 
					 
					
						
						
							
							thread CL argument fix  
						
						
						
						
					 
					
						2016-05-11 15:21:29 +01:00