paboyle 
							
						 
					 
					
						
						
							
						
						4e7ab3166f 
					 
					
						
						
							
							Refactoring header layout  
						
						
						
						
					 
					
						2017-02-22 18:09:33 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3ae92fa2e6 
					 
					
						
						
							
							Global changes to parallel_for structure.  
						
						
						
						
						Move the comms flags to more sensible names 
						
						
					 
					
						2017-02-21 05:24:27 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						41009cc142 
					 
					
						
						
							
							Move exchange into the stencil only; keep Cshift fully general  
						
						
						
						
					 
					
						2017-02-20 17:48:04 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8a29c16bde 
					 
					
						
						
							
							Faster gather exchange  
						
						
						
						
					 
					
						2017-02-16 23:52:22 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						bd600702cf 
					 
					
						
						
							
							Vectorise the XYZT face gathering better.  
						
						
						
						
						Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent
with efficiency. 
						
						
					 
					
						2017-02-15 11:11:04 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						85c7bc4321 
					 
					
						
						
							
							Bug fixes for cases that physics code couldn't hit but latent  
						
						
						
						
						and discovered on KNL (long vector, y SIMD dir) and checker dir set to y.
Remove the assertions on these code paths now they are tested. 
						
						
					 
					
						2017-02-07 01:01:15 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4f8e636a43 
					 
					
						
						
							
							commVector  
						
						
						
						
					 
					
						2016-10-20 16:59:16 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						9b39f35ae6 
					 
					
						
						
							
							commVector different for SHMEM compat  
						
						
						
						
					 
					
						2016-10-20 16:58:53 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						7240d73184 
					 
					
						
						
							
							Parallelise the x faces; fix the segv on KNL with comms  
						
						
						
						
					 
					
						2016-10-11 22:21:07 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						7223753355 
					 
					
						
						
							
							Rotate in a direction > 2 for simd_layout  
						
						
						
						
					 
					
						2016-04-19 15:35:15 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						db5e8050a8 
					 
					
						
						
							
							Attempts at some optimisation  
						
						
						
						
					 
					
						2016-02-18 22:33:58 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c9fadf97a5 
					 
					
						
						
							
							Simplify the compressor interface again.  
						
						
						
						
					 
					
						2016-02-17 18:16:45 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c650bb3f3d 
					 
					
						
						
							
							Very small merge speed up.  
						
						
						
						
					 
					
						2016-02-16 18:41:53 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						41c2b09184 
					 
					
						
						
							
							Shmem comms [NO MPI] target added. The dwf test runs and passes.  
						
						
						
						
						Not really shaken out to my satisfaction though as I want more tests done, so don't declare as working.
But committing my current state while I try a few experiments. 
						
						
					 
					
						2016-02-14 14:24:38 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						d19321dfde 
					 
					
						
						
							
							Overlap comms compute changes  
						
						
						
						
					 
					
						2016-01-10 19:20:16 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						aae8bf31a7 
					 
					
						
						
							
							Global edit adding copyright and license info to every source file.  
						
						
						
						
					 
					
						2016-01-02 14:51:32 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						145a295231 
					 
					
						
						
							
							Bug fix for stencil with large shifts (3+), would be important to naik term for example but did not  
						
						
						
						
						impact Wilson based nearest neighbour stencils. 
						
						
					 
					
						2015-12-30 19:29:48 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						473fa28a6c 
					 
					
						
						
							
							Partial optimisation; comms in x-dir for red black dslash will be slow as the checker skipping block strided  
						
						
						
						
						loops are non threadable. Will need to write a kernel for these instead and drive them with a lookup table
to make a loop sufficiently simple to thread. 
						
						
					 
					
						2015-11-06 05:23:23 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						12c5ec813c 
					 
					
						
						
							
							Useful debug messages (commented out) are included for preservation in case I need to revisit this  
						
						
						
						
					 
					
						2015-11-04 09:59:27 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1271508ca2 
					 
					
						
						
							
							Bug fix for spread out in x (EO) direction.  
						
						
						
						
						This is really annoying -- it is very hard to thread the loops with the index
recursion on buffer offset in the red-black case. Must think of a good threading
solution here. 
						
						
					 
					
						2015-11-04 09:57:57 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0a9ebac514 
					 
					
						
						
							
							Gparity modifications in the Gparity compressor variant.  
						
						
						
						
					 
					
						2015-08-11 06:22:20 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1d0df449e8 
					 
					
						
						
							
							Reorganisation of file naming  
						
						
						
						
					 
					
						2015-06-03 12:47:05 +01:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						b00a40dd65 
					 
					
						
						
							
							Const safety  
						
						
						
						
					 
					
						2015-06-01 12:25:59 +01:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						12c2562b96 
					 
					
						
						
							
							No compile fix on mpi target  
						
						
						
						
					 
					
						2015-05-31 22:50:03 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5644ab1e19 
					 
					
						
						
							
							Large scale change to support 5d fermion formulations.  
						
						
						
						
						Have 5d replicated wilson with 4d gauge working and matrix regressing
to Ls copies of wilson. 
						
						
					 
					
						2015-05-31 15:09:02 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						67fa5691e5 
					 
					
						
						
							
							Weak scale the benchmarks automatically.  
						
						
						
						
					 
					
						2015-05-28 13:47:01 +01:00 
						 
				 
			
				
					
						
							
							
								neo 
							
						 
					 
					
						
						
							
						
						da46b56e85 
					 
					
						
						
							
							Adding support for doxygen generation  
						
						
						
						
					 
					
						2015-05-27 10:34:56 +09:00 
						 
				 
			
				
					
						
							
							
								neo 
							
						 
					 
					
						
						
							
						
						1a24801246 
					 
					
						
						
							
							checked performance of new vector libraries.  
						
						
						
						
						Added check for c++11 support on the configure.ac 
						
						
					 
					
						2015-05-26 12:02:54 +09:00 
						 
				 
			
				
					
						
							
							
								neo 
							
						 
					 
					
						
						
							
						
						9e29ac6549 
					 
					
						
						
							
							Completed implementation of new Grid_simd classes  
						
						
						
						
						Tested performance for SSE4, Ok.
AVX1/2, AVX512 yet untested 
						
						
					 
					
						2015-05-22 17:33:15 +09:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b00622302b 
					 
					
						
						
							
							gcc doesn't like collapse(2) for some reason I can't figure  
						
						
						
						
					 
					
						2015-05-15 11:36:22 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						48f425d31c 
					 
					
						
						
							
							I have made the Cshift work successfully with open mp threading in  
						
						
						
						
						every routine. Collapse(2) is now working under clang-omp++. 
						
						
					 
					
						2015-05-13 00:31:00 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6103c29ee3 
					 
					
						
						
							
							Threading support rework.  
						
						
						
						
						Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM. 
						
						
					 
					
						2015-05-12 07:51:41 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5555a852be 
					 
					
						
						
							
							Lots of changes required to compile for MIC under ICPC  
						
						
						
						
					 
					
						2015-05-10 23:29:21 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						25d523c0f4 
					 
					
						
						
							
							Shaken out stencil to the point where I think wilson dslash is correct.  
						
						
						
						
						Need to audit code carefully, consolidate between stencil and cshift,
and then benchmark and optimise. 
						
						
					 
					
						2015-04-28 08:11:59 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						f159495a9d 
					 
					
						
						
							
							Reworking CSHIFT and Stencil. Implementing Wilson and discovered rework is required  
						
						
						
						
					 
					
						2015-04-27 13:45:07 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b32c14b433 
					 
					
						
						
							
							Got the NERSC IO working and fixed a bug in cshift.  
						
						
						
						
					 
					
						2015-04-22 22:46:48 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						e5a25dfcb1 
					 
					
						
						
							
							Build reorg with which I am a bit happier  
						
						
						
						
					 
					
						2015-04-18 21:22:50 +01:00