Peter Boyle 
							
						 
					 
					
						
						
							
						
						99220f6531 
					 
					
						
						
							
							Fixes and better timing  
						
						
						
						
					 
					
						2017-04-26 17:24:11 -04:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						d2003f24f4 
					 
					
						
						
							
							Corrected incorrect usage of ExtractSlice for conserved current code.  
						
						
						
						
					 
					
						2017-04-26 17:25:28 +01:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						6299dd35f5 
					 
					
						
						
							
							Hadrons: Added test of conserved current code. Tests Ward identities for conserved vector and partially conserved axial currents.  
						
						
						
						
					 
					
						2017-04-26 12:41:39 +01:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						a39daecb62 
					 
					
						
						
							
							Removed make_5D const declaration to avoid compilation error  
						
						
						
						
					 
					
						2017-04-26 12:39:07 +01:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						159770e21b 
					 
					
						
						
							
							Legal Banners added  
						
						
						
						
					 
					
						2017-04-26 09:32:57 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						2a6d093749 
					 
					
						
						
							
							Move the sudo: required to match location on Guido's branch  
						
						
						
						
					 
					
						2017-04-26 09:15:34 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c947947fad 
					 
					
						
						
							
							sudo required, as suggested by Guido  
						
						
						
						
					 
					
						2017-04-26 08:45:36 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f555b50547 
					 
					
						
						
							
							Merge branch 'feature/half-prec-comms' into develop  
						
						
						
						
					 
					
						2017-04-26 08:43:40 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						738c1a11c2 
					 
					
						
						
							
							longer nloop  
						
						
						
						
					 
					
						2017-04-26 08:43:20 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						f8797e1e3e 
					 
					
						
						
							
							bug fix. works now and great face performance  
						
						
						
						
					 
					
						2017-04-26 03:14:02 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						fd1eb7de13 
					 
					
						
						
							
							Clean implementation of the exterior faces listing only those points on the boundary  
						
						
						
						
					 
					
						2017-04-26 02:34:52 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2ce898efa3 
					 
					
						
						
							
							Pretty code  
						
						
						
						
					 
					
						2017-04-26 02:34:25 -04:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						dc5a6404ea 
					 
					
						
						
							
							Hadrons: modules for testing conserved current contractions and sequential insertion.  
						
						
						
						
					 
					
						2017-04-25 22:08:33 +01:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						44260643f6 
					 
					
						
						
							
							First conserved current implementation for Wilson fermions only. Not implemented for Gparity or 5D-vectorised Wilson fermions.  
						
						
						
						
					 
					
						2017-04-25 18:00:24 +01:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						1425afc72f 
					 
					
						
						
							
							Rare Kaon test fix  
						
						
						
						
					 
					
						2017-04-25 17:26:56 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ab66bac4e6 
					 
					
						
						
							
							Think I'm getting on top of the reduced cost exterior precomputed list of links  
						
						
						
						
					 
					
						2017-04-25 08:50:26 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						56277a11c8 
					 
					
						
						
							
							Build a list of what's on the surface  
						
						
						
						
					 
					
						2017-04-24 17:06:15 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						752048f410 
					 
					
						
						
							
							Merge branch 'develop' into feature/clover  
						
						
						
						
					 
					
						2017-04-24 14:41:20 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						916e9e1d3e 
					 
					
						
						
							
							Merge branch 'feature/half-prec-comms' of https://github.com/paboyle/Grid into feature/half-prec-comms  
						
						
						
						
					 
					
						2017-04-24 10:39:19 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5b55867a7a 
					 
					
						
						
							
							Slightly cheaper Ext assembly  
						
						
						
						
					 
					
						2017-04-24 05:36:11 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						3accb1ef89 
					 
					
						
						
							
							Debugged assembly split phase with interior suppression  
						
						
						
						
					 
					
						2017-04-23 19:30:19 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						e3d0e31525 
					 
					
						
						
							
							Debugged assembly split phase with interior suppression  
						
						
						
						
					 
					
						2017-04-23 19:29:27 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5812eb8a8c 
					 
					
						
						
							
							Partially fixed. But the comms-overlap does not work yet.  
						
						
						
						
					 
					
						2017-04-22 18:50:25 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4dd3763294 
					 
					
						
						
							
							Use OMP as much as possible  
						
						
						
						
					 
					
						2017-04-22 20:35:20 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c429ace748 
					 
					
						
						
							
							Cleaner OpenMP use  
						
						
						
						
					 
					
						2017-04-22 20:28:42 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ac58565d0a 
					 
					
						
						
							
							Dangerous rewrite of the assembly. If I make a mistake the debug will be painful.  
						
						
						
						
					 
					
						2017-04-22 19:31:04 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3703b718aa 
					 
					
						
						
							
							Mark up a table if a given site only receives from itself; including MPI3 splitting info.  
						
						
						
						
					 
					
						2017-04-22 19:28:37 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b722889234 
					 
					
						
						
							
							Try a better load balancing loop  
						
						
						
						
					 
					
						2017-04-22 19:27:41 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						abba44a837 
					 
					
						
						
							
							Hand unrolled for overlapped comms  
						
						
						
						
					 
					
						2017-04-22 17:45:17 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f301be94ce 
					 
					
						
						
							
							Fixed  
						
						
						
						
					 
					
						2017-04-22 17:42:31 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1d1b225497 
					 
					
						
						
							
							Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node).  
						
						
						
						
					 
					
						2017-04-22 09:05:28 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						53a785a3dd 
					 
					
						
						
							
							Fixing the KNL compile  
						
						
						
						
					 
					
						2017-04-22 08:11:51 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						736bf3c866 
					 
					
						
						
							
							Major rework of stencil. Half precision and MPI3 now working.  
						
						
						
						
					 
					
						2017-04-22 11:33:50 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b9bbe5d188 
					 
					
						
						
							
							L1p config bg/q  
						
						
						
						
					 
					
						2017-04-22 11:33:09 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3844bcf800 
					 
					
						
						
							
							If no F16C instructions are supported, software half-precision conversion must be used.  
						
						
						
						
						This will also become useful on BG/Q, so it will move out of SSE4 into a general area.
Lifted the Eigen half-precision code from the web. It looks sensible, but has not yet been
extensively regression-tested against the intrinsics implementation. 
						
						
					 
					
						2017-04-20 15:30:52 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e1a2319d01 
					 
					
						
						
							
							Simple compressor moved out of cshift into stencil  
						
						
						
						
					 
					
						2017-04-20 13:18:15 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						180c732b4c 
					 
					
						
						
							
							Move compressors out of Cshift.  
						
						
						
						
						Slice iterators would help 
						
						
					 
					
						2017-04-20 13:17:55 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						957a706d0b 
					 
					
						
						
							
							Useful script  
						
						
						
						
					 
					
						2017-04-20 13:17:44 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						d2312e9874 
					 
					
						
						
							
							Drop compressor entirely from Cshift to only Stencil.  
						
						
						
						
					 
					
						2017-04-20 13:16:55 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						fc4ab9ccd5 
					 
					
						
						
							
							Working half precision comms  
						
						
						
						
					 
					
						2017-04-20 11:20:26 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4a340aa5ca 
					 
					
						
						
							
							Massive compressor rework to support reduced precision comms  
						
						
						
						
					 
					
						2017-04-20 09:28:27 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3b7de792d5 
					 
					
						
						
							
							Type comparison in the traits work  
						
						
						
						
					 
					
						2017-04-18 13:28:04 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						557c3fa109 
					 
					
						
						
							
							Pretty change  
						
						
						
						
					 
					
						2017-04-18 13:27:38 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ec18e9f7f6 
					 
					
						
						
							
							Merge branch 'develop' into feature/half-prec-comms  
						
						
						
						
					 
					
						2017-04-18 11:39:39 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a839d5bc55 
					 
					
						
						
							
							Updated todo list  
						
						
						
						
					 
					
						2017-04-18 11:22:17 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						de41b84c5c 
					 
					
						
						
							
							Merge branch 'feature/normHP' into develop  
						
						
						
						
					 
					
						2017-04-18 10:57:21 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8e161152e4 
					 
					
						
						
							
							MultiRHS solver improvements with slice operations moved into lattice and sped up.  
						
						
						
						
						Block solver requires a lot of performance work. 
						
						
					 
					
						2017-04-18 10:51:55 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3141ebac10 
					 
					
						
						
							
							MultiRHS working, starting to optimise. Block doesn't work, though I thought it already did; puzzled.  
						
						
						
						
					 
					
						2017-04-17 10:50:19 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						7ede696126 
					 
					
						
						
							
							Fixed tests that failed to compile  
						
						
						
						
					 
					
						2017-04-16 23:40:00 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						bf516c3b81 
					 
					
						
						
							
							higher precision reduction variables in norm and inner product  
						
						
						
						
					 
					
						2017-04-15 12:27:28 +01:00