paboyle 
							
						 
					 
					
						
						
							
						
						b722889234 
					 
					
						
						
							
							Try a better load balancing loop  
						
						
						
						
					 
					
						2017-04-22 19:27:41 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						abba44a837 
					 
					
						
						
							
							Hand unrolled for overlapped comms  
						
						
						
						
					 
					
						2017-04-22 17:45:17 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f301be94ce 
					 
					
						
						
							
							Fixed  
						
						
						
						
					 
					
						2017-04-22 17:42:31 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1d1b225497 
					 
					
						
						
							
							Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node).  
						
						
						
						
					 
					
						2017-04-22 09:05:28 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						53a785a3dd 
					 
					
						
						
							
							Fixing the KNL compile  
						
						
						
						
					 
					
						2017-04-22 08:11:51 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						736bf3c866 
					 
					
						
						
							
							Major rework of stencil. Half precision and MPI3 now working.  
						
						
						
						
					 
					
						2017-04-22 11:33:50 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b9bbe5d188 
					 
					
						
						
							
							L1p config bg/q  
						
						
						
						
					 
					
						2017-04-22 11:33:09 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3844bcf800 
					 
					
						
						
							
							If no F16C instructions are supported, software half precision conversion must be used.  
						
						
						
						
						This will also become useful on BG/Q, so will move out from SSE4 into a general area.
						Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed
						against the intrinsics implementation yet. 
						
						
					 
					
						2017-04-20 15:30:52 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e1a2319d01 
					 
					
						
						
							
							Simple compressor moved out of cshift into stencil  
						
						
						
						
					 
					
						2017-04-20 13:18:15 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						180c732b4c 
					 
					
						
						
							
							Move compressors out of Cshift.  
						
						
						
						
						Slice iterators would help 
						
						
					 
					
						2017-04-20 13:17:55 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						957a706d0b 
					 
					
						
						
							
							Useful script  
						
						
						
						
					 
					
						2017-04-20 13:17:44 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						d2312e9874 
					 
					
						
						
							
							Drop compressor entirely from Cshift to only Stencil.  
						
						
						
						
					 
					
						2017-04-20 13:16:55 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						fc4ab9ccd5 
					 
					
						
						
							
							Working half precision comms  
						
						
						
						
					 
					
						2017-04-20 11:20:26 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4a340aa5ca 
					 
					
						
						
							
							Massive compressor rework to support reduced precision comms  
						
						
						
						
					 
					
						2017-04-20 09:28:27 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3b7de792d5 
					 
					
						
						
							
							Type comparison in the traits work  
						
						
						
						
					 
					
						2017-04-18 13:28:04 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						557c3fa109 
					 
					
						
						
							
							Pretty change  
						
						
						
						
					 
					
						2017-04-18 13:27:38 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ec18e9f7f6 
					 
					
						
						
							
							Merge branch 'develop' into feature/half-prec-comms  
						
						
						
						
					 
					
						2017-04-18 11:39:39 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a839d5bc55 
					 
					
						
						
							
							Updated todo list  
						
						
						
						
					 
					
						2017-04-18 11:22:17 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						de41b84c5c 
					 
					
						
						
							
							Merge branch 'feature/normHP' into develop  
						
						
						
						
					 
					
						2017-04-18 10:57:21 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8e161152e4 
					 
					
						
						
							
							MultiRHS solver improvements with slice operations moved into lattice and sped up.  
						
						
						
						
						Block solver requires a lot of performance work. 
						
						
					 
					
						2017-04-18 10:51:55 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3141ebac10 
					 
					
						
						
							
							MultiRHS working, starting to optimise. Block doesn't work, though I thought it already did; puzzled.  
						
						
						
						
					 
					
						2017-04-17 10:50:19 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						7ede696126 
					 
					
						
						
							
							Non compile of tests fixed  
						
						
						
						
					 
					
						2017-04-16 23:40:00 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						bf516c3b81 
					 
					
						
						
							
							higher precision reduction variables in norm and inner product  
						
						
						
						
					 
					
						2017-04-15 12:27:28 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						441a52ee5d 
					 
					
						
						
							
							First cut at higher precision reduction  
						
						
						
						
					 
					
						2017-04-15 10:57:21 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a8db024c92 
					 
					
						
						
							
							Cleaning up the dense matrix and lanczos sector  
						
						
						
						
					 
					
						2017-04-15 08:54:11 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a9c22d5f43 
					 
					
						
						
							
							Verbose removal  
						
						
						
						
					 
					
						2017-04-14 14:38:49 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3ca41458a3 
					 
					
						
						
							
							Fix to no USE_FP16 case  
						
						
						
						
					 
					
						2017-04-14 14:20:54 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						9e2d29c644 
					 
					
						
						
							
							USE_FP16 macro  
						
						
						
						
					 
					
						2017-04-14 14:17:14 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						951be75292 
					 
					
						
						
							
							Half precision conversion working on AVX512 now too  
						
						
						
						
					 
					
						2017-04-13 17:35:11 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b9113ed310 
					 
					
						
						
							
							Patches for knl  
						
						
						
						
					 
					
						2017-04-13 12:02:12 -04:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						1407418755 
					 
					
						
						
							
							Old qed-fvol program build disabled  
						
						
						
						
					 
					
						2017-04-13 15:32:30 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						a6a0da873f 
					 
					
						
						
							
							Merge branch 'feature/hadrons' into feature/qed-fvol  
						
						
						
						
					 
					
						2017-04-13 15:31:06 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						42fb49d3fd 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/paboyle/Grid into develop  
						
						
						
						
					 
					
						2017-04-13 14:12:47 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						2a54c9aaab 
					 
					
						
						
							
							Merge branch 'feature/block-cg' into develop  
						
						
						
						
					 
					
						2017-04-13 14:12:24 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						0957378679 
					 
					
						
						
							
							Fixing conditional ugly way  
						
						
						
						
					 
					
						2017-04-13 13:47:56 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						2ed6c76fc5 
					 
					
						
						
							
							Getting multiline if then fi working  
						
						
						
						
					 
					
						2017-04-13 13:43:13 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						d3b9a7fa14 
					 
					
						
						
							
							F16C apparently requires AVX, even if only the 128 bit forms are used.  
						
						
						
						
						Seems odd. 
						
						
					 
					
						2017-04-13 13:19:11 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						75ea306ce9 
					 
					
						
						
							
							Another try at travis  
						
						
						
						
					 
					
						2017-04-13 13:05:32 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4226c633c4 
					 
					
						
						
							
							Default to FP16 off again  
						
						
						
						
					 
					
						2017-04-13 12:51:39 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						5a4eafbf7e 
					 
					
						
						
							
							.travis  
						
						
						
						
					 
					
						2017-04-13 12:50:43 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						eb8e26018b 
					 
					
						
						
							
							Travis update for macos  
						
						
						
						
					 
					
						2017-04-13 12:35:11 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						db5ea001a3 
					 
					
						
						
							
							Update to use Xcode 8.3 since -mfp16 causes SIGILL  
						
						
						
						
					 
					
						2017-04-13 12:22:40 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						2846f079e5 
					 
					
						
						
							
							Predicate tests on fp16 being enabled  
						
						
						
						
					 
					
						2017-04-13 12:08:05 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1d502e4ed6 
					 
					
						
						
							
							FP16 optional compile time  
						
						
						
						
					 
					
						2017-04-13 11:55:24 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						73cdf0fffe 
					 
					
						
						
							
							Drop f16c from SSE because of a macos compile error on travis  
						
						
						
						
					 
					
						2017-04-13 11:23:41 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1c25773319 
					 
					
						
						
							
							Trap illegal instructions  
						
						
						
						
					 
					
						2017-04-13 10:51:40 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c38400b26f 
					 
					
						
						
							
							Trap signals  
						
						
						
						
					 
					
						2017-04-13 10:35:20 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						9c3065b860 
					 
					
						
						
							
							Debug flags off again  
						
						
						
						
					 
					
						2017-04-13 10:01:32 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						94eb829d08 
					 
					
						
						
							
							Align cast fixed for __m128i; gcc complained  
						
						
						
						
					 
					
						2017-04-13 08:40:44 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						68392ddb5b 
					 
					
						
						
							
							Exchange in generic  
						
						
						
						
						Precision change in AVX, SSE, AVX512, Generic. QPX still to do. 
						
						
					 
					
						2017-04-13 08:38:12 +01:00