paboyle 
							
						 
					 
					
						
						
							
						
						56277a11c8 
					 
					
						
						
							
							Build a list of what's on the surface  
						
						
						
						
					 
					
						2017-04-24 17:06:15 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						736bf3c866 
					 
					
						
						
							
							Major rework of stencil. Half precision and MPI3 now working.  
						
						
						
						
					 
					
						2017-04-22 11:33:50 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b9bbe5d188 
					 
					
						
						
							
							L1p config bg/q  
						
						
						
						
					 
					
						2017-04-22 11:33:09 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3844bcf800 
					 
					
						
						
							
							If F16C instructions are not supported, software half-precision conversion must be used.  
						
						
						
						
						This will also become useful on BG/Q, so it will move out of SSE4 into a general area.
Lifted the Eigen half-precision code from the web. It looks sensible, but has not yet been
extensively regression-tested against the intrinsics implementation. 
						
						
					 
					
						2017-04-20 15:30:52 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4a340aa5ca 
					 
					
						
						
							
							Massive compressor rework to support reduced precision comms  
						
						
						
						
					 
					
						2017-04-20 09:28:27 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3b7de792d5 
					 
					
						
						
							
							Type comparison in the traits work  
						
						
						
						
					 
					
						2017-04-18 13:28:04 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8e161152e4 
					 
					
						
						
							
							MultiRHS solver improvements with slice operations moved into lattice and sped up.  
						
						
						
						
						Block solver requires a lot of performance work. 
						
						
					 
					
						2017-04-18 10:51:55 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						7ede696126 
					 
					
						
						
							
							Fixed tests that failed to compile  
						
						
						
						
					 
					
						2017-04-16 23:40:00 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						441a52ee5d 
					 
					
						
						
							
							First cut at higher precision reduction  
						
						
						
						
					 
					
						2017-04-15 10:57:21 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3ca41458a3 
					 
					
						
						
							
							Fix to no USE_FP16 case  
						
						
						
						
					 
					
						2017-04-14 14:20:54 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						951be75292 
					 
					
						
						
							
							Half precision conversion working on AVX512 now too  
						
						
						
						
					 
					
						2017-04-13 17:35:11 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b9113ed310 
					 
					
						
						
							
							Patches for knl  
						
						
						
						
					 
					
						2017-04-13 12:02:12 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						db5ea001a3 
					 
					
						
						
							
							Update to use Xcode 8.3 since -mfp16 causes SIGILL  
						
						
						
						
					 
					
						2017-04-13 12:22:40 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1d502e4ed6 
					 
					
						
						
							
							FP16 optional compile time  
						
						
						
						
					 
					
						2017-04-13 11:55:24 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						73cdf0fffe 
					 
					
						
						
							
							Drop F16C from SSE because of a macOS compile error on Travis  
						
						
						
						
					 
					
						2017-04-13 11:23:41 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						94eb829d08 
					 
					
						
						
							
							Fixed the alignment cast for __m128i that gcc complained about  
						
						
						
						
					 
					
						2017-04-13 08:40:44 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						68392ddb5b 
					 
					
						
						
							
							Exchange in generic  
						
						
						
						
						Precision change in AVX, SSE, AVX512, Generic. QPX still to do. 
						
						
					 
					
						2017-04-13 08:38:12 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						cb6b81ae82 
					 
					
						
						
							
							Half precision conversion  
						
						
						
						
					 
					
						2017-04-12 19:32:37 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4ed10a3d06 
					 
					
						
						
							
							Merge branch 'develop' into feature/bgq-asm  
						
						
						
						
					 
					
						2017-03-13 11:10:10 +00:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						7fe797daf8 
					 
					
						
						
							
							SIMD vector length sanity checks  
						
						
						
						
					 
					
						2017-02-23 16:49:44 +00:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						486a01294a 
					 
					
						
						
							
							Corrected QPX SIMD width  
						
						
						
						
					 
					
						2017-02-23 16:47:56 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4e7ab3166f 
					 
					
						
						
							
							Refactoring header layout  
						
						
						
						
					 
					
						2017-02-22 18:09:33 +00:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						c80948411b 
					 
					
						
						
							
							Added tRotate function and MaddRealPart struct for generic SIMD, bugfix in MultRealPart and minor cosmetic changes.  
						
						
						
						
					 
					
						2017-02-22 14:57:10 +00:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						95625a7bd1 
					 
					
						
						
							
							Use Grid Integer type  
						
						
						
						
					 
					
						2017-02-22 13:09:32 +00:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						0796696733 
					 
					
						
						
							
							Emulated integer vector type for QPX and generic SIMD instruction sets.  
						
						
						
						
					 
					
						2017-02-22 12:01:36 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						661fc4d3d1 
					 
					
						
						
							
							Debug AVX512 exchange code paths  
						
						
						
						
					 
					
						2017-02-20 17:48:36 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f246fe3304 
					 
					
						
						
							
							Improvements to avx for invertible to avoid latent bug  
						
						
						
						
					 
					
						2017-02-16 23:52:44 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						bd600702cf 
					 
					
						
						
							
							Vectorise the XYZT face gathering better.  
						
						
						
						
						Hard-coded for simd_layout <= 2 in any given spread-out direction; full generality is inconsistent with efficiency. 
						
						
					 
					
						2017-02-15 11:11:04 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						0883d6a7ce 
					 
					
						
						
							
							Overlap comms compute support; make reg naming consistent with bgq asm  
						
						
						
						
					 
					
						2017-02-07 00:59:32 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4bbdfb434c 
					 
					
						
						
							
							Overlap comms compute modifications  
						
						
						
						
					 
					
						2017-02-07 00:57:01 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c3b6d573b9 
					 
					
						
						
							
							Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm  
						
						
						
						
					 
					
						2016-12-30 22:42:17 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1e179c903d 
					 
					
						
						
							
							Worried about integer; suspect where statements are broken  
						
						
						
						
					 
					
						2016-12-27 17:46:38 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1caa3fbc2d 
					 
					
						
						
							
							LOCK UNLOCK only  
						
						
						
						
					 
					
						2016-12-27 11:24:45 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						eabf316ed9 
					 
					
						
						
							
							BGQ performance ASM  
						
						
						
						
					 
					
						2016-12-22 21:56:08 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7dc36628a1 
					 
					
						
						
							
							QPX finishing  
						
						
						
						
					 
					
						2016-12-22 17:50:48 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3f2d53a994 
					 
					
						
						
							
							BGQ assembler beginning  
						
						
						
						
					 
					
						2016-12-20 10:21:26 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4b220972ac 
					 
					
						
						
							
							Warning fix  
						
						
						
						
					 
					
						2016-12-18 02:14:17 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						629f43e36c 
					 
					
						
						
							
							Return statement needed  
						
						
						
						
					 
					
						2016-12-18 02:09:37 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a3172b3455 
					 
					
						
						
							
							Precision error  
						
						
						
						
					 
					
						2016-12-18 02:07:45 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f17436fec2 
					 
					
						
						
							
							Bad commit fixed  
						
						
						
						
					 
					
						2016-12-18 01:27:34 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						69ae817d1c 
					 
					
						
						
							
							Updates for supporting Mobius better  
						
						
						
						
					 
					
						2016-12-08 16:43:28 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						e27c6b217c 
					 
					
						
						
							
							Updating  
						
						
						
						
					 
					
						2016-12-01 12:42:53 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						6adf35da54 
					 
					
						
						
							
							Faster Mobius  
						
						
						
						
					 
					
						2016-12-01 11:39:04 +00:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						b18950f776 
					 
					
						
						
							
							Added simd real divide test with QPX divide fixes  
						
						
						
						
					 
					
						2016-11-25 13:21:33 +00:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						0acbf77bc6 
					 
					
						
						
							
							Add QPX Div structure  
						
						
						
						
					 
					
						2016-11-24 13:24:12 +00:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						a2cffb0304 
					 
					
						
						
							
							AVXFMA target fixed  
						
						
						
						
					 
					
						2016-11-21 17:47:18 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						97cddda49e 
					 
					
						
						
							
							Merge branch 'feature/gen-simd' into feature/doxygen  
						
						
						
						
						# Conflicts:
#	Makefile.am
#	configure.ac 
						
						
					 
					
						2016-11-19 13:11:13 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						b873504b90 
					 
					
						
						
							
							fully generic SIMD  
						
						
						
						
					 
					
						2016-11-19 01:32:39 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						042ae5b87c 
					 
					
						
						
							
							generic 256-bit SIMD  
						
						
						
						
					 
					
						2016-11-15 12:16:15 +00:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						f7b60004f3 
					 
					
						
						
							
							Merge branch 'develop' into release/v0.6.0  
						
						
						
						
					 
					
						2016-11-04 16:08:07 +00:00