Chulwoo Jung 
							
						 
					 
					
						
						
							
						
						bc1f5be265 
					 
					
						
						
							
							Merge branch 'dev-IRBL-ypj' of  https://github.com/yongchull/Grid  into merge  
						
						
						
						
					 
					
						2018-03-08 18:02:06 -05:00 
						 
				 
			
				
					
						
							
							
								Chulwoo Jung 
							
						 
					 
					
						
						
							
						
						0b63e2e9cd 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/paboyle/Grid  into merge  
						
						
						
						
					 
					
						2018-03-07 15:24:11 -05:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						655a69259a 
					 
					
						
						
							
							Added support for GCC compilation for Skylake AVX512  
						
						
						
						
					 
					
						2018-01-28 17:02:46 +01:00 
						 
				 
			
				
					
						
							
							
								Yong-Chull Jang 
							
						 
					 
					
						
						
							
						
						53a9260a94 
					 
					
						
						
							
							Patch to compile with AVX512 for Skylake Xeon processors using GCC 7.2.0. Besides bug fixes in the source code, an option 'SKL' is added to configure.ac for Skylake-specific AVX512 instruction flags when using GCC. Code can be compiled with --enable-simd=SKL using GCC 7.2.0, but Test_simd fails. AVX512 support for the complex double type with non-Intel compilers causes this error.  
						
						
						
						
					 
					
						2018-01-27 10:00:38 -05:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						507c4e9efc 
					 
					
						
						
							
							Correcting a missing semicolon in AVX512  
						
						
						
						
					 
					
						2018-01-27 10:59:55 +01:00 
						 
				 
			
				
					
						
							
							
								Yong-Chull Jang 
							
						 
					 
					
						
						
							
						
						3cb8cb7282 
					 
					
						
						
							
							'typename' is added to compile with AVX512 using GCC 7.2.0; a semicolon was missing in Grid_avx512.h and that bug is fixed. Option SKL is added to the configure script for Skylake-specific AVX512 operations. Code can be compiled with --enable-simd=SKL using GCC 7.2.0, but Test_simd fails. AVX512 support for the complex double type with non-Intel compilers causes this error; it needs a review.  
						
						
						
						
					 
					
						2017-12-23 14:54:07 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						bfb68e6f02 
					 
					
						
						
							
							Merge pull request  #130  from giltirn/gparity-handunroll  
						
						
						
						
						Gparity handunroll 
						
						
					 
					
						2017-09-21 10:11:00 +01:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						4e907fef2c 
					 
					
						
						
							
							Merge remote-tracking branch 'grid/develop' into feature/arm-neon  
						
						
						
						
					 
					
						2017-08-29 17:47:36 +02:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						f365a83fae 
					 
					
						
						
							
							In the G-parity unrolled kernel, replaced calls to permute and exchange that use a run-time-evaluated permute type with explicit calls to the appropriate underlying functions  
						
						
						
						
					 
					
						2017-08-25 14:24:11 -04:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						7a53dc3715 
					 
					
						
						
							
							Added integer reduce functionality  
						
						
						
						
					 
					
						2017-07-24 11:12:59 +02:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						8859a151cc 
					 
					
						
						
							
							Small corrections to the NEON port  
						
						
						
						
					 
					
						2017-06-29 11:30:29 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						688a39cfd9 
					 
					
						
						
							
							Merge pull request  #114  from nmeyer-ur/feature/arm-neon  
						
						
						
						
						ARM neon intrinsics support
Guido: checked and approved 
						
						
					 
					
						2017-06-29 09:57:17 +01:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						0933aeefd4 
					 
					
						
						
							
							corrected Grid_neon.h  
						
						
						
						
					 
					
						2017-06-28 20:22:22 +02:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						a9c816a268 
					 
					
						
						
							
							moved file to correct folder  
						
						
						
						
					 
					
						2017-06-27 21:39:15 +02:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						bf729766dd 
					 
					
						
						
							
							removed collision with QPX implementation  
						
						
						
						
					 
					
						2017-06-27 20:32:24 +02:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						56abbdf4c2 
					 
					
						
						
							
							AVX512 integer reduce fix (for non-Intel compilers)  
						
						
						
						
					 
					
						2017-06-23 11:09:14 +02:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						af71c63f4c 
					 
					
						
						
							
							AVX2 fix  
						
						
						
						
					 
					
						2017-06-23 11:03:12 +02:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						0440d4ce66 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/paboyle/Grid  into hotfix/bgq  
						
						
						
						
					 
					
						2017-06-22 17:09:42 +02:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						abc4de0fd2 
					 
					
						
						
							
							Fix for make tests failing to compile  
						
						
						
						
					 
					
						2017-06-19 22:03:03 +01:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						a833f88c32 
					 
					
						
						
							
							Added missing SIMD integer reduction implementation for AVX, AVX-512, SSE4, IMCI  
						
						
						
						
					 
					
						2017-06-16 15:58:47 +01:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						07b2c1b253 
					 
					
						
						
							
							Placeholder precision change functions to allow Grid to compile with QPX (warning: no actual functionality)  
						
						
						
						
					 
					
						2017-06-16 15:04:26 +01:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						735cbdb983 
					 
					
						
						
							
							QPX Integer reduction (+ integer reduction test)  
						
						
						
						
					 
					
						2017-06-14 10:55:10 +01:00 
						 
				 
			
				
					
						
							
							
								Lanny91 
							
						 
					 
					
						
						
							
						
						2ad54c5a02 
					 
					
						
						
							
							QPX exchange support  
						
						
						
						
					 
					
						2017-06-14 10:53:39 +01:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						3d04dc33c6 
					 
					
						
						
							
							ARM neon intrinsics support  
						
						
						
						
					 
					
						2017-06-13 13:26:59 +02:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						62cf9cf638 
					 
					
						
						
							
							Cleaner code  
						
						
						
						
					 
					
						2017-05-30 23:38:02 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						15e801af3f 
					 
					
						
						
							
							Fixing a compilation error for generic SIMD  
						
						
						
						
					 
					
						2017-05-19 16:39:36 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3267683e22 
					 
					
						
						
							
							Union workaround for g++  
						
						
						
						
					 
					
						2017-05-17 11:26:18 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c1c7566089 
					 
					
						
						
							
							Workaround for a GCC bug in versions 5.0 through 6.2 inclusive.  
						
						
						
						
					 
					
						2017-05-06 15:20:25 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						3344788fa1 
					 
					
						
						
							
							Merge branch 'develop' into feature/hmc_generalise  
						
						
						
						
					 
					
						2017-05-01 12:13:56 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						56277a11c8 
					 
					
						
						
							
							Build a list of what's on the surface  
						
						
						
						
					 
					
						2017-04-24 17:06:15 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						736bf3c866 
					 
					
						
						
							
							Major rework of stencil. Half precision and MPI3 now working.  
						
						
						
						
					 
					
						2017-04-22 11:33:50 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b9bbe5d188 
					 
					
						
						
							
							L1p config bg/q  
						
						
						
						
					 
					
						2017-04-22 11:33:09 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3844bcf800 
					 
					
						
						
							
							If no F16C instructions are supported, software half-precision conversion must be used.  
						
						
						
						
						This will also become useful on BG/Q, so it will move out of SSE4 into a general area.
Lifted the Eigen half-precision implementation from the web. Looks sensible, but not yet extensively
regression-tested against the intrinsics implementation. 
						
						
					 
					
						2017-04-20 15:30:52 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4a340aa5ca 
					 
					
						
						
							
							Massive compressor rework to support reduced precision comms  
						
						
						
						
					 
					
						2017-04-20 09:28:27 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3b7de792d5 
					 
					
						
						
							
							Type comparison in the traits work  
						
						
						
						
					 
					
						2017-04-18 13:28:04 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8e161152e4 
					 
					
						
						
							
							MultiRHS solver improvements with slice operations moved into lattice and sped up.  
						
						
						
						
						Block solver requires a lot of performance work. 
						
						
					 
					
						2017-04-18 10:51:55 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						7ede696126 
					 
					
						
						
							
							Fixed tests failing to compile  
						
						
						
						
					 
					
						2017-04-16 23:40:00 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						441a52ee5d 
					 
					
						
						
							
							First cut at higher precision reduction  
						
						
						
						
					 
					
						2017-04-15 10:57:21 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3ca41458a3 
					 
					
						
						
							
							Fix for the case where USE_FP16 is not defined  
						
						
						
						
					 
					
						2017-04-14 14:20:54 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						951be75292 
					 
					
						
						
							
							Half precision conversion working on AVX512 now too  
						
						
						
						
					 
					
						2017-04-13 17:35:11 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b9113ed310 
					 
					
						
						
							
							Patches for KNL  
						
						
						
						
					 
					
						2017-04-13 12:02:12 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						db5ea001a3 
					 
					
						
						
							
							Update to use Xcode 8.3 since -mfp16 causes SIGILL  
						
						
						
						
					 
					
						2017-04-13 12:22:40 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1d502e4ed6 
					 
					
						
						
							
							FP16 optional compile time  
						
						
						
						
					 
					
						2017-04-13 11:55:24 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						73cdf0fffe 
					 
					
						
						
							
							Drop F16C from SSE because of a macOS compile error on Travis  
						
						
						
						
					 
					
						2017-04-13 11:23:41 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						94eb829d08 
					 
					
						
						
							
							Fixed the alignment cast for __m128i that GCC complained about  
						
						
						
						
					 
					
						2017-04-13 08:40:44 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						68392ddb5b 
					 
					
						
						
							
							Exchange in generic  
						
						
						
						
						Precision change in AVX, SSE, AVX512, Generic. QPX still to do. 
						
						
					 
					
						2017-04-13 08:38:12 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						cb6b81ae82 
					 
					
						
						
							
							Half precision conversion  
						
						
						
						
					 
					
						2017-04-12 19:32:37 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						8c540333d5 
					 
					
						
						
							
							Merge branch 'develop' into feature/hmc_generalise  
						
						
						
						
					 
					
						2017-04-05 14:41:04 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						fd56b3ff38 
					 
					
						
						
							
							Merge branch 'develop' into feature/hmc_generalise  
						
						
						
						
					 
					
						2017-03-21 13:33:41 +09:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4ed10a3d06 
					 
					
						
						
							
							Merge branch 'develop' into feature/bgq-asm  
						
						
						
						
					 
					
						2017-03-13 11:10:10 +00:00