azusayamaguchi 
							
						 
					 
					
						
						
							
						
						460d0753a1 
					 
					
						
						
							
							Merge branch 'develop' into feature/mpi3  
						
						
						
						
						Conflicts:
	lib/simd/Grid_avx512.h 
						
						
					 
					
						2016-10-25 01:08:51 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						75ebd3a0d1 
					 
					
						
						
							
							Typo fixes and rotate for CLANG  
						
						
						
						
					 
					
						2016-10-21 22:34:29 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						20a091c3ed 
					 
					
						
						
							
							Intel vs. Clang intrinsics differences absorbed  
						
						
						
						
					 
					
						2016-10-21 09:08:36 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						811ca45473 
					 
					
						
						
							
							GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support  
						
						
						
						
					 
					
						2016-10-17 16:23:21 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						81f2aeaece 
					 
					
						
						
							
							KNL streaming stores, and KNL performance counters  
						
						
						
						
					 
					
						2016-10-12 11:45:22 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						611b5d74ba 
					 
					
						
						
							
							Fix for AVX+FMA3 compilation  
						
						
						
						
					 
					
						2016-10-10 15:26:17 +01:00 
						 
				 
			
				
					
						
							
							
								Antonin Portelli 
							
						 
					 
					
						
						
							
						
						0724f7af75 
					 
					
						
						
							
							QPX single precision implementation  
						
						
						
						
					 
					
						2016-09-19 18:09:12 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						4d11a6f5f2 
					 
					
						
						
							
							first commit for QPX intrinsics  
						
						
						
						
					 
					
						2016-08-23 14:41:44 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						17097a93ec 
					 
					
						
						
							
							FFTW test ran over 4 mpi processes.  
						
						
						
						
					 
					
						2016-08-17 01:33:55 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						93d29bb699 
					 
					
						
						
							
							build system improvements after discussion with Peter  
						
						
						
						
					 
					
						2016-08-04 16:19:59 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						e9f30cab2c 
					 
					
						
						
							
							first working version for the new build system  
						
						
						
						
					 
					
						2016-07-30 17:53:18 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4908b77d46 
					 
					
						
						
							
							Fixed conflicts. PLEASE avoid making wholesale cosmetic-only changes; this created
						a HUGE amount of difficult-to-resolve and hard-to-understand conflicts.
						Wholesale formatting, reordering functions, etc. in a central file like Tensor_class
						or Grid_vector_types, while others are also editing, without making substantial
						functionality changes creates pain.
						
						
					 
					
						2016-07-15 20:59:07 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f4dd5062d7 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/paboyle/Grid into develop  
						
						
						
						
					 
					
						2016-07-15 19:26:06 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8f47d0b5ab 
					 
					
						
						
							
							Rotation needed for hopping term in fifth dim with Ls vectorised fields  
						
						
						
						
					 
					
						2016-07-14 23:45:36 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a0676beeb1 
					 
					
						
						
							
							Open up dependency on Eigen and FFTW  
						
						
						
						
					 
					
						2016-07-07 22:31:07 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						e3d5319470 
					 
					
						
						
							
							Debugged the real() and imag() functions and added tests to Test_Simd  
						
						
						
						
					 
					
						2016-07-06 14:16:03 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						fdfbf11c6d 
					 
					
						
						
							
							Merge branch 'develop' into temporary-smearing  
						
						
						
						
					 
					
						2016-07-04 18:45:10 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						9cb90f714e 
					 
					
						
						
							
							Merge remote-tracking branch 'origin/develop' into temporary-smearing  
						
						
						
						
					 
					
						2016-07-04 17:28:40 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						1a6d65c6a4 
					 
					
						
						
							
							Converted set_uw and set_fj to all complex functions  
						
						
						
						
					 
					
						2016-07-03 10:27:43 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						bdaa5b1767 
					 
					
						
						
							
							Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.  
						
						
						
						
					 
					
						2016-06-30 14:35:02 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8fcefc021a 
					 
					
						
						
							
							Improved the prefetching when using cache blocking codes  
						
						
						
						
					 
					
						2016-06-30 14:35:02 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1445189361 
					 
					
						
						
							
							Control the prefetch strategy  
						
						
						
						
					 
					
						2016-06-30 14:35:02 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a25bec87d9 
					 
					
						
						
							
							Prefetch during save  
						
						
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						2d8bb4c594 
					 
					
						
						
							
							Tweaks  
						
						
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						6d58cb2a68 
					 
					
						
						
							
							Enable reordering of the loops in the assembler for cache friendliness.

						This gets in the way of L2 prefetching, however. Do next next link in stencil
						prefetching.
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						87418e7df1 
					 
					
						
						
							
							Slightly faster prefetching perf.  
						
						
						
						
					 
					
						2016-06-13 02:32:52 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						55f65b81b5 
					 
					
						
						
							
							Improvements to the assembler interface that let us move chunks of the
						site and s loop into the kernels. This will save on function call overhead and
						guarantee L2 prefetching strategy is right since OMP can't distribute the
						sub-chunks of work.
						
						
					 
					
						2016-06-09 01:12:36 -07:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						d9408893b3 
					 
					
						
						
							
							Prefetching in the normal kernel implementation.  
						
						
						
						
					 
					
						2016-06-08 05:43:48 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						139cc5f1ae 
					 
					
						
						
							
							Large change with KNL preparation  
						
						
						
						
					 
					
						2016-06-03 03:24:26 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						9d5f693cbe 
					 
					
						
						
							
							empty SIMD fix  
						
						
						
						
					 
					
						2016-05-24 10:56:27 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						91e04056f9 
					 
					
						
						
							
							fix of the empty SIMD  
						
						
						
						
					 
					
						2016-05-12 19:24:10 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c23375cd65 
					 
					
						
						
							
							Testing travis CI integration  
						
						
						
						
					 
					
						2016-04-30 06:30:56 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c79ea0dcef 
					 
					
						
						
							
							Fixing IMCI  
						
						
						
						
					 
					
						2016-04-22 21:52:54 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e3f141f82f 
					 
					
						
						
							
							Fixed SSE compile with typecasts  
						
						
						
						
					 
					
						2016-04-22 10:30:30 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a6dfa2386b 
					 
					
						
						
							
							GCC choked on intrinsics calls that ICPC did not  
						
						
						
						
					 
					
						2016-04-22 06:33:41 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						587f80cd93 
					 
					
						
						
							
							Updated to compile and pass under intel SDE  
						
						
						
						
					 
					
						2016-04-19 15:13:54 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						528eb773ad 
					 
					
						
						
							
							Merged.  
						
						
						
						
						Merge branch 'master' of https://github.com/paboyle/Grid  
						
						
					 
					
						2016-04-19 22:24:34 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e5657510b0 
					 
					
						
						
							
							Rotate support for Ls simd-ized  
						
						
						
						
					 
					
						2016-04-19 22:24:18 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f473919526 
					 
					
						
						
							
							Rotate support  
						
						
						
						
					 
					
						2016-04-19 22:23:51 +01:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						ab56ccdd25 
					 
					
						
						
							
							-Complete and working implementation of Grid_empty  
						
						
						
						
					 
					
						2016-04-15 13:17:42 -04:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f473ef7591 
					 
					
						
						
							
							Fixing the compile  
						
						
						
						
					 
					
						2016-03-31 07:47:42 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8052556275 
					 
					
						
						
							
							Cleaning up the single/double kernel implementation switch  
						
						
						
						
					 
					
						2016-03-31 14:51:32 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						83b15bfcdd 
					 
					
						
						
							
							Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign  
						
						
						
						
					 
					
						2016-03-30 08:39:39 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c77b7ee897 
					 
					
						
						
							
							AddSub based alternate SU3 routine  
						
						
						
						
					 
					
						2016-03-28 17:55:22 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b6c3bc574b 
					 
					
						
						
							
							Moving to a more coherent organisation of the inline assembly and arch dependencies.  
						
						
						
						
					 
					
						2016-03-28 16:24:37 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ad80f61fba 
					 
					
						
						
							
							AVX512 shaken out  
						
						
						
						
					 
					
						2016-03-28 00:38:05 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						165bffc2e7 
					 
					
						
						
							
							Avx512 changes for assembler kernels  
						
						
						
						
					 
					
						2016-03-26 22:25:45 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						644fd6d32e 
					 
					
						
						
							
							Build avx512 clean  
						
						
						
						
					 
					
						2016-03-25 09:35:33 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						2d8bb356e3 
					 
					
						
						
							
							Smearing routines compile (still untested)  
						
						
						
						
					 
					
						2016-02-25 02:43:59 +09:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						a7251f28c7 
					 
					
						
						
							
							Stout smearing compiles (untested)  
						
						
						
						
					 
					
						2016-02-24 03:16:50 +09:00