paboyle 
							
						 
					 
					
						
						
							
						
						2d8bb4c594 
					 
					
						
						
							
							Tweaks  
						
						
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						6d58cb2a68 
					 
					
						
						
							
							Enable reordering of the loops in the assembler for cache friendliness.
						This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.
						
						
					 
					
						2016-06-30 14:35:01 -07:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						565e9329ba 
					 
					
						
						
							
							Changed the colouring classes  
						
						
						
						
					 
					
						2016-06-30 16:51:03 +01:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						5e02392f9c 
					 
					
						
						
							
							Fixed compilation error for benchmark_dwf  
						
						
						
						
						Some parts were assuming floating point precision 
						
						
					 
					
						2016-06-20 12:30:51 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						87418e7df1 
					 
					
						
						
							
							Slightly faster prefetching perf.  
						
						
						
						
					 
					
						2016-06-13 02:32:52 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						55f65b81b5 
					 
					
						
						
							
							Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.
						
						
					 
					
						2016-06-09 01:12:36 -07:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						d9408893b3 
					 
					
						
						
							
							Prefetching in the normal kernel implementation.  
						
						
						
						
					 
					
						2016-06-08 05:43:48 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8ac021de73 
					 
					
						
						
							
							Added a test and fixed it for red black precon Ls innermost vectorised DWF
						
						
						
						
					 
					
						2016-06-07 13:16:56 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e503ef5590 
					 
					
						
						
							
							Cleaned up  
						
						
						
						
					 
					
						2016-06-07 00:11:36 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a7682b0060 
					 
					
						
						
							
							Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS  
						
						
						
						
					 
					
						2016-06-06 23:48:21 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						53d06046b0 
					 
					
						
						
							
							Compiling updates for KNL  
						
						
						
						
					 
					
						2016-06-03 03:47:54 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						139cc5f1ae 
					 
					
						
						
							
							Large change with KNL preparation  
						
						
						
						
					 
					
						2016-06-03 03:24:26 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						5341977948 
					 
					
						
						
							
							IMCI fixes. Thought I had committed these. The "real" disambiguation between std::real and Grid::real shouldn't have been necessary and I don't know why only the icpc v16.0 on babbage hits it.
						May need a longer term rename of Grid::real or some careful EnableIf work.
						
						
					 
					
						2016-04-30 03:34:16 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1e554350ac 
					 
					
						
						
							
							The threaded comms didn't agree with GCC. Surprised, and looks like a GCC bug.
						
						
						
						
					 
					
						2016-04-29 16:49:18 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c79ea0dcef 
					 
					
						
						
							
							Fixing IMCI
						
						
						
						
					 
					
						2016-04-22 21:52:54 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ba427abde9 
					 
					
						
						
							
							simd 5d  
						
						
						
						
					 
					
						2016-04-19 15:38:39 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						9b6ab6db16 
					 
					
						
						
							
							simd in 5th dimension support  
						
						
						
						
					 
					
						2016-04-19 15:38:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						806a83d38b 
					 
					
						
						
							
							simd in fifth dim support for dwf  
						
						
						
						
					 
					
						2016-04-19 15:36:19 -07:00 
						 
				 
			
				
					
						
							
							
								neo 
							
						 
					 
					
						
						
							
						
						339be37dba 
					 
					
						
						
							
							Debugging smeared HMC  
						
						
						
						
					 
					
						2016-04-13 17:00:14 +09:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b1192a8908 
					 
					
						
						
							
							Benchmark_zmm added  
						
						
						
						
					 
					
						2016-04-06 03:00:07 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e8dddb1596 
					 
					
						
						
							
							Adding extra benchmark  
						
						
						
						
					 
					
						2016-04-06 10:32:54 +01:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						97d0d56bcb 
					 
					
						
						
							
							Debugging Smearing routines (set_fj)  
						
						
						
						
					 
					
						2016-04-06 17:58:43 +09:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						7c7ea35ffb 
					 
					
						
						
							
							Putting the Traceless Antihermitian part outside the deriv in pseudofermion actions  
						
						
						
						
					 
					
						2016-04-05 16:28:09 +09:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						4b1cf580e0 
					 
					
						
						
							
							Debugging the Smearing routines  
						
						
						
						
					 
					
						2016-04-05 16:19:30 +09:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e67fc2be18 
					 
					
						
						
							
							Adding a trial for openmp overhead minimisation  
						
						
						
						
					 
					
						2016-03-31 16:00:37 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						8052556275 
					 
					
						
						
							
							Cleaning up the single/double kernel implementation switch  
						
						
						
						
					 
					
						2016-03-31 14:51:32 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						60d965f79e 
					 
					
						
						
							
							AVX512 improvements; sigfpe trapping too  
						
						
						
						
					 
					
						2016-03-30 08:42:34 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1ecbf9794d 
					 
					
						
						
							
							Merge branch 'master' of  https://github.com/paboyle/Grid  
						
						
						
						
					 
					
						2016-03-30 08:37:55 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						c77b7ee897 
					 
					
						
						
							
							AddSub based alternate SU3 routine  
						
						
						
						
					 
					
						2016-03-28 17:55:22 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1e355a51e1 
					 
					
						
						
							
							Interface change  
						
						
						
						
					 
					
						2016-03-27 23:46:55 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						21abaf7e91 
					 
					
						
						
							
							Gamma sign change  
						
						
						
						
					 
					
						2016-03-28 00:35:45 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						165bffc2e7 
					 
					
						
						
							
							Avx512 changes for assembler kernels  
						
						
						
						
					 
					
						2016-03-26 22:25:45 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						644fd6d32e 
					 
					
						
						
							
							Build avx512 clean  
						
						
						
						
					 
					
						2016-03-25 09:35:33 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						090e7aa930 
					 
					
						
						
							
							Merge remote-tracking branch 'origin/chulwoo-dec12-2015'  
						
						
						
						
						Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan. 
						
						
					 
					
						2016-03-08 09:55:14 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						325e745daa 
					 
					
						
						
							
							Merge branch 'master' of  https://github.com/paboyle/Grid  
						
						
						
						
					 
					
						2016-03-02 07:04:03 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						61413565d0 
					 
					
						
						
							
							Back off the inlined spin proj as not working  
						
						
						
						
					 
					
						2016-03-02 07:03:09 -08:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						2d8bb356e3 
					 
					
						
						
							
							Smearing routines compile (still untested)  
						
						
						
						
					 
					
						2016-02-25 02:43:59 +09:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						a7251f28c7 
					 
					
						
						
							
							Stout smearing compiles (untested)  
						
						
						
						
					 
					
						2016-02-24 03:16:50 +09:00 
						 
				 
			
				
					
						
							
							
								Antonin Portelli 
							
						 
					 
					
						
						
							
						
						497e7e4c53 
					 
					
						
						
							
							BG/Q compatibility fix  
						
						
						
						
					 
					
						2016-02-23 15:57:38 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6aeaf6f568 
					 
					
						
						
							
							Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then turned up problems on the BlueWaters Cray.
						Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel, and also to look at aggregating bigger writes for the parallel write.
						Not sure what the home filesystem is.
						
						
					 
					
						2016-02-21 08:03:21 -06:00 
						 
				 
			
				
					
						
							
							
								Jung 
							
						 
					 
					
						
						
							
						
						9f0d9ade68 
					 
					
						
						
							
							Added configure flag for LAPACK. Tested ImplicitlyRestartedLanczos::calc()  
						
						
						
						
						Checking in before cleaning up 
						
						
					 
					
						2016-02-20 02:50:32 -05:00 
						 
				 
			
				
					
						
							
							
								neo 
							
						 
					 
					
						
						
							
						
						771235017d 
					 
					
						
						
							
							Adding smearing routines (development)  
						
						
						
						
					 
					
						2016-02-19 15:30:41 +09:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3425751cb8 
					 
					
						
						
							
							Missing return value  
						
						
						
						
					 
					
						2016-02-19 01:06:03 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						22422a84d9 
					 
					
						
						
							
							Small problem in compressor fix  
						
						
						
						
					 
					
						2016-02-17 19:03:09 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c9fadf97a5 
					 
					
						
						
							
							Simplify the compressor interface again.  
						
						
						
						
					 
					
						2016-02-17 18:16:45 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						81395e85d1 
					 
					
						
						
							
							Regressing to not overlap comms and compute because bluewaters, edison, and cori are so rubbish at it.
						
						
						
						
					 
					
						2016-02-16 13:56:44 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						a0fc47c6f9 
					 
					
						
						
							
							Cheaper implementation  
						
						
						
						
					 
					
						2016-02-15 16:02:36 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e2f73e3ead 
					 
					
						
						
							
							Updates for shmem  
						
						
						
						
					 
					
						2016-02-10 16:50:32 -08:00 
						 
				 
			
				
					
						
							
							
								neo 
							
						 
					 
					
						
						
							
						
						6371676a75 
					 
					
						
						
							
							Correcting some compilation errors for clang-sse  
						
						
						
						
					 
					
						2016-02-10 11:37:03 +09:00 
						 
				 
			
				
					
						
							
							
								Jung 
							
						 
					 
					
						
						
							
						
						bd84c23298 
					 
					
						
						
							
							definitions reconciled.  
						
						
						
						
					 
					
						2016-01-25 16:30:59 -05:00