Peter Boyle 
							
						 
					 
					
						
						
							
						
						a32ac287bb 
					 
					
						
						
							
							Hand unrolled version of dslash in a separate class.

						Useful to compare; raises the Intel compiler from 9 GFlop/s to 17.5 GFlop/s
						on an Ivy Bridge core. Raises Clang from 14.5 to 17.5 GFlop/s.
						
						
					 
					
						2015-05-26 19:54:03 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						840754dd42 
					 
					
						
						
							
							Hand unrolled version of dslash in a separate class.

						Useful to compare; raises the Intel compiler from 9 GFlop/s to 17.5 GFlop/s
						on an Ivy Bridge core. Raises Clang from 14.5 to 17.5 GFlop/s.
						
						
					 
					
						2015-05-26 19:54:03 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						201a110c51 
					 
					
						
						
							
							Better EO support letting Schur solver work  
						
						
						
						
					 
					
						2015-05-25 13:46:28 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1a9841a0f1 
					 
					
						
						
							
							Better EO support letting Schur solver work  
						
						
						
						
					 
					
						2015-05-25 13:46:28 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						ea3240ad55 
					 
					
						
						
							
							Better EO support letting Schur solver work  
						
						
						
						
					 
					
						2015-05-25 13:46:28 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2d30e82dcb 
					 
					
						
						
							
							Improving the even-odd sector; a lot of work, and thorough cleaning of this was required.
						
						
						
						
					 
					
						2015-05-23 09:34:16 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						65f2e6b269 
					 
					
						
						
							
							Improving the even-odd sector; a lot of work, and thorough cleaning of this was required.
						
						
						
						
					 
					
						2015-05-23 09:34:16 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						64fcbd0387 
					 
					
						
						
							
							Improving the even-odd sector; a lot of work, and thorough cleaning of this was required.
						
						
						
						
					 
					
						2015-05-23 09:34:16 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2d8b5a8191 
					 
					
						
						
							
							Optimisation...  
						
						
						
						
					 
					
						2015-05-19 15:50:47 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						46ab8edf30 
					 
					
						
						
							
							Optimisation...  
						
						
						
						
					 
					
						2015-05-19 15:50:47 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						8220794c44 
					 
					
						
						
							
							Optimisation...  
						
						
						
						
					 
					
						2015-05-19 15:50:47 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						a6e1ea216d 
					 
					
						
						
							
							Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
						not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
						on 1, 2, 4, 8, 16 MPI tasks.
						Found a couple of SIMD bugs which required fixing, and enhanced the Grid_simd.cc test suite.
						Implemented the Mdag, M, MdagM, Meooe, Mooee Schur-type stuff in the Wilson dop.
						
						
					 
					
						2015-05-19 13:57:35 +01:00 
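
The unpreconditioned conjugate gradient mentioned in this commit can be illustrated with a minimal sketch. This is not the Grid implementation: `Vec`, `LinOp`, `dot` and `conjugate_gradient` are hypothetical names, plain `std::vector<double>` stands in for a lattice field, and the operator is assumed to be Hermitian positive-definite (as MdagM would be).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins for a field and a linear operator (out = A * in).
using Vec = std::vector<double>;
using LinOp = std::function<void(const Vec&, Vec&)>;

double dot(const Vec& a, const Vec& b) {
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
  return s;
}

// Solves A x = b, assuming A is symmetric positive-definite.
// x holds the initial guess on entry and the solution on exit.
// Returns the number of iterations used, or -1 if not converged.
int conjugate_gradient(const LinOp& A, const Vec& b, Vec& x,
                       double tol, int max_iter) {
  const std::size_t n = b.size();
  Vec r(n), p(n), Ap(n);

  A(x, Ap);                                   // r = b - A x
  for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - Ap[i];
  p = r;

  double rr = dot(r, r);
  const double stop = tol * tol * dot(b, b);  // relative residual target

  for (int k = 0; k < max_iter; ++k) {
    if (rr <= stop) return k;
    A(p, Ap);
    const double alpha = rr / dot(p, Ap);     // step length along p
    for (std::size_t i = 0; i < n; ++i) {
      x[i] += alpha * p[i];
      r[i] -= alpha * Ap[i];
    }
    const double rr_new = dot(r, r);
    const double beta = rr_new / rr;          // makes new p A-conjugate
    for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
    rr = rr_new;
  }
  return rr <= stop ? max_iter : -1;
}
```

For a non-Hermitian M, the same routine would be applied to the normal equations by passing an operator that applies M and then Mdag.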
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						ffc00caea3 
					 
					
						
						
							
							Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
						not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
						on 1, 2, 4, 8, 16 MPI tasks.
						Found a couple of SIMD bugs which required fixing, and enhanced the Grid_simd.cc test suite.
						Implemented the Mdag, M, MdagM, Meooe, Mooee Schur-type stuff in the Wilson dop.
						
						
					 
					
						2015-05-19 13:57:35 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						4dba8522a1 
					 
					
						
						
							
							Got unpreconditioned conjugate gradient to run and converge on a random (uniform random,
						not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition
						on 1, 2, 4, 8, 16 MPI tasks.
						Found a couple of SIMD bugs which required fixing, and enhanced the Grid_simd.cc test suite.
						Implemented the Mdag, M, MdagM, Meooe, Mooee Schur-type stuff in the Wilson dop.
						
						
					 
					
						2015-05-19 13:57:35 +01:00 
						 
				 
			
				
					
						
							
							
								neo 
							
						 
					 
					
						
						
							
						
						6d2accba7b 
					 
					
						
						
							
							Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.  
						
						
						
						
					 
					
						2015-05-18 16:48:14 +09:00 
						 
				 
			
				
					
						
							
							
								neo 
							
						 
					 
					
						
						
							
						
						cee363e28c 
					 
					
						
						
							
							Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.  
						
						
						
						
					 
					
						2015-05-18 16:48:14 +09:00 
						 
				 
			
				
					
						
							
							
								neo 
							
						 
					 
					
						
						
							
						
						b4cd37276b 
					 
					
						
						
							
							Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.  
						
						
						
						
					 
					
						2015-05-18 16:48:14 +09:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1887c77498 
					 
					
						
						
							
							Getting closer to having a Wilson solver... introducing a first and untested
						cut at conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
						Mike Clark, Tony Kennedy and my BFM package respectively, since we know we will
						need these. I wanted the structure of
						algorithms/approx
						algorithms/iterative
						etc. to start taking shape.
						
						
					 
					
						2015-05-18 07:47:05 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d0e4673a3f 
					 
					
						
						
							
							Getting closer to having a Wilson solver... introducing a first and untested
						cut at conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
						Mike Clark, Tony Kennedy and my BFM package respectively, since we know we will
						need these. I wanted the structure of
						algorithms/approx
						algorithms/iterative
						etc. to start taking shape.
						
						
					 
					
						2015-05-18 07:47:05 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						11cb3e9a01 
					 
					
						
						
							
							Getting closer to having a Wilson solver... introducing a first and untested
						cut at conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
						Mike Clark, Tony Kennedy and my BFM package respectively, since we know we will
						need these. I wanted the structure of
						algorithms/approx
						algorithms/iterative
						etc. to start taking shape.
						
						
					 
					
						2015-05-18 07:47:05 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						e841395dfd 
					 
					
						
						
							
							Updating; preparing for solvers etc.
						
						
						
						
					 
					
						2015-05-16 23:35:08 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						dc6b6bdc96 
					 
					
						
						
							
							Updating; preparing for solvers etc.
						
						
						
						
					 
					
						2015-05-16 23:35:08 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						bf7ab0da7a 
					 
					
						
						
							
							Updating; preparing for solvers etc.
						
						
						
						
					 
					
						2015-05-16 23:35:08 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						e3b61bdfce 
					 
					
						
						
							
							Forces inlining upon icpc  
						
						
						
						
					 
					
						2015-05-15 11:43:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0e7945fe54 
					 
					
						
						
							
							Forces inlining upon icpc  
						
						
						
						
					 
					
						2015-05-15 11:43:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						a0d041b522 
					 
					
						
						
							
							Forces inlining upon icpc  
						
						
						
						
					 
					
						2015-05-15 11:43:49 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7f3ae64a31 
					 
					
						
						
							
							OMP dslash working  
						
						
						
						
					 
					
						2015-05-13 10:59:22 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0097b81778 
					 
					
						
						
							
							OMP dslash working  
						
						
						
						
					 
					
						2015-05-13 10:59:22 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						e179828662 
					 
					
						
						
							
							OMP dslash working  
						
						
						
						
					 
					
						2015-05-13 10:59:22 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b4a570477c 
					 
					
						
						
							
							I have made the Cshift work successfully with OpenMP threading in
						every routine. Collapse(2) is now working under clang-omp++.
						
						
					 
					
						2015-05-13 00:31:00 +01:00 
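
As an illustration of what `collapse(2)` does for a shift-style loop nest, here is a minimal sketch. It is not the Grid Cshift code: `shift_copy` and its flat row-major layout are hypothetical, and the shift is assumed to be non-negative. The pragma is guarded so the code also builds without OpenMP.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: cyclically shifts each row of an n_outer x n_inner
// row-major array, so that dst(o, i) = src(o, (i + shift) mod n_inner).
// Assumes shift >= 0 and that dst already has n_outer * n_inner elements.
void shift_copy(const std::vector<double>& src, std::vector<double>& dst,
                int n_outer, int n_inner, int shift) {
  // collapse(2) fuses the two perfectly nested loops into one iteration
  // space, so threads share n_outer * n_inner iterations rather than
  // only n_outer.
#ifdef _OPENMP
#pragma omp parallel for collapse(2)
#endif
  for (int o = 0; o < n_outer; ++o) {
    for (int i = 0; i < n_inner; ++i) {
      dst[o * n_inner + i] = src[o * n_inner + (i + shift) % n_inner];
    }
  }
}
```

Each iteration writes a distinct element of `dst`, so no synchronisation is needed between threads.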
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						541d52ab97 
					 
					
						
						
							
							I have made the Cshift work successfully with OpenMP threading in
						every routine. Collapse(2) is now working under clang-omp++.
						
						
					 
					
						2015-05-13 00:31:00 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						48f425d31c 
					 
					
						
						
							
							I have made the Cshift work successfully with OpenMP threading in
						every routine. Collapse(2) is now working under clang-omp++.
						
						
					 
					
						2015-05-13 00:31:00 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						65c91eae64 
					 
					
						
						
							
							Threading support rework.

						Placed parallel pragmas as macros; implemented deterministic thread reduction in the style of
						BFM.
						
						
					 
					
						2015-05-12 07:51:41 +01:00 
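
The two ideas in this commit, a pragma wrapped in a macro and a reduction whose result does not depend on thread scheduling, can be sketched as follows. This is an illustrative assumption, not the Grid or BFM code: the macro name `PARALLEL_FOR_LOOP` and the function `deterministic_sum` are hypothetical. The key point is that the work is split into a fixed number of chunks chosen by the caller, and the partial sums are combined serially in chunk order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pragma as a macro: expands to nothing when OpenMP is not enabled.
#ifdef _OPENMP
#define PARALLEL_FOR_LOOP _Pragma("omp parallel for")
#else
#define PARALLEL_FOR_LOOP
#endif

// Sums v using nchunks fixed, contiguous chunks (nchunks >= 1).
// Each chunk is summed left to right into its own slot, then the slots
// are added in chunk order, so the floating-point result is the same
// for a given nchunks regardless of how many threads actually run.
double deterministic_sum(const std::vector<double>& v, int nchunks) {
  std::vector<double> partial(nchunks, 0.0);
  const std::size_t n = v.size();

  PARALLEL_FOR_LOOP
  for (int c = 0; c < nchunks; ++c) {
    const std::size_t lo = n * c / nchunks;        // chunk start
    const std::size_t hi = n * (c + 1) / nchunks;  // one past chunk end
    double s = 0.0;
    for (std::size_t i = lo; i < hi; ++i) s += v[i];
    partial[c] = s;
  }

  double total = 0.0;  // serial combine in a fixed order
  for (int c = 0; c < nchunks; ++c) total += partial[c];
  return total;
}
```

A plain `reduction(+:...)` clause would also be correct, but the order in which per-thread partial sums are combined is not specified, so results can differ in the last bits from run to run.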
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						c6baa3e657 
					 
					
						
						
							
							Threading support rework.

						Placed parallel pragmas as macros; implemented deterministic thread reduction in the style of
						BFM.
						
						
					 
					
						2015-05-12 07:51:41 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6103c29ee3 
					 
					
						
						
							
							Threading support rework.

						Placed parallel pragmas as macros; implemented deterministic thread reduction in the style of
						BFM.
						
						
					 
					
						2015-05-12 07:51:41 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						fa5779537c 
					 
					
						
						
							
							Command line args and a general clean up  
						
						
						
						
					 
					
						2015-05-11 12:43:10 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b42453d1fd 
					 
					
						
						
							
							Command line args and a general clean up  
						
						
						
						
					 
					
						2015-05-11 12:43:10 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						379943abf5 
					 
					
						
						
							
							Command line args and a general clean up  
						
						
						
						
					 
					
						2015-05-11 12:43:10 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						242e447bc5 
					 
					
						
						
							
							Lots of changes required to compile for MIC under ICPC  
						
						
						
						
					 
					
						2015-05-10 23:29:21 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2203c6e597 
					 
					
						
						
							
							Lots of changes required to compile for MIC under ICPC  
						
						
						
						
					 
					
						2015-05-10 23:29:21 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5555a852be 
					 
					
						
						
							
							Lots of changes required to compile for MIC under ICPC  
						
						
						
						
					 
					
						2015-05-10 23:29:21 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						352bccf6ca 
					 
					
						
						
							
							Merge branch 'master' of https://github.com/paboyle/Grid

						Conflicts:
							lib/qcd/Grid_qcd_wilson_dop.cc
						
						
					 
					
						2015-05-10 15:37:47 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						4da2c2ea00 
					 
					
						
						
							
							Merge branch 'master' of https://github.com/paboyle/Grid

						Conflicts:
							lib/qcd/Grid_qcd_wilson_dop.cc
						
						
					 
					
						2015-05-10 15:37:47 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						48b9692845 
					 
					
						
						
							
							Merge branch 'master' of https://github.com/paboyle/Grid

						Conflicts:
							lib/qcd/Grid_qcd_wilson_dop.cc
						
						
					 
					
						2015-05-10 15:37:47 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						133493dc79 
					 
					
						
						
							
							Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.

						This is a short-term hack while I benchmark.
						
						
					 
					
						2015-05-10 15:25:23 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						dc7132af71 
					 
					
						
						
							
							Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.

						This is a short-term hack while I benchmark.
						
						
					 
					
						2015-05-10 15:25:23 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						02ae26d091 
					 
					
						
						
							
							Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.

						This is a short-term hack while I benchmark.
						
						
					 
					
						2015-05-10 15:25:23 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5415180676 
					 
					
						
						
							
							Wilson perf improvements with Gauge prefetching  
						
						
						
						
					 
					
						2015-05-06 06:37:21 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						52403d587c 
					 
					
						
						
							
							Wilson perf improvements with Gauge prefetching  
						
						
						
						
					 
					
						2015-05-06 06:37:21 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						55ccb8ccf4 
					 
					
						
						
							
							Wilson perf improvements with Gauge prefetching  
						
						
						
						
					 
					
						2015-05-06 06:37:21 +01:00