Peter Boyle 
							
						 
					 
					
						
						
							
						
						f9b2fce93b 
					 
					
						
						
							
							Changing whole stencil class to be template and not just single functions  
						
						
						
						
					 
					
						2015-11-06 05:25:10 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						473fa28a6c 
					 
					
						
						
							
							Partial optimisation; comms in x-dir for red black dslash will be slow as the checker skipping block strided  
						
						
						
						
						loops are non threadable. Will need to write a kernel for these instead and drive them with a lookup table
to make a loop sufficiently simple to thread. 
						
						
					 
					
						2015-11-06 05:23:23 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5d854c869c 
					 
					
						
						
							
							Stencil interface changes  
						
						
						
						
					 
					
						2015-11-06 05:22:33 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						880ff88362 
					 
					
						
						
							
							Comms optimisation  
						
						
						
						
					 
					
						2015-11-06 05:22:18 -06:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						4690acc3c8 
					 
					
						
						
							
							Don't know why Peter committed these as they didn't compile  
						
						
						
						
					 
					
						2015-11-06 10:31:48 +00:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						3281745fde 
					 
					
						
						
							
							Exec info and linux check to stop non-portable code breaking  
						
						
						
						
					 
					
						2015-11-06 10:31:24 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1159de165c 
					 
					
						
						
							
							Asm option for AVX512  
						
						
						
						
					 
					
						2015-11-05 22:04:51 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						16c7993434 
					 
					
						
						
							
							Merge branch 'master' of github.com:paboyle/Grid  
						
						
						
						
						Conflicts:
	lib/simd/Grid_avx512.h
	lib/simd/Grid_imci.h 
						
						
					 
					
						2015-11-04 03:32:10 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						6be9716e6f 
					 
					
						
						
							
							New file  
						
						
						
						
					 
					
						2015-11-04 03:26:28 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4a41c885ed 
					 
					
						
						
							
							Use Linux kernel interface to hardware performance counters. Dead useful.  
						
						
						
						
					 
					
						2015-11-04 03:24:19 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						757b31ed42 
					 
					
						
						
							
							Threading for KNC mods.  
						
						
						
						
					 
					
						2015-11-04 03:22:14 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ac7d1f26ad 
					 
					
						
						
							
							Either blocking or lebesgue curve  
						
						
						
						
					 
					
						2015-11-04 03:19:16 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1a8bf938b3 
					 
					
						
						
							
							Use either sub-blocking or lebesgue  
						
						
						
						
					 
					
						2015-11-04 03:18:51 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						63a2993827 
					 
					
						
						
							
							Exec info and cache blocking  
						
						
						
						
					 
					
						2015-11-04 03:16:56 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4e65ad21ac 
					 
					
						
						
							
							Adding a routine for AVX512 / IMCI with explicit assembly implementations  
						
						
						
						
					 
					
						2015-11-04 03:15:08 -08:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						dfc1de6f60 
					 
					
						
						
							
							Merge branch 'master' of github.com:paboyle/Grid  
						
						
						
						
					 
					
						2015-11-04 05:14:26 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						3b7576ad53 
					 
					
						
						
							
							Switch off for now  
						
						
						
						
					 
					
						2015-11-04 05:13:29 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						9b5d31ffc1 
					 
					
						
						
							
							mac, mult routines  
						
						
						
					 
					
						2015-11-04 03:10:34 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a38762159c 
					 
					
						
						
							
							Inline assembly hooks for AVX 512. Better way in some ways than BAGEL to generate assembly.  
						
						
						
						
						Updated Grid_avx512.h 
						
						
					 
					
						2015-11-04 03:09:06 -08:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						ffc5dab17f 
					 
					
						
						
							
							AMD FMA4 support added for Interlagos/BlueWaters  
						
						
						
						
					 
					
						2015-11-04 04:29:58 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						96608c70d1 
					 
					
						
						
							
							chrono causing some problems on Cray systems. Suspend use for now  
						
						
						
						
					 
					
						2015-11-04 04:28:31 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d35d63b171 
					 
					
						
						
							
							Algorithm in  
						
						
						
						
					 
					
						2015-11-04 04:27:44 -06:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						24044dbc56 
					 
					
						
						
							
							Debugged a problem with checkerboarded cshift in the checker dimension which arose  
						
						
						
						
						only when mpi spread out in the checker dimension. Added a test that trapped and helped debug this 
						
						
					 
					
						2015-11-04 10:00:55 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						abb23df83f 
					 
					
						
						
							
							formatting only  
						
						
						
						
					 
					
						2015-11-04 10:00:27 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						12c5ec813c 
					 
					
						
						
							
							Useful debug messages (commented out) are included for preservation in case I need to revisit this  
						
						
						
						
					 
					
						2015-11-04 09:59:27 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1271508ca2 
					 
					
						
						
							
							Bug fix for spread out in x (EO) direction.  
						
						
						
						
						This is really annoying -- it is very hard to thread the loops with the index
recursion on buffer offset in the red-black case. Must think of a good threading
solution here. 
						
						
					 
					
						2015-11-04 09:57:57 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						ec5af35166 
					 
					
						
						
							
							EO bug fix when spread out in x-direction  
						
						
						
						
					 
					
						2015-11-04 09:56:58 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0f59356e86 
					 
					
						
						
							
							Problem in comms fixed  
						
						
						
						
					 
					
						2015-11-02 00:00:15 +00:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						8709117aea 
					 
					
						
						
							
							Log: generalised Logger class to allow separate logs in Grid-based applications  
						
						
						
						
					 
					
						2015-10-27 17:31:13 +00:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						e6b9aa9076 
					 
					
						
						
							
							Config.h removed from repository  
						
						
						
						
					 
					
						2015-10-27 10:47:07 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						8889af45ca 
					 
					
						
						
							
							FMA4 added  
						
						
						
						
					 
					
						2015-10-09 01:00:53 +02:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						83afb2e26a 
					 
					
						
						
							
							Poly support for lanczos  
						
						
						
						
					 
					
						2015-10-09 00:43:21 +02:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6d06bd9493 
					 
					
						
						
							
							Minor change in commented out code  
						
						
						
						
					 
					
						2015-10-09 00:42:21 +02:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6ee23f409e 
					 
					
						
						
							
							Lanczos addition  
						
						
						
						
					 
					
						2015-10-09 00:41:00 +02:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2d95dac6b6 
					 
					
						
						
							
							Lanczos untested/partially tested additions. In middle of shake out but at least compiles  
						
						
						
						
					 
					
						2015-10-09 00:40:25 +02:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						814c79f38d 
					 
					
						
						
							
							SIMD improvements for mac and madd use in complex for avx, sse  
						
						
						
						
					 
					
						2015-10-09 00:38:52 +02:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						1878bf97d0 
					 
					
						
						
							
							Babbage fix  
						
						
						
						
					 
					
						2015-09-30 16:04:01 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a660ce716b 
					 
					
						
						
							
							No compile babbage fix  
						
						
						
						
					 
					
						2015-09-30 16:02:44 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f4b6d1dfea 
					 
					
						
						
							
							NGO stores reenabled  
						
						
						
						
					 
					
						2015-09-30 16:02:14 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						23813ac798 
					 
					
						
						
							
							No compile on babbage fix  
						
						
						
						
					 
					
						2015-09-30 16:01:28 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						9f4f65cb46 
					 
					
						
						
							
							Added a decoupled memory system benchmark to remove thread synch overhead  
						
						
						
						
					 
					
						2015-09-26 18:23:57 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						64d64d1ab6 
					 
					
						
						
							
							Updating to modify non-inlining permute routines and hopefully get better reg use and  
						
						
						
						
						enhance performance. 
						
						
					 
					
						2015-09-25 08:55:04 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						5ef42add2d 
					 
					
						
						
							
							Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly  
						
						
						
						
						and drop swizzles in AVX512. Don't know why these compiled. 
						
						
					 
					
						2015-09-23 05:23:45 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2f38ebc446 
					 
					
						
						
							
							Reintroducing the hand unrolled loops  
						
						
						
						
					 
					
						2015-09-08 17:45:30 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						638d6675ee 
					 
					
						
						
							
							Tested rms dH is ~ dt^4 numerically, so believe the ForceGradient is correct now.  
						
						
						
						
						Paranoia makes me want to diddle with the FG step to ensure dt^2 reappears. 
						
						
					 
					
						2015-08-31 16:33:20 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						357c6ab46d 
					 
					
						
						
							
							Reunitarise. Complete the HMC and integrator changes.  
						
						
						
						
					 
					
						2015-08-31 16:32:04 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						755dca9533 
					 
					
						
						
							
							Added ForceGradient integrator. dH dropped so seems to work. Will only  
						
						
						
						
						believe it is right once I have pulled a dt^4 error scaling plot out. 
						
						
					 
					
						2015-08-31 06:23:02 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						29fd004d54 
					 
					
						
						
							
							Unified integrator and integrator algorithm into virtual class used as a policy for the  
						
						
						
						
						HMC. 
						
						
					 
					
						2015-08-30 13:39:19 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						aa52fdadcc 
					 
					
						
						
							
							Global edit on HMC sector -- making GaugeField a template parameter and  
						
						
						
						
						preparing to pass integrator, smearing, bc's as policy classes to hmc.
Propose to unify "integrator" and integrator algorithm in a base/derived
way to override step. Want to read through ForceGradient to ensure
that abstraction covers the force gradient case. 
						
						
					 
					
						2015-08-30 12:18:34 +01:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						76d752585b 
					 
					
						
						
							
							Started a tidy up in the HMC sector. Now comfortable with the two level integrators;  
						
						
						
						
						took a little while to figure out what Guido had done & why -- but there is a neat saving of force
evaluations across the nesting time boundary making use of linearity of the leapP in dt.
I cleaned up the printing, reduced the volume of code, in the process sharing printing
between all integrators. Placed an assert that the total integration time for all integrators
must match at end of trajectory.
Have now verified e^{-dH} = 1 for nested integrators in Wilson/Wilson runs with both
Omelyan and with Leapfrog so substantial confidence gained. 
						
						
					 
					
						2015-08-29 17:18:43 +01:00