Christopher Kelly 
							
						 
					 
					
						
						
							
						
						9c106d625a 
					 
					
						
						
							
							Added HMC main program designed to reproduce the 16^3x32x16 DWF+I ensembles with beta=2.13 and Gparity BCs  
						
						
						
						
					 
					
						2021-01-25 15:07:44 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						6795bbca31 
					 
					
						
						
							
							Generalized GeneralEvenOddRatioRationalPseudoFermionAction such that the multi-shift CG algorithm can be overridden by derived classes  
						
						
						
						
						Added a mixed-precision variant of GeneralEvenOddRatioRationalPseudoFermionAction and a verification test against double prec class
Fixed non-const reference used in passing RHMC approx to multishift classes 
						
						
					 
					
						2021-01-25 14:22:31 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						d161c2dc35 
					 
					
						
						
							
							Improved formatting of timing output in mixed-prec multishift  
						
						
						
						
						In test of mixed-prec multishift, added comparison against full double precision multishift both for timing and to cross-check the results 
						
						
					 
					
						2021-01-20 15:42:06 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						7a06826cf1 
					 
					
						
						
							
							Added option to NerscIO to disable exit on failing plaquette check allowing for circumvention of factor of 2 error in CPS-generated G-parity config headers  
						
						
						
						
						Adapted mixed-prec multi-shift test to new way to pass gauge BC directions and added cmdline option to perform the G-parity plaquette comparison with the corrected plaquette when loading config 
						
						
					 
					
						2021-01-20 13:31:50 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						c3712b8e06 
					 
					
						
						
							
							Merge branch 'develop' into feature/gparity_HMC  
						
						
						
						
					 
					
						2021-01-20 11:48:52 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						901ee77b84 
					 
					
						
						
							
							Mixed precision multishift test can now be performed with/without G-parity using cmdline check and can load a pregenerated configuration  
						
						
						
						
					 
					
						2021-01-20 11:45:44 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b0339bc5a4 
					 
					
						
						
							
							Merge branch 'feature/conjugate-bc-dirs' into develop  
						
						
						
						
					 
					
						2021-01-15 09:28:39 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						3c23a947cc 
					 
					
						
						
							
							Fixed test for very much non-unit det  
						
						
						
						
					 
					
						2021-01-15 09:16:02 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						56111bb823 
					 
					
						
						
							
							Merge branch 'develop' into feature/conjugate-bc-dirs  
						
						
						
						
					 
					
						2021-01-14 21:01:22 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						99445673f6 
					 
					
						
						
							
							Gparity fix, and plaquette IO  
						
						
						
						
					 
					
						2021-01-14 21:00:36 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						97a59643f7 
					 
					
						
						
							
							Red black coarse space  
						
						
						
						
					 
					
						2021-01-14 20:49:13 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						579595f547 
					 
					
						
						
							
							Red black on coarse space  
						
						
						
						
					 
					
						2021-01-14 20:48:35 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						281ac5fc12 
					 
					
						
						
							
							Red black support on coarse  
						
						
						
						
					 
					
						2021-01-14 20:48:08 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d8fa903b02 
					 
					
						
						
							
							G5 on coarse spaces  
						
						
						
						
					 
					
						2021-01-14 20:47:28 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						eaff0f3aeb 
					 
					
						
						
							
							Gamma5 on coarse spaces  
						
						
						
						
					 
					
						2021-01-14 20:46:58 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						e8e20c01b2 
					 
					
						
						
							
							Coarsened vector test  
						
						
						
						
					 
					
						2021-01-14 20:46:21 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						a4afc3ea2a 
					 
					
						
						
							
							Red black coarse space  
						
						
						
						
					 
					
						2021-01-14 20:44:16 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						1b84f59273 
					 
					
						
						
							
							Added a mixed precision multishift algorithm for which the matrix multiplies are performed in single precision but the search directions are accumulated in double precision.  
						
						
						
						
						A reliable update step is performed at a tunable frequency to correct the residual. A final mixed-prec single-shift solve is performed on each pole to perform cleanup if necessary.
A test is provided to demonstrate the algorithm. 
						
						
					 
					
						2021-01-06 12:24:44 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						1fb41a4300 
					 
					
						
						
							
							Added copyLane function to Tensor_extract_merge.h which copies one lane of data from an input tensor object to a different lane of an output tensor object of potentially different precision  
						
						
						
						
						precisionChange lattice function now uses copyLane to remove need for temporary scalar objects, reducing register footprint and significantly improving performance 
						
						
					 
					
						2021-01-06 11:50:56 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						287bac946f 
					 
					
						
						
							
							ConjugateGradientMixedPrec now stores final true residual and uses the precisionChange workspaces for improved efficiency  
						
						
						
						
					 
					
						2021-01-06 09:50:41 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						80c14be65e 
					 
					
						
						
							
							Added core test to check precision change  
						
						
						
						
					 
					
						2021-01-06 09:34:44 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						d7a2a4852d 
					 
					
						
						
							
							Reimplemented precisionChange to run on GPUs. A workspace containing the mapping table can be optionally precomputed and reused for improved performance.  
						
						
						
						
					 
					
						2021-01-06 09:30:49 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						d185f2eaa7 
					 
					
						
						
							
							OneFlavourEvenOddRatioRationalPseudoFermionAction now derives from GeneralEvenOddRatioRationalPseudoFermionAction, simply performs transcription of parameters  
						
						
						
						
					 
					
						2020-12-23 16:26:10 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						813d4cd900 
					 
					
						
						
							
							Added test program that ensures the generic checkerboarded RHMC (with parameters set appropriately) gives the same answer as the existing 1f code  
						
						
						
						
					 
					
						2020-12-23 16:01:42 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						75c6c6b173 
					 
					
						
						
							
							General RHMC pseudofermion action now allows for different rational approximations to be used in the MD and action evaluation  
						
						
						
						
					 
					
						2020-12-23 11:19:26 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						220ad5e3ee 
					 
					
						
						
							
							Added more verbose log output to GeneralEvenOddRatioRationalPseudoFermionAction  
						
						
						
						
						In GeneralEvenOddRatioRationalPseudoFermionAction, setting the bounds check frequency to 0 now disables the check 
						
						
					 
					
						2020-12-22 11:08:22 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						ba5dc670a5 
					 
					
						
						
							
							Reimplemented GparityWilsonImpl::InsertForce5D to run efficiently on GPUs  
						
						
						
						
						Swapped order of templated tensor code and c-number specializations in Tensor_outer.h to fix compile issue with type deduction on Summit 
						
						
					 
					
						2020-12-22 10:10:07 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						3fe75bc7cb 
					 
					
						
						
							
							Merge pull request  #329  from nmeyer-ur/feature/a64fx-3  
						
						
						
						
						Revised dslash/dwf kernels for A64FX 
						
						
					 
					
						2020-12-20 08:17:15 -05:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						45d49d8648 
					 
					
						
						
							
							clean up  
						
						
						
						
					 
					
						2020-12-19 03:35:18 +01:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						6013183361 
					 
					
						
						
							
							removed Asm impls  
						
						
						
						
					 
					
						2020-12-19 03:25:01 +01:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						4b882e8056 
					 
					
						
						
							
							fixed lost bracket  
						
						
						
						
					 
					
						2020-12-19 03:09:20 +01:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						3f9ae6e7e7 
					 
					
						
						
							
							Merge branch 'develop' into feature/a64fx-3  
						
						
						
						
					 
					
						2020-12-19 02:37:11 +01:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						909acd55cd 
					 
					
						
						
							
							vnum variant for prefetches  
						
						
						
						
					 
					
						2020-12-19 02:00:22 +01:00 
						 
				 
			
				
					
						
							
							
								Nils Meyer 
							
						 
					 
					
						
						
							
						
						4dd9e39e0d 
					 
					
						
						
							
							up to +36% performance gain for dslash/dwf on QPACE 4 using GCC 10.1.1  
						
						
						
						
					 
					
						2020-12-19 00:54:31 +01:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						a0ca362690 
					 
					
						
						
							
							Added an RHMC pseudofermion action, GeneralEvenOddRatioRationalPseudoFermionAction, that works for an arbitrary fractional power, not just a square root  
						
						
						
						
						Added a test evolution for the above, Test_rhmc_EOWilsonRatioPowQuarter, demonstrating conservation of Hamiltonian
Fixed HMC ignoring the MetropolisTest parameter of HMCparameters 
						
						
					 
					
						2020-12-17 16:21:58 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						249b6e61ec 
					 
					
						
						
							
							For G-parity BCs the Nd-1 direction is now assumed to be the time direction and setting a twist in this direction will apply antiperiodic BCs  
						
						
						
						
						Added option to run Test_gparity with antiperiodic time BCs 
						
						
					 
					
						2020-12-17 14:09:00 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						7adb253e25 
					 
					
						
						
							
							Merge pull request  #328  from mmphys/feature/mrespatch  
						
						
						
						
						Enable existing conserved current code for CUDA 
						
						
					 
					
						2020-12-17 11:10:29 -05:00 
						 
				 
			
				
					
						
							
							
								Michael Marshall 
							
						 
					 
					
						
						
							
						
						873519e960 
					 
					
						
						
							
							Enable existing conserved current code for CUDA (compiles OK for CUDA 10.1). Add option to Test_cayley_mres to load a configuration  
						
						
						
						
					 
					
						2020-12-14 16:06:10 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						9aec4a3c26 
					 
					
						
						
							
							SYCL  
						
						
						
						
					 
					
						2020-12-10 02:11:17 -08:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						70510d151b 
					 
					
						
						
							
							Merge pull request  #327  from paboyle/feature/gparity_twist_GPU  
						
						
						
						
						Feature/gparity twist gpu 
						
						
					 
					
						2020-12-07 12:02:20 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						9e7bacb5a4 
					 
					
						
						
							
							Merge branch 'develop' into feature/gparity_twist_GPU  
						
						
						
						
					 
					
						2020-12-07 11:55:39 -05:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						2ef1fa66a8 
					 
					
						
						
							
							Improved performance of G-parity kernel for GPUs by simplifying multLink implementation  
						
						
						
						
					 
					
						2020-12-07 11:53:35 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						cf76741ec6 
					 
					
						
						
							
							Intel DPCPP Gold happy now (compiles all, runs Benchmark_dwf_fp32 )  
						
						
						
						
					 
					
						2020-12-03 03:47:11 -08:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						497e7c1c40 
					 
					
						
						
							
							Duplicate code  
						
						
						
						
					 
					
						2020-12-02 17:55:30 -08:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						888eacd3b8 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/paboyle/Grid  into develop  
						
						
						
						
					 
					
						2020-11-24 21:46:33 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						321f0f51b5 
					 
					
						
						
							
							Project to SU(N)  
						
						
						
						
					 
					
						2020-11-24 21:46:10 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						30ad9578a2 
					 
					
						
						
							
							Merge branch 'lehner-feature/gpt' into develop  
						
						
						
						
					 
					
						2020-11-24 06:10:24 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						9dce101586 
					 
					
						
						
							
							Merge branch 'feature/gpt' of  https://github.com/lehner/Grid  into lehner-feature/gpt  
						
						
						
						
					 
					
						2020-11-24 06:10:16 -05:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						97e264d0ff 
					 
					
						
						
							
							Christoph's changes  
						
						
						
						
					 
					
						2020-11-23 15:46:11 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						683a5e5bf5 
					 
					
						
						
							
							Stencil: use host vector for integer table on enable-shared=no and mirror it on device  
						
						
						
						
					 
					
						2020-11-23 15:39:51 +00:00