Peter Boyle 
							
						 
					 
					
						
						
							
						
						82f71643a4 
					 
					
						
						
							
							Remove the norm in MdagM  
						
						
						
						
					 
					
						2020-05-12 17:55:53 -04:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						d15ccad8a7 
					 
					
						
						
							
							switched to vec* in Reduce  
						
						
						
						
					 
					
						2020-05-12 20:41:14 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						0009b5cee8 
					 
					
						
						
							
							updated SVE_README  
						
						
						
						
					 
					
						2020-05-12 19:02:33 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						20d1941a45 
					 
					
						
						
							
							enabled asm kernels for fixed-size A64FXFIXEDSIZE  
						
						
						
						
					 
					
						2020-05-12 19:01:12 +02:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						d24d8e8398 
					 
					
						
						
							
							Use X-direction, as more bits are meaningful on CUDA.  
						
						
						
						
						2^31-1 should always be enough for the SIMD- and thread-reduced local volume,
e.g. 32*2^31 = 2^36 = (2^9)^4, so 512^4 is big enough.
Here 32 is gpu_threads * Nsimd = 8*4. 
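The arithmetic in this message can be checked at compile time. The following is only a sketch: `addressable_sites` and `pow4` are hypothetical helpers written for this illustration, not Grid functions, and the estimate rounds the 2^31-1 limit up to 2^31 as the message does.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Grid): number of lattice sites covered when
// an index of up to ~2^31 is multiplied by the threads and SIMD lanes per index.
constexpr std::uint64_t addressable_sites(std::uint64_t gpu_threads,
                                          std::uint64_t nsimd) {
  return gpu_threads * nsimd * (std::uint64_t{1} << 31);
}

// Volume of an L^4 lattice.
constexpr std::uint64_t pow4(std::uint64_t l) { return l * l * l * l; }

// gpu_threads * Nsimd = 8 * 4 = 32 = 2^5, so 2^5 * 2^31 = 2^36.
static_assert(addressable_sites(8, 4) == (std::uint64_t{1} << 36),
              "32 * 2^31 = 2^36");
// 2^36 = (2^9)^4 = 512^4.
static_assert(addressable_sites(8, 4) == pow4(512), "2^36 = 512^4");
```

If either identity were wrong, the translation unit would fail to compile.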
						
						
					 
					
						2020-05-12 10:35:49 -04:00 
						 
				 
			
				
					
						
							
							
								Christoph Lehner 
							
						 
					 
					
						
						
							
						
						162e4bb567 
					 
					
						
						
							
							no automatic prefetching for now  
						
						
						
						
					 
					
						2020-05-12 07:01:23 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						07c0c02f8c 
					 
					
						
						
							
							Speed up Cshift  
						
						
						
						
					 
					
						2020-05-11 17:02:01 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						8c31c065b5 
					 
					
						
						
							
							Keep the Vector fixed to protect it from realloc  
						
						
						
						
					 
					
						2020-05-11 17:00:30 -04:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						b7c76ede29 
					 
					
						
						
							
							Removed some assertions in Test_simd and removed exit() in Reduce  
						
						
						
						
					 
					
						2020-05-11 22:43:00 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						05edf803bd 
					 
					
						
						
							
							corrected typo  
						
						
						
						
					 
					
						2020-05-12 03:59:59 +09:00 
						 
				 
			
				
					
						
							
							
								Christoph Lehner 
							
						 
					 
					
						
						
							
						
						b1c86900b2 
					 
					
						
						
							
							Merge pull request  #4  from paboyle/develop  
						
						
						
						
						merge 
						
						
					 
					
						2020-05-11 20:59:29 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						78b8e40f83 
					 
					
						
						
							
							switched to gcc's internal data types  
						
						
						
						
					 
					
						2020-05-11 18:11:23 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						fc2e9850d3 
					 
					
						
						
							
							temporarily enable TOFU by default when using A64FX or A64FXFIXEDSIZE  
						
						
						
						
					 
					
						2020-05-11 13:25:02 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						ffaaed679e 
					 
					
						
						
							
							MPI_THREAD_SINGLE hack for Fugaku, enabled by -DTOFU  
						
						
						
						
					 
					
						2020-05-11 13:21:39 +02:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						bbbee5660d 
					 
					
						
						
							
							First compile on HIP  
						
						
						
						
					 
					
						2020-05-10 05:28:09 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						ea08f193e7 
					 
					
						
						
							
							Allocator cache split into large/small pools  
						
						
						
						
					 
					
						2020-05-10 05:24:26 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2bb2c68e15 
					 
					
						
						
							
							Separate pools for small and large allocations cache  
						
						
						
						
					 
					
						2020-05-09 22:57:21 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						efe5bc6a3c 
					 
					
						
						
							
							Split allocator cache into two pools of different sizes  
						
						
						
						
					 
					
						2020-05-09 22:27:56 -04:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						b2fd8b993a 
					 
					
						
						
							
							fixed-size clean up  
						
						
						
						
					 
					
						2020-05-09 22:53:42 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						291ee8c3d0 
					 
					
						
						
							
							updated fixed-size implementation; only Exch1 and prefetches missing  
						
						
						
						
					 
					
						2020-05-09 22:18:02 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						e1a5b3ea49 
					 
					
						
						
							
							unions for tables eliminate explicit loads, gcc does not complain  
						
						
						
						
					 
					
						2020-05-09 21:21:57 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						55a55660cb 
					 
					
						
						
							
							reverted changes  
						
						
						
						
					 
					
						2020-05-09 12:48:42 +02:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						384da487bd 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/paboyle/Grid  into develop  
						
						
						
						
					 
					
						2020-05-08 18:55:11 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						ee1de82a53 
					 
					
						
						
							
							Working ITT benchmark again  
						
						
						
						
					 
					
						2020-05-08 18:54:50 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						2b576fc185 
					 
					
						
						
							
							Remove commented dead code  
						
						
						
						
					 
					
						2020-05-08 18:54:29 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						52081acfa5 
					 
					
						
						
							
							NVCC compile fixes  
						
						
						
						
					 
					
						2020-05-08 13:14:12 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						b01b7f761a 
					 
					
						
						
							
							Merge pull request  #283  from DanielRichtmann/feature/minor-fixes  
						
						
						
						
						Some small fixes 
						
						
					 
					
						2020-05-08 10:52:03 -04:00 
						 
				 
			
				
					
						
							
							
								Daniel Richtmann 
							
						 
					 
					
						
						
							
						
						c83471bfd0 
					 
					
						
						
							
							Fix missing checkerboards for adj and conjugate  
						
						
						
						
					 
					
						2020-05-08 16:44:03 +02:00 
						 
				 
			
				
					
						
							
							
								Daniel Richtmann 
							
						 
					 
					
						
						
							
						
						ab0c5d77fb 
					 
					
						
						
							
							Correct NonHermitianSchurOperatorBase  
						
						
						
						
					 
					
						2020-05-08 16:44:02 +02:00 
						 
				 
			
				
					
						
							
							
								Daniel Richtmann 
							
						 
					 
					
						
						
							
						
						779e3c7442 
					 
					
						
						
							
							Const-correctness for retrieval routines of GridStopWatch  
						
						
						
						
					 
					
						2020-05-08 16:43:52 +02:00 
						 
				 
			
				
					
						
							
							
								Daniel Richtmann 
							
						 
					 
					
						
						
							
						
						0c570824f2 
					 
					
						
						
							
							Add missing declaration of GridCmdOptionInt  
						
						
						
						
					 
					
						2020-05-08 16:43:51 +02:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						f8b8e00090 
					 
					
						
						
							
							Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc  
						
						
						
						
						Aim to reduce the amount of CUDA and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc.
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows. 
						
						
					 
					
						2020-05-08 06:23:55 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						0dd1bdfa94 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/paboyle/Grid  into develop  
						
						
						
						
					 
					
						2020-05-08 09:21:43 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						1d65e2f62c 
					 
					
						
						
							
							Slightly faster Chebyshev; ifdef'ed out the fastest until tested numerics  
						
						
						
						
						Lifted from HDCR setup 
						
						
					 
					
						2020-05-08 09:20:54 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						93920c4811 
					 
					
						
						
							
							Remove verbose  
						
						
						
						
					 
					
						2020-05-08 09:19:54 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6859a3e1d4 
					 
					
						
						
							
							Schur operator  
						
						
						
						
					 
					
						2020-05-08 09:19:12 -04:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						21ca182c36 
					 
					
						
						
							
							Remove comments  
						
						
						
						
					 
					
						2020-05-08 09:18:24 -04:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						ceb8b374da 
					 
					
						
						
							
							API change v3  
						
						
						
						
					 
					
						2020-05-08 15:04:44 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						4bc2ad2894 
					 
					
						
						
							
							API change v2  
						
						
						
						
					 
					
						2020-05-08 15:00:25 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						798af3e68f 
					 
					
						
						
							
							retry changing StoD API  
						
						
						
						
					 
					
						2020-05-08 14:34:59 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						b0ef2367f3 
					 
					
						
						
							
							testing alternate call to PrecisionChange  
						
						
						
						
					 
					
						2020-05-08 14:22:44 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						71a7350a85 
					 
					
						
						
							
							changed 2nd argument in Reduce to native vector type  
						
						
						
						
					 
					
						2020-05-08 12:26:51 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						6f79369955 
					 
					
						
						
							
							trying to get rid of macro definition error  
						
						
						
						
					 
					
						2020-05-08 12:19:24 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						f9cb6b979f 
					 
					
						
						
							
							corrected more typos  
						
						
						
						
					 
					
						2020-05-08 12:11:01 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						ed4d9d17f8 
					 
					
						
						
							
							corrected type  
						
						
						
						
					 
					
						2020-05-08 12:09:22 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						fbed02690d 
					 
					
						
						
							
							some changes in breaking out A64FX: use -DA64FXFIXEDSIZE for fixed size, but also define GEN  
						
						
						
						
					 
					
						2020-05-08 12:05:31 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						39f3ae5b1d 
					 
					
						
						
							
							corrected more types  
						
						
						
						
					 
					
						2020-05-08 11:07:14 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						e64bec8c8e 
					 
					
						
						
							
							pulled SVE typedefs out of Optimization  
						
						
						
						
					 
					
						2020-05-08 11:04:21 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						0893b4e552 
					 
					
						
						
							
							fixed typos in PrecisionChange  
						
						
						
						
					 
					
						2020-05-08 10:59:07 +02:00 
						 
				 
			
				
					
						
							
							
								nmeyer-ur 
							
						 
					 
					
						
						
							
						
						92f0f29670 
					 
					
						
						
							
							fixed double overloading vecf in Div, corrected typos  
						
						
						
						
					 
					
						2020-05-08 10:57:23 +02:00