paboyle 
							
						 
					 
					
						
						
							
						
						18bde08d1b 
					 
					
						
						
							
							Merge branch 'feature/staggering' into develop  
						
						
						
						
					 
					
						2017-03-28 15:25:55 +09:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						fc93f0b2ec 
					 
					
						
						
							
							Save some code for static huge tlb's. It is ifdef'ed out but an interesting root only experiment.  
						
						
						
						
						No gain from it. 
						
						
					 
					
						2017-03-21 22:30:29 -04:00 
						 
				 
			
				
					
						
							
							
								Christopher Kelly 
							
						 
					 
					
						
						
							
						
						06a132e3f9 
					 
					
						
						
							
							Fixes to SHMEM comms  
						
						
						
						
					 
					
						2017-02-28 13:31:54 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						4e7ab3166f 
					 
					
						
						
							
							Refactoring header layout  
						
						
						
						
					 
					
						2017-02-22 18:09:33 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						3ae92fa2e6 
					 
					
						
						
							
							Global changes to parallel_for structure.  
						
						
						
						
						Move the comms flags to more sensible names 
						
						
					 
					
						2017-02-21 05:24:27 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						37720c4db7 
					 
					
						
						
							
							Count bytes off node only  
						
						
						
						
					 
					
						2017-02-20 17:47:40 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						5c0adf7bf2 
					 
					
						
						
							
							Make clang happy with parentheses  
						
						
						
						
					 
					
						2017-02-16 23:51:33 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						bd600702cf 
					 
					
						
						
							
							Vectorise the XYZT face gathering better.  
						
						
						
						
						Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent with efficiency. 
						
						
					 
					
						2017-02-15 11:11:04 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a48ee6f0f2 
					 
					
						
						
							
							Don't use MPI3_leader any more. No real gain and complex  
						
						
						
						
					 
					
						2017-02-07 01:31:24 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						73547cca66 
					 
					
						
						
							
							MPI3 working, I think  
						
						
						
						
					 
					
						2017-02-07 01:30:02 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						123c673db7 
					 
					
						
						
							
							Policy to control async or sync SendRecv  
						
						
						
						
					 
					
						2017-02-07 01:24:54 -05:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						61f82216e2 
					 
					
						
						
							
							Communicator Policy, NodeCount distinct from Rank count  
						
						
						
						
					 
					
						2017-02-07 01:22:53 -05:00 
						 
				 
			
				
					
						
					 
					
						
						
							
						
						fad743fbb1 
					 
					
						
						
							
							Build system sanity check: corrected several headers not in the <Grid/*> format  
						
						
						
						
					 
					
						2017-01-26 17:00:41 -08:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						668ca57702 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering  
						
						
						
						
					 
					
						2016-11-22 13:49:11 +00:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						f85b35314d 
					 
					
						
						
							
							Fix a routine for single node processor coor from rank  
						
						
						
						
					 
					
						2016-11-08 11:49:13 +00:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						6e548a8ad5 
					 
					
						
						
							
							Linux compile needed  
						
						
						
						
					 
					
						2016-11-04 11:34:16 +00:00 
						 
				 
			
				
					
						
							
							
								Azusa Yamaguchi 
							
						 
					 
					
						
						
							
						
						ee686a7d85 
					 
					
						
						
							
							Compiles now  
						
						
						
						
					 
					
						2016-11-03 16:58:23 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						f41a230b32 
					 
					
						
						
							
							Decrease mpi3l verbose  
						
						
						
						
					 
					
						2016-11-02 19:54:03 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						757a928f9a 
					 
					
						
						
							
							Improvement to use own SHM_OPEN call to avoid openmpi bug.  
						
						
						
						
					 
					
						2016-11-02 12:37:46 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						32375aca65 
					 
					
						
						
							
							Semaphore sleep/wake up on remote processes.  
						
						
						
						
					 
					
						2016-11-02 09:27:20 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						bb94ddd0eb 
					 
					
						
						
							
							Tidy up of mpi3; also some cleaning of the dslash controls.  
						
						
						
						
					 
					
						2016-11-02 08:07:09 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						791cb050c8 
					 
					
						
						
							
							Comms improvements  
						
						
						
						
					 
					
						2016-11-01 11:35:43 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						09f66100d3 
					 
					
						
						
							
							MPI 3 compile on non-linux  
						
						
						
						
					 
					
						2016-10-25 06:01:12 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						d7d92af09d 
					 
					
						
						
							
							Travis fail fix attempt  
						
						
						
						
					 
					
						2016-10-25 01:45:53 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						d97a27f483 
					 
					
						
						
							
							Verbose  
						
						
						
						
					 
					
						2016-10-25 01:05:31 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						7c3363b91e 
					 
					
						
						
							
							Compiles all comms targets  
						
						
						
						
					 
					
						2016-10-25 00:04:17 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						b94478fa51 
					 
					
						
						
							
							mpi, mpi3, shmem all compile.  
						
						
						
						
						mpi, mpi3 pass single node multi-rank 
						
						
					 
					
						2016-10-24 23:45:31 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						b6a65059a2 
					 
					
						
						
							
							Update to use shared memory to contain the stencil comms buffers  
						
						
						
						
						Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions 
						
						
					 
					
						2016-10-24 17:30:43 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						c190221fd3 
					 
					
						
						
							
							Internal SHM comms in non-simd directions working  
						
						
						
						
						Need to fix simd directions 
						
						
					 
					
						2016-10-22 18:14:27 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						910b8dd6a1 
					 
					
						
						
							
							use simd type  
						
						
						
						
					 
					
						2016-10-21 22:35:29 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						09fd5c43a7 
					 
					
						
						
							
							Reasonably fast version  
						
						
						
						
					 
					
						2016-10-21 15:17:39 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						fad96cf250 
					 
					
						
						
							
							StencilBufs  
						
						
						
						
					 
					
						2016-10-21 13:36:00 +01:00 
						 
				 
			
				
					
						
							
							
								azusayamaguchi 
							
						 
					 
					
						
						
							
						
						f331809c27 
					 
					
						
						
							
							Use variable type for loop  
						
						
						
						
					 
					
						2016-10-21 13:35:37 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						306160ad9a 
					 
					
						
						
							
							bcopy threaded  
						
						
						
						
					 
					
						2016-10-21 12:07:28 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a762b1fb71 
					 
					
						
						
							
							MPI3 working with a bounce through shared memory on my laptop.  
						
						
						
						
						Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the send between ranks on same node. 
						
						
					 
					
						2016-10-21 09:03:26 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b58adc6a4b 
					 
					
						
						
							
							commVector  
						
						
						
						
					 
					
						2016-10-20 17:00:15 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						5fe2b85cbd 
					 
					
						
						
							
							MPI3 and shared memory support  
						
						
						
						
					 
					
						2016-10-20 16:58:01 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						32bc7a6ab8 
					 
					
						
						
							
							MPI back out of change that hangs  
						
						
						
						
						AVX2 for clang, gcc needs the -mfma flag. 
						
						
					 
					
						2016-08-05 10:36:00 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						62601bb649 
					 
					
						
						
							
							Bug fix  
						
						
						
						
					 
					
						2016-07-08 20:46:29 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						ef97e32152 
					 
					
						
						
							
							Adding persistent communicators  
						
						
						
						
					 
					
						2016-07-08 17:16:08 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						680645f849 
					 
					
						
						
							
							Merge branch 'release/v0.5.0'  
						
						
						
						
					 
					
						2016-06-30 15:15:03 -07:00 
						 
				 
			
				
					
						
							
							
								Guido Cossu 
							
						 
					 
					
						
						
							
						
						5e02392f9c 
					 
					
						
						
							
							Fixed compilation error for benchmark_dwf  
						
						
						
						
						Some parts were assuming floating point precision 
						
						
					 
					
						2016-06-20 12:30:51 +01:00 
						 
				 
			
				
					
						
							
							
								Richard Rollins 
							
						 
					 
					
						
						
							
						
						86187d7cca 
					 
					
						
						
							
							Removed write to stdout in constructor for MPI CartesianCommunicator  
						
						
						
						
					 
					
						2016-06-14 15:34:20 +01:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						d6b64f47d9 
					 
					
						
						
							
							Uint64 sum for IO rates  
						
						
						
						
					 
					
						2016-03-16 02:27:22 -07:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a359f7a9f5 
					 
					
						
						
							
							Merge branch 'master' of https://github.com/paboyle/Grid  
						
						
						
						
					 
					
						2016-03-11 16:07:07 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						b606deb3f0 
					 
					
						
						
							
							Uint64 gsum  
						
						
						
						
					 
					
						2016-03-11 16:06:54 -08:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						090e7aa930 
					 
					
						
						
							
							Merge remote-tracking branch 'origin/chulwoo-dec12-2015'  
						
						
						
						
						Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan. 
						
						
					 
					
						2016-03-08 09:55:14 +00:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						e55c35734b 
					 
					
						
						
							
							Fix a nocompile  
						
						
						
						
					 
					
						2016-03-03 20:33:28 +00:00 
						 
				 
			
				
					
						
							
							
								Peter Boyle 
							
						 
					 
					
						
						
							
						
						6aeaf6f568 
					 
					
						
						
							
							Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then turned up problems on the BlueWaters Cray.  
						
						Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel, and also to look at aggregating bigger writes for the parallel write.
Not sure what the home filesystem is. 
						
						
					 
					
						2016-02-21 08:03:21 -06:00 
						 
				 
			
				
					
						
							
							
								paboyle 
							
						 
					 
					
						
						
							
						
						a3fbabf404 
					 
					
						
						
							
							Bug fix  
						
						
						
						
					 
					
						2016-02-18 18:08:24 +00:00