| 
							
							
								 paboyle | 32375aca65 | Semaphore sleep/wake up on remote processes. | 2016-11-02 09:27:20 +00:00 |  | 
			
				
					| 
							
							
								 paboyle | bb94ddd0eb | Tidy up of mpi3; also some cleaning of the dslash controls. | 2016-11-02 08:07:09 +00:00 |  | 
			
				
					| 
							
							
								 paboyle | 791cb050c8 | Comms improvements | 2016-11-01 11:35:43 +00:00 |  | 
			
				
					| 
							
							
								 paboyle | b820076b91 | Merge branch 'develop' into feature/mpi3 | 2016-10-25 06:02:33 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 09f66100d3 | MPI 3 compile on non-linux | 2016-10-25 06:01:12 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | d7d92af09d | Travis fail fix attempt | 2016-10-25 01:45:53 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 460d0753a1 | Merge branch 'develop' into feature/mpi3. Conflicts: lib/simd/Grid_avx512.h | 2016-10-25 01:08:51 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 8f8058f8a5 | More random bits on parallel seeding | 2016-10-25 01:05:52 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | d97a27f483 | Verbose | 2016-10-25 01:05:31 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 7c3363b91e | Compiles all comms targets | 2016-10-25 00:04:17 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | b94478fa51 | mpi, mpi3, shmem all compile. mpi, mpi3 pass single node multi-rank | 2016-10-24 23:45:31 +01:00 |  | 
			
				
					|  | 13bf0482e3 | FFT optimisation | 2016-10-24 19:25:40 +01:00 |  | 
			
				
					|  | a795b5705e | memory optimisation | 2016-10-24 19:25:15 +01:00 |  | 
			
				
					|  | 392e064513 | fast local peek-poke | 2016-10-24 19:24:21 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | b6a65059a2 | Update to use shared memory to contain the stencil comms buffers. Tested on 2.1.1.1, 1.2.1.1, 4.1.1.1, 1.4.1.1, 2.2.1.1 subnode decompositions | 2016-10-24 17:30:43 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | ea25a4d9ac | Works | 2016-10-23 06:10:05 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | c190221fd3 | Internal SHM comms in non-simd directions working. Need to fix simd directions | 2016-10-22 18:14:27 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 0fcd2e7188 | Simplify the comms structure prior to implementing shared memory direct bounces | 2016-10-21 22:44:10 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 910b8dd6a1 | use simd type | 2016-10-21 22:35:29 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 75ebd3a0d1 | Typo fixes and rotate for CLANG | 2016-10-21 22:34:29 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 09fd5c43a7 | Reasonably fast version | 2016-10-21 15:17:39 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | f22317748f | Merge branch 'feature/mpi3' of https://github.com/paboyle/Grid into feature/mpi3 | 2016-10-21 13:36:35 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 6a9eae6b6b | Reporting improvements | 2016-10-21 13:36:18 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | fad96cf250 | StencilBufs | 2016-10-21 13:36:00 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | f331809c27 | Use variable type for loop | 2016-10-21 13:35:37 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 2c54a53d0a | Compile verbose reduce | 2016-10-21 12:12:14 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 306160ad9a | bcopy threaded | 2016-10-21 12:07:28 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 20a091c3ed | Intel vs. Clang intrinsics differences absorbed | 2016-10-21 09:08:36 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 202078eb1b | Cray / OpenSHMEM ordering differs | 2016-10-21 09:07:20 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | a762b1fb71 | MPI3 working with a bounce through shared memory on my laptop. Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the send between ranks on same node. | 2016-10-21 09:03:26 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 5b5925b8e5 | Forgot to add | 2016-10-20 17:09:40 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | b58adc6a4b | commVector | 2016-10-20 17:00:15 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | f9d5e95d72 | allocator template typedefs moved to AlignedAllocator | 2016-10-20 16:59:39 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 4f8e636a43 | commVector | 2016-10-20 16:59:16 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 9b39f35ae6 | commVector different for SHMEM compat | 2016-10-20 16:58:53 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 5fe2b85cbd | MPI3 and shared memory support | 2016-10-20 16:58:01 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | c7cccaaa69 | Comm vector for shmem | 2016-10-20 16:57:31 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | cbcfea466f | MPI3 | 2016-10-20 16:57:14 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 4955672fc3 | MPI3 | 2016-10-20 16:57:00 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 8c043da5b7 | SHMEM and comms allocator made different | 2016-10-20 16:56:05 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 3cbe974eb4 | Layout | 2016-10-20 16:55:21 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 7af9b87318 | Cache face tables to improve performance. Extract merge now looking poor. | 2016-10-18 09:51:37 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | 811ca45473 | GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support | 2016-10-17 16:23:21 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | bc1a4d40ba | Faster integer handling; avoid push_back | 2016-10-17 16:16:44 +01:00 |  | 
			
				
					| 
							
							
								 paboyle | c8079e6621 | Time the face gather in x-dir more carefully | 2016-10-13 22:28:50 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 8b0d171c9a | 32bit issue on the KNL code variant where byte offsets were stored | 2016-10-12 17:49:32 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 8bbd9ebc27 | Reversing changes to Stencil class | 2016-10-12 13:47:20 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 6472b431f0 | __rdpmc needed for gcc, clang++ | 2016-10-12 12:29:08 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | bd205a3293 | Fixing for non x86 and non KNL | 2016-10-12 12:09:15 +01:00 |  | 
			
				
					| 
							
							
								 azusayamaguchi | 496beffa88 | Fix non-KNL build | 2016-10-12 12:06:08 +01:00 |  |