| 
							
							
								 Peter Boyle | f39cf69c33 | Accelerator loop change | 2019-06-15 07:58:23 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 8e27338df2 | Rationalise number of loop macros | 2019-06-15 07:57:40 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | bcbb5e9d26 | Remove assembly tests | 2019-06-15 07:57:05 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 0ea7f5279d | Accelerator loop changes | 2019-06-15 07:56:14 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 18e5de426d | There is a stray use of predicatedWhere introduced by Andrew Lawson in the conserved currents. The conserved currents need to be rewritten using data-parallel operations. | 2019-06-15 07:53:58 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | e896d81235 | Accelerator loop redefine. Coalesce most accesses, but ET engine still to go clean. | 2019-06-15 07:52:44 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 7b8ccff4f4 | Accelerated coalesced loops in most cases | 2019-06-15 07:48:00 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 68541606ab | Thread loop changes. Soon try these with accelerator loops and benchmark | 2019-06-15 07:46:42 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 339ea10cc7 | First touch only on CPU code | 2019-06-15 07:45:43 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | d0d8dc8042 | Thread loop changes | 2019-06-15 07:45:09 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 81eb1fd9f2 | Accelerator loop changes for coalesced access | 2019-06-15 07:44:47 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | cb93d32cd9 | Thread loop changes | 2019-06-15 07:44:08 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 8f223962ff | Thread loop changed | 2019-06-15 07:43:42 +01:00 |  | 
			
				
					| 
							
							
								 Michael Marshall | 9a8a63467e | BC2 now runs. setup() runs twice, which had resulted in doubling up of momenta. Also fixed initialisation of momentum phases. | 2019-06-12 15:25:59 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 36f06555a2 | Simplify Impl | 2019-06-09 22:26:27 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | d6c0e0756d | Remove GPU version | 2019-06-09 11:23:42 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 3e41b1055c | Remove Gpu only kernels. | 2019-06-09 11:20:01 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 9fbcfe612c | Update TODO list | 2019-06-09 11:19:38 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | e78a5e7838 | ASM instantiation without link errors | 2019-06-09 01:25:21 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | da8d87e9da | Cuda switch off | 2019-06-08 17:11:38 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 8e3a05d89b | Moving the instantiation into a cleaner structure | 2019-06-08 13:48:33 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 8adc5da7dd | Testing out approaches to kernel writing, introducing SIMT_loop temporarily | 2019-06-08 13:47:04 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 29a244e423 | Test of using a lane variable instead of repeated reference to threadIdx.y | 2019-06-08 13:46:26 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 18cbfecf02 | Use symlinks in find command | 2019-06-08 13:45:46 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | c933ac2248 | Temporarily introduce a SIMT_loop to test out approaches prior to making a global change to accelerator_loop | 2019-06-08 13:44:27 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | ad2c433574 | Instantiations move. Tried using Gianluca's suggestion about avoiding threadIdx but doesn't seem to make a difference. Will revisit this and probably remove the lane parameter from the coalescedRead | 2019-06-08 13:43:12 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 86e7fb6e86 | Instantiation relocation | 2019-06-08 13:42:46 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | fb91dda7be | Hand instantiation moved location | 2019-06-08 13:42:26 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 82cf7bc5ab | Move instantiation into fermion/instantiation | 2019-06-08 13:41:46 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | e452cc0a22 | Move static variables into instantiation .cc file | 2019-06-08 13:41:20 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 4d2b938166 | Remove explicit instantiation from here | 2019-06-08 13:41:01 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 10d16ab76c | Remove explicit instantiation from here | 2019-06-08 13:40:32 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 1f997fa484 | Instantiate via explicit .cc files for parallel make. | 2019-06-08 13:39:51 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | dc5024e88c | The GPU reduction was not working for me and was causing errors. Need to revisit. Gianluca is working on deterministic reduction. | 2019-06-08 13:39:11 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 6d77941990 | Drop the 5D vec actions | 2019-06-08 13:38:05 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 0ee6e77cbc | Compiles GPU and CPU, still gives good performance on CPU | 2019-06-05 13:28:16 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 18d3cde29a | Compile on GPU works | 2019-06-05 00:14:58 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 7323099966 | Instantiation fix | 2019-06-05 00:14:38 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 6379651cdd | Generic or GPU ready for benchmark test on GPU | 2019-06-05 00:13:52 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | ba4fd756b9 | Fix signature, but deprecating this loop style | 2019-06-05 00:12:36 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | d185fc1ebf | clean up instantiation | 2019-06-05 00:11:52 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 96b36d8367 | Instantiation clean up | 2019-06-05 00:11:27 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 899f8b5065 | Instantiation clean up 5d vec removal | 2019-06-05 00:11:05 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | c8d0483fe9 | Remove 5d vectorisation | 2019-06-05 00:10:37 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 0f214e5f76 | Clean up instantiation | 2019-06-05 00:10:13 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 8eea568426 | GPU loop; presently differentiated with ifdef, need to find a way to unify. | 2019-06-05 00:09:28 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 9636324069 | GPU happy code | 2019-06-05 00:08:54 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 8a5489d9e6 | Move the loop into a central kernel call. | 2019-06-05 00:08:13 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | 8113845f9c | coalesce loop. Need to rationalise this file | 2019-06-04 23:49:29 +01:00 |  | 
			
				
					| 
							
							
								 Peter Boyle | b47f73c222 | GPU happy | 2019-06-04 21:30:39 +01:00 |  |