| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Peter Boyle | 269e00509e | Don't instantiate in header | 2019-06-03 14:51:24 +01:00 |
| Peter Boyle | a5e90b0ddc | Making the kernels more GPU happy | 2019-06-03 14:50:54 +01:00 |
| Peter Boyle | 5622faf226 | pragma once ifdef guard | 2019-06-03 14:50:26 +01:00 |
| Peter Boyle | 82ecd520c7 | Macos happy fix under nvcc | 2019-06-03 14:48:50 +01:00 |
| Peter Boyle | ffde81f22a | Nsimd() and coalesced support | 2019-05-25 12:44:07 +01:00 |
| Peter Boyle | d8098f1ecd | coalesced support | 2019-05-25 12:43:31 +01:00 |
| Peter Boyle | aca788cf4f | Move coalesced read into tensors | 2019-05-25 12:43:00 +01:00 |
| Peter Boyle | a0e9f3b0a0 | Plan for GPU port | 2019-05-20 09:46:19 +01:00 |
| Peter Boyle | a9342c6ae5 | Udpdate TODO afer gianluc marge | 2019-05-18 22:58:25 +01:00 |
| Peter Boyle | ee6f96d85c | Merge pull request #210 from grid-test-organisation/feature/gpu-port-develop Cayley fermion functions for GPUs | 2019-05-18 19:06:20 +01:00 |
| Peter Boyle | 4e9df9e93c | GPU patches | 2019-05-18 17:43:11 +01:00 |
| Peter Boyle | 9fe68857a9 | Runs multiGPU with coalesced access on tesseract | 2019-05-18 17:42:41 +01:00 |
| Peter Boyle | 37336c9e0c | Allow compress to be either vector or scalar types | 2019-05-18 17:41:13 +01:00 |
| Peter Boyle | 6c4da3bbc7 | Stencil now runs with coalesced accesses | 2019-05-18 17:40:35 +01:00 |
| Peter Boyle | a584b16c4a | Adding a non-blocking kernel launch | 2019-05-18 17:39:54 +01:00 |
| gfilaci | 1a82533d22 | fix inner product with thrust reduction | 2019-05-14 15:35:54 +01:00 |
| gfilaci | e3c56fd9b3 | CayleyZeroCounters before benchmark loop | 2019-05-13 15:52:00 +01:00 |
| gfilaci | 955cc7790f | MooeeInvDag offloaded to GPU | 2019-05-13 14:25:29 +01:00 |
| gfilaci | 1179123ac2 | MooeeInv offloaded to GPU | 2019-05-13 12:37:12 +01:00 |
| gfilaci | 22e35c9ddd | M5Ddag offloaded to GPU | 2019-05-10 12:23:39 +01:00 |
| gfilaci | 698b45e163 | remove unused typedef | 2019-05-09 11:19:39 +01:00 |
| gfilaci | f1744b3f01 | M5D offloaded to GPU | 2019-05-09 11:17:55 +01:00 |
| gfilaci | 2b3c22f03d | bandwidth dependent on grid default precision | 2019-05-08 12:01:11 +01:00 |
| gfilaci | 8423a05940 | duplicate CayleyFermion5D for gpu | 2019-05-08 11:51:37 +01:00 |
| gfilaci | d9438627d9 | M5D benchmark without vector copy overhead | 2019-05-02 11:10:57 +01:00 |
| gfilaci | b23305dbe2 | fix M5D flop count | 2019-05-02 11:08:21 +01:00 |
| gfilaci | d3b5c02e2d | measure M5D bandwidth and fix M5D flop count | 2019-05-02 11:02:39 +01:00 |
| gfilaci | 8b6541fb60 | Fix gpu MultRealPart and MaddRealPart bug | 2019-05-02 10:58:17 +01:00 |
| gfilaci | 6da9aa9971 | replace std::vector with Vector in benchmark | 2019-05-02 10:56:22 +01:00 |
| gfilaci | 44e0360b97 | replace std::vector with Vector | 2019-05-02 10:55:36 +01:00 |
| gfilaci | 9003c4a07c | allocator copy constructor (to be fixed) | 2019-05-02 10:53:37 +01:00 |
| gfilaci | b52fa38f8c | seed initialisation of RNG5 | 2019-05-02 10:36:09 +01:00 |
| gfilaci | 3f1c4d8789 | fix comment hash | 2019-05-02 10:24:36 +01:00 |
| Peter Boyle | 60330e05a3 | NVCC wacky compiler options frozen. Possibly Cuda 9.2 specific | 2019-04-28 07:39:33 +01:00 |
| Peter Boyle | f9b8c0cccf | Vector changes for UVM | 2019-04-28 07:38:57 +01:00 |
| Peter Boyle | 3cad67e569 | Compile on tesseract | 2019-04-28 07:38:09 +01:00 |
| Peter Boyle | 170ba4e619 | Ensure different MPI ranks use different GPUs. The mapping works on Tesseract. | 2019-04-28 07:32:30 +01:00 |
| Peter Boyle | 204a090497 | Inner product is not working on GPU. Why? | 2019-04-28 07:31:56 +01:00 |
| Peter Boyle | 3c717c47ef | GPU no compile on Wilson Multigrid fixed | 2019-04-28 07:31:19 +01:00 |
| Peter Boyle | c5e081d69c | Re-Merge branch 'develop' into feature/gpu-port Pull in Regensburg MultiGrid pull request | 2019-01-03 01:50:16 +00:00 |
| Peter Boyle | 535a6aaf05 | Update todo list | 2019-01-02 22:07:51 +00:00 |
| Peter Boyle | 91a7fe247b | Merge branch 'DanielRichtmann-feature/wilsonmg' into develop | 2019-01-02 14:40:31 +00:00 |
| Peter Boyle | 8a1be021d3 | Merge branch 'feature/wilsonmg' of https://github.com/DanielRichtmann/Grid into DanielRichtmann-feature/wilsonmg | 2019-01-02 14:39:59 +00:00 |
| Peter Boyle | e73b909a48 | Make tests running past nvcc. Different NVCC versions proving tricky to keep happy. This is 9.2 | 2019-01-02 12:05:30 +00:00 |
| Peter Boyle | a4d9200293 | Fixing AVX 512 instantiation error. Need to move to extern templates urgently. | 2019-01-02 00:27:07 +00:00 |
| Peter Boyle | 350508bdb3 | pugixml problem | 2019-01-01 16:38:54 +00:00 |
| Peter Boyle | 38852737e4 | No compile fix on clang | 2019-01-01 15:55:13 +00:00 |
| Peter Boyle | 802404c78c | Remove warnings under NVCC and move parallel_for to thread-loop | 2019-01-01 15:08:09 +00:00 |
| Peter Boyle | 0e9b591c1c | NVCC warning suppression | 2019-01-01 15:07:47 +00:00 |
| Peter Boyle | c43a2b599a | GPU support | 2019-01-01 15:07:29 +00:00 |