mirror of https://github.com/paboyle/Grid.git synced 2025-06-24 02:32:02 +01:00

Commit Graph

  • 16e5d7945e Hard to make 5D vec work with GPU code Peter Boyle 2019-06-15 12:43:43 +01:00
  • decc99ca76 Accelerator version Peter Boyle 2019-06-15 12:43:00 +01:00
  • 464cd65931 Still to test this fully Peter Boyle 2019-06-15 12:35:14 +01:00
  • a1ec2f4723 Still to test this routine fully Peter Boyle 2019-06-15 12:33:55 +01:00
  • ea9662ec85 Thread loop changes Peter Boyle 2019-06-15 09:09:57 +01:00
  • 52c74f1cac Thread loop changes Peter Boyle 2019-06-15 09:08:16 +01:00
  • 9a13d2992c Clean up Peter Boyle 2019-06-15 09:05:16 +01:00
  • b0449ae270 Thread loop changes Peter Boyle 2019-06-15 09:04:19 +01:00
  • 1299225105 Accelerator loop changes Peter Boyle 2019-06-15 09:03:46 +01:00
  • 5925e7f405 Thread for changes Peter Boyle 2019-06-15 09:01:30 +01:00
  • be1fd4930f Template instantiation make happy changes Peter Boyle 2019-06-15 08:37:34 +01:00
  • 377fa5dec1 looping construct Peter Boyle 2019-06-15 08:36:48 +01:00
  • e8b78f596e Looping construct changes Peter Boyle 2019-06-15 08:35:57 +01:00
  • 09720c40cd Coalesced loops Peter Boyle 2019-06-15 08:35:26 +01:00
  • bb024dd114 Loop construct changed Peter Boyle 2019-06-15 08:30:05 +01:00
  • 52456b9ec7 New loop construct Peter Boyle 2019-06-15 08:28:45 +01:00
  • b285138be4 Better checking on types Peter Boyle 2019-06-15 08:27:48 +01:00
  • c7dbf4c87e Scalar support for GPU threads Peter Boyle 2019-06-15 08:25:43 +01:00
  • 1e889c93b8 Insert a GPU synchronise Peter Boyle 2019-06-15 08:23:26 +01:00
  • 7379047482 Threading and acceleration primitives further changes. accelerator_barrier() needed and used Peter Boyle 2019-06-15 08:22:48 +01:00
  • d836ce3b78 Clean up of acceleration and threading primitives Peter Boyle 2019-06-15 08:14:21 +01:00
  • cefaacbc07 Changing accelerator loop. Still have work to do for multi-GPU code Peter Boyle 2019-06-15 08:10:24 +01:00
  • 0074ef7f69 thread loops Peter Boyle 2019-06-15 08:04:29 +01:00
  • 20359ca15f Coalesced loops. Peter Boyle 2019-06-15 08:03:57 +01:00
  • 736358b0cb Coalesced loops Peter Boyle 2019-06-15 08:03:13 +01:00
  • 6b692aa726 Thread loops Peter Boyle 2019-06-15 08:02:26 +01:00
  • 7f99e1cd3b Coalesced loops Peter Boyle 2019-06-15 08:01:39 +01:00
  • f3c89df948 Thread loop changes Peter Boyle 2019-06-15 08:00:37 +01:00
  • b7e6d111d7 Thread loop changes. Need to offload this file Peter Boyle 2019-06-15 07:59:10 +01:00
  • f39cf69c33 Accelerator loop change Peter Boyle 2019-06-15 07:58:23 +01:00
  • 8e27338df2 Rationalise number of loop macros Peter Boyle 2019-06-15 07:57:40 +01:00
  • bcbb5e9d26 Remove assembly tests Peter Boyle 2019-06-15 07:57:05 +01:00
  • 0ea7f5279d Accelerator loop changes Peter Boyle 2019-06-15 07:56:14 +01:00
  • 18e5de426d There is a stray use of predicatedWhere introduced by Andrew Lawson in the conserved currents. The conserved currents need to be rewritten using data parallel operations. Peter Boyle 2019-06-15 07:53:58 +01:00
  • e896d81235 Accelerator loop redefine. Coalesce most accesses, but ET engine still to go clean. Peter Boyle 2019-06-15 07:52:44 +01:00
  • 7b8ccff4f4 Accelerated coalesced loops in most cases Peter Boyle 2019-06-15 07:48:00 +01:00
  • 68541606ab Thread loop changes. Soon try these with accelerator loops and benchmark Peter Boyle 2019-06-15 07:46:42 +01:00
  • 339ea10cc7 First touch only on CPU code Peter Boyle 2019-06-15 07:45:43 +01:00
  • d0d8dc8042 Thread loop changes Peter Boyle 2019-06-15 07:45:09 +01:00
  • 81eb1fd9f2 Accelerator loop changes for coalesced access Peter Boyle 2019-06-15 07:44:47 +01:00
  • cb93d32cd9 Thread loop changes Peter Boyle 2019-06-15 07:44:08 +01:00
  • 8f223962ff Thread loop changed Peter Boyle 2019-06-15 07:43:42 +01:00
  • 9a8a63467e BC2 now runs. setup() runs twice, which had resulted in doubling up of momenta. Also fixed initialisation of momentum phases. Michael Marshall 2019-06-12 15:25:59 +01:00
  • 36f06555a2 Simplify Impl Peter Boyle 2019-06-09 22:26:27 +01:00
  • d6c0e0756d Remove GPU version Peter Boyle 2019-06-09 11:23:42 +01:00
  • 3e41b1055c Remove Gpu only kernels. Peter Boyle 2019-06-09 11:20:01 +01:00
  • 9fbcfe612c Update TODO list Peter Boyle 2019-06-09 11:19:38 +01:00
  • e78a5e7838 ASM instantiation without link errors Peter Boyle 2019-06-09 01:25:21 +01:00
  • da8d87e9da Cuda switch off Peter Boyle 2019-06-08 17:11:38 +01:00
  • 8e3a05d89b Moving the instantiation into a cleaner structure Peter Boyle 2019-06-08 13:48:33 +01:00
  • 8adc5da7dd Testing out approaches to kernel writing, introducing SIMT_loop temporarily Peter Boyle 2019-06-08 13:47:04 +01:00
  • 29a244e423 Test of using a lane variable instead of repeated reference to threadIdx.y Peter Boyle 2019-06-08 13:46:26 +01:00
  • 18cbfecf02 Use symlinks in find command Peter Boyle 2019-06-08 13:45:46 +01:00
  • c933ac2248 Temporarily introduce a SIMT_loop to test out approaches prior to making a global change to accelerator_loop Peter Boyle 2019-06-08 13:44:27 +01:00
  • ad2c433574 Instantiations move. Tried using Gianluca's suggestion about avoiding threadIdx but doesn't seem to make a difference. Will revisit this and probably remove the lane parameter from the coalescedRead Peter Boyle 2019-06-08 13:43:12 +01:00
  • 86e7fb6e86 Instantiation relocation Peter Boyle 2019-06-08 13:42:46 +01:00
  • fb91dda7be Hand instantiation moved location Peter Boyle 2019-06-08 13:42:26 +01:00
  • 82cf7bc5ab Move instantiation into fermion/instantiation Peter Boyle 2019-06-08 13:41:46 +01:00
  • e452cc0a22 Move static variables into instantiation .cc file Peter Boyle 2019-06-08 13:41:20 +01:00
  • 4d2b938166 Remove explicit instantiation from here Peter Boyle 2019-06-08 13:41:01 +01:00
  • 10d16ab76c Remove explicit instantiation from here Peter Boyle 2019-06-08 13:40:32 +01:00
  • 1f997fa484 Instantiate via explicit .cc files for parallel make. Peter Boyle 2019-06-08 13:39:51 +01:00
  • dc5024e88c The GPU reduction was not working for me and was causing errors. Need to revisit. Gianluca is working on deterministic reduction. Peter Boyle 2019-06-08 13:39:11 +01:00
  • 6d77941990 Drop the 5D vec actions Peter Boyle 2019-06-08 13:38:05 +01:00
  • 0ee6e77cbc Compiles GPU and CPU, still gives good performance on CPU Peter Boyle 2019-06-05 13:28:16 +01:00
  • 18d3cde29a Compile on GPU works Peter Boyle 2019-06-05 00:14:58 +01:00
  • 7323099966 Instantiation fix Peter Boyle 2019-06-05 00:14:38 +01:00
  • 6379651cdd Generic or GPU ready for benchmark test on GPU Peter Boyle 2019-06-05 00:13:52 +01:00
  • ba4fd756b9 Fix signature, but deprecating this loop style Peter Boyle 2019-06-05 00:12:36 +01:00
  • d185fc1ebf clean up instantiation Peter Boyle 2019-06-05 00:11:52 +01:00
  • 96b36d8367 Instantiation clean up Peter Boyle 2019-06-05 00:11:27 +01:00
  • 899f8b5065 Instantiation clean up 5d vec removal Peter Boyle 2019-06-05 00:11:05 +01:00
  • c8d0483fe9 Remove 5d vectorisation Peter Boyle 2019-06-05 00:10:37 +01:00
  • 0f214e5f76 Clean up instantiation Peter Boyle 2019-06-05 00:10:13 +01:00
  • 8eea568426 GPU loop; presently differentiated with ifdef, find a way to unify. Peter Boyle 2019-06-05 00:09:28 +01:00
  • 9636324069 GPU happy code Peter Boyle 2019-06-05 00:08:54 +01:00
  • 8a5489d9e6 Move the loop into a central kernel call. Peter Boyle 2019-06-05 00:08:13 +01:00
  • 8113845f9c coalesce loop. Need to rationalise this file Peter Boyle 2019-06-04 23:49:29 +01:00
  • b47f73c222 GPU happy Peter Boyle 2019-06-04 21:30:39 +01:00
  • 5720ced0fd Simplifying Peter Boyle 2019-06-04 21:30:08 +01:00
  • 2c87b56b53 Making GPU happier Peter Boyle 2019-06-04 21:29:44 +01:00
  • dbad48d802 Remove Ls vectorised DWF Peter Boyle 2019-06-04 21:27:40 +01:00
  • 4557a1365a Remove Ls vectorised DWF Peter Boyle 2019-06-04 20:59:59 +01:00
  • 16e9b87d98 Remove Ls vectorised DWF as unused and hard to maintain Peter Boyle 2019-06-04 20:59:01 +01:00
  • 685eea3d0f Small cosmetic Peter Boyle 2019-06-04 20:58:14 +01:00
  • 65b48831fb Simplify code Peter Boyle 2019-06-04 20:56:30 +01:00
  • 57396fc595 Simplify code Peter Boyle 2019-06-04 20:56:23 +01:00
  • a2e199df50 Simplifying Cayley cases. Peter Boyle 2019-06-04 20:54:52 +01:00
  • 020346c848 Work list. Will have to clean up Fermion sector. Peter Boyle 2019-06-04 20:54:00 +01:00
  • c2625a127e Non blocking loop. Want to change the naming here. Peter Boyle 2019-06-04 20:52:59 +01:00
  • 8794d35c78 GPU Peter Boyle 2019-06-04 20:52:27 +01:00
  • 24bff6dbe6 Minor improvements Peter Boyle 2019-06-04 20:51:48 +01:00
  • 45b15d10d3 GPU happy changes Peter Boyle 2019-06-04 20:49:16 +01:00
  • 33d6bbe32b GPU must use accelerator vectors Peter Boyle 2019-06-04 20:48:52 +01:00
  • 7a1569bd46 Annoying, cannot rely on equivalence of Grid ComplexD and Eigen Complex type on GPU. Solve with ComplexD typecasts, but there must be a better way Peter Boyle 2019-06-04 20:47:49 +01:00
  • 6e2e904a0e NVCC compiles happy. Start to develop strategy for writing generic code for GPU kernels and CPU kernels. Peter Boyle 2019-06-04 20:46:35 +01:00
  • d92a17f359 Suppress NVCC warnings in pugixml with pragma Peter Boyle 2019-06-04 20:45:53 +01:00
  • 47c063f984 Remove Ls Vec cases from benchmarks Peter Boyle 2019-06-04 20:45:35 +01:00
  • 7e27a5213a Tests build clean. Peter Boyle 2019-06-04 20:45:20 +01:00
  • fe72dc099b Upgrade to Mojave forced me to reinstall MacPorts. These are the ports I installed to get Grid working Michael Marshall 2019-06-04 16:12:24 +01:00