mirror of https://github.com/paboyle/Grid.git synced 2026-04-19 02:01:02 +01:00
Commit Graph

4926 Commits

Author SHA1 Message Date
Peter Boyle 7f99e1cd3b Coalesced loops 2019-06-15 08:01:39 +01:00
Peter Boyle f3c89df948 Thread loop changes 2019-06-15 08:00:37 +01:00
Peter Boyle b7e6d111d7 Thread loop changes. Need to offload this file 2019-06-15 07:59:10 +01:00
Peter Boyle f39cf69c33 Accelerator loop change 2019-06-15 07:58:23 +01:00
Peter Boyle 8e27338df2 Rationalise number of loop macros 2019-06-15 07:57:40 +01:00
Peter Boyle bcbb5e9d26 Remove assembly tests 2019-06-15 07:57:05 +01:00
Peter Boyle 0ea7f5279d Accelerator loop changes 2019-06-15 07:56:14 +01:00
Peter Boyle 18e5de426d There is a stray use of predicatedWhere introduced by Andrew Lawson in the conserved currents. The conserved currents need to be rewritten using data parallel operations. 2019-06-15 07:53:58 +01:00
Peter Boyle e896d81235 Accelerator loop redefine. Coalesce most accesses, but ET engine still to go clean. 2019-06-15 07:52:44 +01:00
Peter Boyle 7b8ccff4f4 Accelerated coalesced loops in most cases 2019-06-15 07:48:00 +01:00
Peter Boyle 68541606ab Thread loop changes. Soon try these with accelerator loops and benchmark 2019-06-15 07:46:42 +01:00
Peter Boyle 339ea10cc7 First touch only on CPU code 2019-06-15 07:45:43 +01:00
Peter Boyle d0d8dc8042 Thread loop changes 2019-06-15 07:45:09 +01:00
Peter Boyle 81eb1fd9f2 Accelerator loop changes for coalesced access 2019-06-15 07:44:47 +01:00
Peter Boyle cb93d32cd9 Thread loop changes 2019-06-15 07:44:08 +01:00
Peter Boyle 8f223962ff Thread loop changed 2019-06-15 07:43:42 +01:00
Peter Boyle 36f06555a2 Simplify Impl 2019-06-09 22:26:27 +01:00
Peter Boyle d6c0e0756d Remove GPU version 2019-06-09 11:23:42 +01:00
Peter Boyle 3e41b1055c Remove GPU-only kernels. 2019-06-09 11:20:01 +01:00
Peter Boyle 9fbcfe612c Update TODO list 2019-06-09 11:19:38 +01:00
Peter Boyle e78a5e7838 ASM instantiation without link errors 2019-06-09 01:25:21 +01:00
Peter Boyle da8d87e9da Cuda switch off 2019-06-08 17:11:38 +01:00
Peter Boyle 8e3a05d89b Moving the instantiation into a cleaner structure 2019-06-08 13:48:33 +01:00
Peter Boyle 8adc5da7dd Testing out approaches to kernel writing, introducing SIMT_loop temporarily 2019-06-08 13:47:04 +01:00
Peter Boyle 29a244e423 Test of using a lane variable instead of repeated reference to threadIdx.y 2019-06-08 13:46:26 +01:00
Peter Boyle 18cbfecf02 Use symlinks in find command 2019-06-08 13:45:46 +01:00
Peter Boyle c933ac2248 Temporarily introduce a SIMT_loop to test out approaches prior to making a global change to accelerator_loop 2019-06-08 13:44:27 +01:00
Peter Boyle ad2c433574 Instantiations move. Tried using Gianluca's suggestion about avoiding threadIdx, but it doesn't seem to make a difference. Will revisit this and probably remove the lane parameter from coalescedRead. 2019-06-08 13:43:12 +01:00
Peter Boyle 86e7fb6e86 Instantiation relocation 2019-06-08 13:42:46 +01:00
Peter Boyle fb91dda7be Hand instantiation moved location 2019-06-08 13:42:26 +01:00
Peter Boyle 82cf7bc5ab Move instantiation into fermion/instantiation 2019-06-08 13:41:46 +01:00
Peter Boyle e452cc0a22 Move static variables into instantiation .cc file 2019-06-08 13:41:20 +01:00
Peter Boyle 4d2b938166 Remove explicit instantiation from here 2019-06-08 13:41:01 +01:00
Peter Boyle 10d16ab76c Remove explicit instantiation from here 2019-06-08 13:40:32 +01:00
Peter Boyle 1f997fa484 Instantiate via explicit .cc files for parallel make. 2019-06-08 13:39:51 +01:00
Peter Boyle dc5024e88c The GPU reduction was not working for me and was causing errors. Need to revisit. Gianluca is working on deterministic reduction. 2019-06-08 13:39:11 +01:00
Peter Boyle 6d77941990 Drop the 5D vec actions 2019-06-08 13:38:05 +01:00
Peter Boyle 0ee6e77cbc Compiles GPU and CPU, still gives good performance on CPU 2019-06-05 13:28:16 +01:00
Peter Boyle 18d3cde29a Compile on GPU works 2019-06-05 00:14:58 +01:00
Peter Boyle 7323099966 Instantiation fix 2019-06-05 00:14:38 +01:00
Peter Boyle 6379651cdd Generic or GPU ready for benchmark test on GPU 2019-06-05 00:13:52 +01:00
Peter Boyle ba4fd756b9 Fix signature, but deprecating this loop style 2019-06-05 00:12:36 +01:00
Peter Boyle d185fc1ebf clean up instantiation 2019-06-05 00:11:52 +01:00
Peter Boyle 96b36d8367 Instantiation clean up 2019-06-05 00:11:27 +01:00
Peter Boyle 899f8b5065 Instantiation clean up 5d vec removal 2019-06-05 00:11:05 +01:00
Peter Boyle c8d0483fe9 Remove 5d vectorisation 2019-06-05 00:10:37 +01:00
Peter Boyle 0f214e5f76 Clean up instantiation 2019-06-05 00:10:13 +01:00
Peter Boyle 8eea568426 GPU loop; presently differentiated with ifdef, find a way to unify. 2019-06-05 00:09:28 +01:00
Peter Boyle 9636324069 GPU happy code 2019-06-05 00:08:54 +01:00
Peter Boyle 8a5489d9e6 Move the loop into a central kernel call. 2019-06-05 00:08:13 +01:00