Peter Boyle
|
7379047482
|
Further changes to threading and acceleration primitives. accelerator_barrier() is needed and used
|
2019-06-15 08:22:48 +01:00 |
|
Peter Boyle
|
d836ce3b78
|
Clean up of acceleration and threading primitives
|
2019-06-15 08:14:21 +01:00 |
|
Peter Boyle
|
cefaacbc07
|
Changing accelerator loop. Still have work to do for multi-GPU code
|
2019-06-15 08:10:24 +01:00 |
|
Peter Boyle
|
0074ef7f69
|
thread loops
|
2019-06-15 08:04:29 +01:00 |
|
Peter Boyle
|
20359ca15f
|
Coalesced loops.
|
2019-06-15 08:03:57 +01:00 |
|
Peter Boyle
|
736358b0cb
|
Coalesced loops
|
2019-06-15 08:03:13 +01:00 |
|
Peter Boyle
|
6b692aa726
|
Thread loops
|
2019-06-15 08:02:26 +01:00 |
|
Peter Boyle
|
7f99e1cd3b
|
Coalesced loops
|
2019-06-15 08:01:39 +01:00 |
|
Peter Boyle
|
f3c89df948
|
Thread loop changes
|
2019-06-15 08:00:37 +01:00 |
|
Peter Boyle
|
b7e6d111d7
|
Thread loop changes. Need to offload this file
|
2019-06-15 07:59:10 +01:00 |
|
Peter Boyle
|
f39cf69c33
|
Accelerator loop change
|
2019-06-15 07:58:23 +01:00 |
|
Peter Boyle
|
8e27338df2
|
Rationalise number of loop macros
|
2019-06-15 07:57:40 +01:00 |
|
Peter Boyle
|
bcbb5e9d26
|
Remove assembly tests
|
2019-06-15 07:57:05 +01:00 |
|
Peter Boyle
|
0ea7f5279d
|
Accelerator loop changes
|
2019-06-15 07:56:14 +01:00 |
|
Peter Boyle
|
18e5de426d
|
There is a stray use of predicatedWhere introduced by Andrew Lawson in the conserved currents.
The conserved currents need to be rewritten using data parallel operations.
|
2019-06-15 07:53:58 +01:00 |
|
Peter Boyle
|
e896d81235
|
Accelerator loop redefined. Coalesce most accesses, but the ET engine is still to be cleaned up.
|
2019-06-15 07:52:44 +01:00 |
|
Peter Boyle
|
7b8ccff4f4
|
Accelerated coalesced loops in most cases
|
2019-06-15 07:48:00 +01:00 |
|
Peter Boyle
|
68541606ab
|
Thread loop changes. Soon try these with accelerator loops and benchmark
|
2019-06-15 07:46:42 +01:00 |
|
Peter Boyle
|
339ea10cc7
|
First touch only on CPU code
|
2019-06-15 07:45:43 +01:00 |
|
Peter Boyle
|
d0d8dc8042
|
Thread loop changes
|
2019-06-15 07:45:09 +01:00 |
|
Peter Boyle
|
81eb1fd9f2
|
Accelerator loop changes for coalesced access
|
2019-06-15 07:44:47 +01:00 |
|
Peter Boyle
|
cb93d32cd9
|
Thread loop changes
|
2019-06-15 07:44:08 +01:00 |
|
Peter Boyle
|
8f223962ff
|
Thread loop changed
|
2019-06-15 07:43:42 +01:00 |
|
Peter Boyle
|
36f06555a2
|
Simplify Impl
|
2019-06-09 22:26:27 +01:00 |
|
Peter Boyle
|
d6c0e0756d
|
Remove GPU version
|
2019-06-09 11:23:42 +01:00 |
|
Peter Boyle
|
3e41b1055c
|
Remove GPU-only kernels.
|
2019-06-09 11:20:01 +01:00 |
|
Peter Boyle
|
9fbcfe612c
|
Update TODO list
|
2019-06-09 11:19:38 +01:00 |
|
Peter Boyle
|
e78a5e7838
|
ASM instantiation without link errors
|
2019-06-09 01:25:21 +01:00 |
|
Peter Boyle
|
da8d87e9da
|
CUDA switch off
|
2019-06-08 17:11:38 +01:00 |
|
Peter Boyle
|
8e3a05d89b
|
Moving the instantiation into a cleaner structure
|
2019-06-08 13:48:33 +01:00 |
|
Peter Boyle
|
8adc5da7dd
|
Testing out approaches to kernel writing, introducing SIMT_loop temporarily
|
2019-06-08 13:47:04 +01:00 |
|
Peter Boyle
|
29a244e423
|
Test of using a lane variable instead of repeated reference to threadIdx.y
|
2019-06-08 13:46:26 +01:00 |
|
Peter Boyle
|
18cbfecf02
|
Use symlinks in find command
|
2019-06-08 13:45:46 +01:00 |
|
Peter Boyle
|
c933ac2248
|
Temporarily introduce a SIMT_loop to test out approaches prior to making a global change to
accelerator_loop
|
2019-06-08 13:44:27 +01:00 |
|
Peter Boyle
|
ad2c433574
|
Instantiations move. Tried using Gianluca's suggestion about avoiding threadIdx, but it doesn't
seem to make a difference. Will revisit this and probably remove the lane parameter from coalescedRead.
|
2019-06-08 13:43:12 +01:00 |
|
Peter Boyle
|
86e7fb6e86
|
Instantiation relocation
|
2019-06-08 13:42:46 +01:00 |
|
Peter Boyle
|
fb91dda7be
|
Hand instantiation moved location
|
2019-06-08 13:42:26 +01:00 |
|
Peter Boyle
|
82cf7bc5ab
|
Move instantiation into fermion/instantiation
|
2019-06-08 13:41:46 +01:00 |
|
Peter Boyle
|
e452cc0a22
|
Move static variables into instantiation .cc file
|
2019-06-08 13:41:20 +01:00 |
|
Peter Boyle
|
4d2b938166
|
Remove explicit instantiation from here
|
2019-06-08 13:41:01 +01:00 |
|
Peter Boyle
|
10d16ab76c
|
Remove explicit instantiation from here
|
2019-06-08 13:40:32 +01:00 |
|
Peter Boyle
|
1f997fa484
|
Instantiate via explicit .cc files for parallel make.
|
2019-06-08 13:39:51 +01:00 |
|
Peter Boyle
|
dc5024e88c
|
The GPU reduction was not working for me and was causing errors. Need to revisit.
Gianluca is working on deterministic reduction.
|
2019-06-08 13:39:11 +01:00 |
|
Peter Boyle
|
6d77941990
|
Drop the 5D vec actions
|
2019-06-08 13:38:05 +01:00 |
|
Peter Boyle
|
0ee6e77cbc
|
Compiles GPU and CPU, still gives good performance on CPU
|
2019-06-05 13:28:16 +01:00 |
|
Peter Boyle
|
18d3cde29a
|
Compile on GPU works
|
2019-06-05 00:14:58 +01:00 |
|
Peter Boyle
|
7323099966
|
Instantiation fix
|
2019-06-05 00:14:38 +01:00 |
|
Peter Boyle
|
6379651cdd
|
Generic or GPU ready for benchmark test on GPU
|
2019-06-05 00:13:52 +01:00 |
|
Peter Boyle
|
ba4fd756b9
|
Fix signature, but deprecating this loop style
|
2019-06-05 00:12:36 +01:00 |
|
Peter Boyle
|
d185fc1ebf
|
clean up instantiation
|
2019-06-05 00:11:52 +01:00 |
|