Peter Boyle
|
18e5de426d
|
There is a stray use of predicatedWhere introduced by Andrew Lawson in the conserved currents.
The conserved currents need to be rewritten using data parallel operations.
|
2019-06-15 07:53:58 +01:00 |
|
Peter Boyle
|
e896d81235
|
Accelerator loop redefine. Coalesce most accesses, but ET engine still to go clean.
|
2019-06-15 07:52:44 +01:00 |
|
Peter Boyle
|
7b8ccff4f4
|
Accelerated coalesced loops in most cases
|
2019-06-15 07:48:00 +01:00 |
|
Peter Boyle
|
68541606ab
|
Thread loop changes. Soon try these with accelerator loops and benchmark
|
2019-06-15 07:46:42 +01:00 |
|
Peter Boyle
|
339ea10cc7
|
First touch only on CPU code
|
2019-06-15 07:45:43 +01:00 |
|
Peter Boyle
|
d0d8dc8042
|
Thread loop changes
|
2019-06-15 07:45:09 +01:00 |
|
Peter Boyle
|
81eb1fd9f2
|
Accelerator loop changes for coalesced access
|
2019-06-15 07:44:47 +01:00 |
|
Peter Boyle
|
cb93d32cd9
|
Thread loop changes
|
2019-06-15 07:44:08 +01:00 |
|
Peter Boyle
|
8f223962ff
|
Thread loop changed
|
2019-06-15 07:43:42 +01:00 |
|
Michael Marshall
|
9a8a63467e
|
BC2 now runs. setup() runs twice, which had resulted in doubling up of momenta. Also fixed initialisation of momentum phases.
|
2019-06-12 15:25:59 +01:00 |
|
Peter Boyle
|
36f06555a2
|
Simplify Impl
|
2019-06-09 22:26:27 +01:00 |
|
Peter Boyle
|
d6c0e0756d
|
Remove GPU version
|
2019-06-09 11:23:42 +01:00 |
|
Peter Boyle
|
3e41b1055c
|
Remove GPU-only kernels.
|
2019-06-09 11:20:01 +01:00 |
|
Peter Boyle
|
9fbcfe612c
|
Update TODO list
|
2019-06-09 11:19:38 +01:00 |
|
Peter Boyle
|
e78a5e7838
|
ASM instantiation without link errors
|
2019-06-09 01:25:21 +01:00 |
|
Peter Boyle
|
da8d87e9da
|
CUDA switch off
|
2019-06-08 17:11:38 +01:00 |
|
Peter Boyle
|
8e3a05d89b
|
Moving the instantiation into a cleaner structure
|
2019-06-08 13:48:33 +01:00 |
|
Peter Boyle
|
8adc5da7dd
|
Testing out approaches to kernel writing, introducing SIMT_loop temporarily
|
2019-06-08 13:47:04 +01:00 |
|
Peter Boyle
|
29a244e423
|
Test of using a lane variable instead of repeated reference to threadIdx.y
|
2019-06-08 13:46:26 +01:00 |
|
Peter Boyle
|
18cbfecf02
|
Use symlinks in find command
|
2019-06-08 13:45:46 +01:00 |
|
Peter Boyle
|
c933ac2248
|
Temporarily introduce a SIMT_loop to test out approaches prior to making a global change to
accelerator_loop
|
2019-06-08 13:44:27 +01:00 |
|
Peter Boyle
|
ad2c433574
|
Instantiations move. Tried using Gianluca's suggestion about avoiding threadIdx, but it doesn't
seem to make a difference. Will revisit this and probably remove the lane parameter from the coalescedRead.
|
2019-06-08 13:43:12 +01:00 |
|
Peter Boyle
|
86e7fb6e86
|
Instantiation relocation
|
2019-06-08 13:42:46 +01:00 |
|
Peter Boyle
|
fb91dda7be
|
Hand instantiation moved location
|
2019-06-08 13:42:26 +01:00 |
|
Peter Boyle
|
82cf7bc5ab
|
Move instantiation into fermion/instantiation
|
2019-06-08 13:41:46 +01:00 |
|
Peter Boyle
|
e452cc0a22
|
Move static variables into instantiation .cc file
|
2019-06-08 13:41:20 +01:00 |
|
Peter Boyle
|
4d2b938166
|
Remove explicit instantiation from here
|
2019-06-08 13:41:01 +01:00 |
|
Peter Boyle
|
10d16ab76c
|
Remove explicit instantiation from here
|
2019-06-08 13:40:32 +01:00 |
|
Peter Boyle
|
1f997fa484
|
Instantiate via explicit .cc files for parallel make.
|
2019-06-08 13:39:51 +01:00 |
|
Peter Boyle
|
dc5024e88c
|
The GPU reduction was not working for me and was causing errors. Need to revisit.
Gianluca is working on deterministic reduction.
|
2019-06-08 13:39:11 +01:00 |
|
Peter Boyle
|
6d77941990
|
Drop the 5D vec actions
|
2019-06-08 13:38:05 +01:00 |
|
Peter Boyle
|
0ee6e77cbc
|
Compiles GPU and CPU, still gives good performance on CPU
|
2019-06-05 13:28:16 +01:00 |
|
Peter Boyle
|
18d3cde29a
|
Compile on GPU works
|
2019-06-05 00:14:58 +01:00 |
|
Peter Boyle
|
7323099966
|
Instantiation fix
|
2019-06-05 00:14:38 +01:00 |
|
Peter Boyle
|
6379651cdd
|
Generic or GPU ready for benchmark test on GPU
|
2019-06-05 00:13:52 +01:00 |
|
Peter Boyle
|
ba4fd756b9
|
Fix signature, but deprecating this loop style
|
2019-06-05 00:12:36 +01:00 |
|
Peter Boyle
|
d185fc1ebf
|
clean up instantiation
|
2019-06-05 00:11:52 +01:00 |
|
Peter Boyle
|
96b36d8367
|
Instantiation clean up
|
2019-06-05 00:11:27 +01:00 |
|
Peter Boyle
|
899f8b5065
|
Instantiation clean up 5d vec removal
|
2019-06-05 00:11:05 +01:00 |
|
Peter Boyle
|
c8d0483fe9
|
Remove 5d vectorisation
|
2019-06-05 00:10:37 +01:00 |
|
Peter Boyle
|
0f214e5f76
|
Clean up instantiation
|
2019-06-05 00:10:13 +01:00 |
|
Peter Boyle
|
8eea568426
|
GPU loop; presently differentiated with ifdef, need to find a way to unify.
|
2019-06-05 00:09:28 +01:00 |
|
Peter Boyle
|
9636324069
|
GPU happy code
|
2019-06-05 00:08:54 +01:00 |
|
Peter Boyle
|
8a5489d9e6
|
Move the loop into a central kernel call.
|
2019-06-05 00:08:13 +01:00 |
|
Peter Boyle
|
8113845f9c
|
Coalesce loop. Need to rationalise this file.
|
2019-06-04 23:49:29 +01:00 |
|
Peter Boyle
|
b47f73c222
|
GPU happy
|
2019-06-04 21:30:39 +01:00 |
|
Peter Boyle
|
5720ced0fd
|
Simplifying
|
2019-06-04 21:30:08 +01:00 |
|
Peter Boyle
|
2c87b56b53
|
Making GPU happier
|
2019-06-04 21:29:44 +01:00 |
|
Peter Boyle
|
dbad48d802
|
Remove Ls vectorised DWF
|
2019-06-04 21:27:40 +01:00 |
|
Peter Boyle
|
4557a1365a
|
Remove Ls vectorised DWF
|
2019-06-04 20:59:59 +01:00 |
|