936c5ecf69
Reduction GPU no compile fix
2020-06-24 17:28:31 -04:00
cdf0a04fc5
Merge branch 'develop' into sycl
2020-06-09 04:00:12 -04:00
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
28a1fcaaff
First compile against SYCL
2020-05-05 11:13:27 -07:00
ddb192bac7
re-work double precision promotion for summit
2020-04-30 16:09:57 -04:00
f1fe444d4f
blocked precision promotion infrastructure upgrade
2020-04-24 06:27:20 -04:00
091d5c605e
towards more precise blocking
2020-04-17 04:25:28 -04:00
b473405652
Tensor ambiguous fix
2019-08-29 09:36:41 -05:00
28d6be2a4e
Fix GCC complaint
2019-08-22 18:56:37 +01:00
96ac56cace
Double precision variants for summation accuracy
2019-08-14 13:08:01 +01:00
a23dc295ac
Remove compiler errors and warnings
2019-07-18 14:47:02 +01:00
08904f830e
Merge develop
2019-07-16 11:59:56 +01:00
fa9cd50c5b
Merge branch 'develop' into feature/gpu-port
2019-07-16 11:55:17 +01:00
d6ffadb33b
Coalesced write
2019-07-02 17:25:13 +01:00
b8f7bfbb26
Don't stream, as performance is poor in some cases
2019-07-01 07:30:25 +01:00
d976e5c514
Pow is being awkward in thrust for reasons I don't understand. Possible thrust bug.
2019-06-16 12:05:11 +01:00
b285138be4
Better checking on types
2019-06-15 08:27:48 +01:00
29a244e423
Test of using a lane variable instead of repeated reference to threadIdx.y
2019-06-08 13:46:26 +01:00
0ee6e77cbc
Compiles GPU and CPU, still gives good performance on CPU
2019-06-05 13:28:16 +01:00
8794d35c78
GPU
2019-06-04 20:52:27 +01:00
6e2e904a0e
NVCC compiles happy. Start to develop strategy for writing generic
...
code for GPU kernels and CPU kernels.
2019-06-04 20:46:35 +01:00
ffde81f22a
Nsimd() and coalesced support
2019-05-25 12:44:07 +01:00
d8098f1ecd
coalesced support
2019-05-25 12:43:31 +01:00
12d8bf1ced
Eigen::Tensor serialisation. Tested on single and double precision builds
2019-03-20 22:27:41 +00:00
91cffef883
Updates after review with Peter.
2019-03-07 14:30:35 +00:00
b7db99967a
Recommendations for Traits classes
2019-02-28 20:06:59 +00:00
e73b909a48
Make tests run past nvcc. Different NVCC versions are proving tricky to keep happy. This is 9.2
2019-01-02 12:05:30 +00:00
8c91e82ee8
GPU clean up, remove parallel_for. Split into accelerator_loop, thread_loop
...
cases, and collides with parallel_for in thrust
2019-01-01 15:06:46 +00:00
422764757d
Updates in tests to make all of Grid compile
2018-12-14 16:55:54 +00:00
b57a4d32aa
Merge branch 'develop' into feature/gpu-port
2018-12-13 05:11:34 +00:00
fb7d021b9d
Hadrons: moving Hadrons to root directory, build system improvements
2018-08-28 15:00:40 +01:00