Peter Boyle
|
e03b64dc06
|
HIP default flags to work on ROCm
|
2020-09-16 00:33:09 +01:00 |
|
Peter Boyle
|
4677c40195
|
HIP improvements
|
2020-09-16 00:32:27 +01:00 |
|
Peter Boyle
|
288c615782
|
Hip improvements
|
2020-09-16 00:31:50 +01:00 |
|
Peter Boyle
|
48e81cf6f8
|
Hip Pragmas
|
2020-09-16 00:31:03 +01:00 |
|
Peter Boyle
|
65b724bb5f
|
2 level HDCR
|
2020-09-03 21:46:43 -04:00 |
|
Peter Boyle
|
6dbd117aa5
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2020-09-03 20:30:49 -04:00 |
|
Peter Boyle
|
198b29f618
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2020-09-03 20:29:54 -04:00 |
|
Peter Boyle
|
a8309638d4
|
UVM check in MPI calls
|
2020-09-03 20:29:26 -04:00 |
|
Peter Boyle
|
f98a4e880e
|
Merge pull request #310 from kostrzewa/accelerator_vector_stream_op_no_backspace
do not use backspace in AcceleratorVector (Coordinate) output stream operator
|
2020-09-03 20:24:59 -04:00 |
|
Peter Boyle
|
8244caff25
|
Remove the asynchronous non-Stencil calls.
|
2020-09-03 18:52:55 -04:00 |
|
Peter Boyle
|
bcd7895362
|
Include cuda.h
|
2020-09-03 15:49:13 -04:00 |
|
Peter Boyle
|
85b1c5df39
|
Assert for safety on a never-hit case that I am not 100% confident about
|
2020-09-03 15:48:16 -04:00 |
|
Peter Boyle
|
b4255140d6
|
Stale data member eliminated
|
2020-09-03 15:47:46 -04:00 |
|
Peter Boyle
|
0c3095e173
|
Comms buffers to device memory
|
2020-09-03 15:45:35 -04:00 |
|
Peter Boyle
|
d3ce60713d
|
UVM, Device and Lattice/aligned allocators
|
2020-09-03 15:44:13 -04:00 |
|
Peter Boyle
|
eac1f08b7b
|
Close expressions passed as an argument
|
2020-09-01 15:30:33 -04:00 |
|
Peter Boyle
|
1654c4f3c0
|
Closure improved
|
2020-09-01 15:29:45 -04:00 |
|
Peter Boyle
|
8807d998bc
|
closure improved
|
2020-09-01 15:29:11 -04:00 |
|
Peter Boyle
|
5791021dcd
|
Speed up Cshift more with coalesced
|
2020-09-01 15:28:15 -04:00 |
|
Peter Boyle
|
c273fb051c
|
Peek poke lattice
|
2020-09-01 15:27:59 -04:00 |
|
Peter Boyle
|
c545530170
|
Minor worry: large Nbasis doesn't compile on GPU
|
2020-09-01 00:14:33 -04:00 |
|
Peter Boyle
|
d982a5b6d5
|
Fix coarsened
|
2020-09-01 00:14:04 -04:00 |
|
Peter Boyle
|
15ca8637f3
|
No norms in HermOp
|
2020-09-01 00:13:32 -04:00 |
|
Peter Boyle
|
cbc995b74c
|
Made better interface
|
2020-09-01 00:12:54 -04:00 |
|
Peter Boyle
|
8b74174d74
|
Eigen tensor serialisation happy under GPU. Regret agreeing to let us couple Eigen types to Grid IO
|
2020-09-01 00:03:26 -04:00 |
|
Peter Boyle
|
e21fef17df
|
real and imag part not in ET
|
2020-08-31 23:56:26 -04:00 |
|
Peter Boyle
|
3d27708f07
|
Basic where test
|
2020-08-31 23:55:49 -04:00 |
|
Peter Boyle
|
b918744184
|
Prettification
|
2020-08-31 23:54:46 -04:00 |
|
Peter Boyle
|
7d14a3c086
|
Where working
|
2020-08-31 23:53:46 -04:00 |
|
Peter Boyle
|
e14a84317d
|
GPU math unary calls
|
2020-08-31 23:50:49 -04:00 |
|
Peter Boyle
|
6c31b99f1f
|
I knew coupling Eigen Tensor to Grid serialisation was a bad idea.
Now that complex is different on GPU, this creates problems
|
2020-08-31 23:49:19 -04:00 |
|
Peter Boyle
|
9522dcd611
|
Remove dead commented out code
|
2020-08-31 23:40:29 -04:00 |
|
Peter Boyle
|
ed469898dc
|
coalesced ET expressions
|
2020-08-31 23:38:40 -04:00 |
|
Peter Boyle
|
1eee94a809
|
Sorting real/im in read coalesced GPU ET
|
2020-08-31 23:36:49 -04:00 |
|
Bartosz Kostrzewa
|
54523369a3
|
do not use backspace in Coordinate output stream operator
|
2020-08-31 19:39:36 +02:00 |
|
Peter Boyle
|
a98c91c2a5
|
Merge pull request #309 from kostrzewa/format_benchmark_wilson_sweep
Format benchmark wilson sweep
|
2020-08-31 12:43:46 -04:00 |
|
Bartosz Kostrzewa
|
a9b92867a8
|
use tabulator
|
2020-08-31 18:41:17 +02:00 |
|
Bartosz Kostrzewa
|
65920faeba
|
correct formatting of Benchmark_wilson_sweep output
|
2020-08-31 18:39:27 +02:00 |
|
Peter Boyle
|
3448b7387c
|
Almost there to coalesced ET
|
2020-08-26 17:04:49 -04:00 |
|
Peter Boyle
|
47b89d2739
|
Pragma protection improvement
|
2020-08-26 17:04:27 -04:00 |
|
Peter Boyle
|
1efe30d6cc
|
Slurm: stop nodes using same GPU
|
2020-08-21 02:02:53 +02:00 |
|
Peter Boyle
|
0b787e9fe0
|
Avoid namespace collision to make gcc happy
|
2020-08-20 22:23:29 +02:00 |
|
Peter Boyle
|
37ec4b241c
|
Default thread count sensible
|
2020-08-20 22:12:31 +02:00 |
|
Peter Boyle
|
90ea7dfa99
|
Accelerator loops for device resident comms buf
|
2020-08-19 22:40:44 +02:00 |
|
Peter Boyle
|
f866d7c33e
|
Merge pull request #307 from lehner/feature/gpt
Merged Nils's A64FX and minor fixes (MemoryManager::InitMessage, Tensor_index zeroit, ...)
|
2020-08-18 23:27:21 -04:00 |
|
Christoph Lehner
|
542bdef198
|
cleanup comments
|
2020-08-14 18:39:44 +02:00 |
|
Christoph Lehner
|
06007db3d9
|
true shm_none implementation with GPUs that disables the use of device shared memory for the stencils
|
2020-08-14 18:37:00 +02:00 |
|
Christoph Lehner
|
12e6059a70
|
Merge branch 'feature/gpt' of https://github.com/lehner/Grid into feature/gpt
|
2020-08-13 16:16:52 +02:00 |
|
Christoph Lehner
|
dbaa24ebf6
|
further GPU memory access fixes (with this GPT passes all single-rank tests on non-summit GPUs)
|
2020-08-13 16:14:15 +02:00 |
|
Peter Boyle
|
3276aa67dc
|
Update
|
2020-08-12 14:15:53 -04:00 |
|