|
b728af903c
|
Fast axpy norm under CFLAG
|
2024-10-11 03:23:09 +00:00 |
|
|
54f1999030
|
axpy_norm_fast -- wasn't using the deterministic MPI sum, causing issues
|
2024-10-11 03:22:18 +00:00 |
|
|
fd58f0b669
|
Return ok
|
2024-10-11 03:21:21 +00:00 |
|
|
c5c67b706e
|
cl::sycl -> SYCL
|
2024-10-10 22:04:12 +00:00 |
|
|
be7a543e2c
|
Revert barriers -- these were not the problem
|
2024-10-10 22:03:29 +00:00 |
|
|
68f112d576
|
New software moves cl::sycl
|
2024-10-10 22:03:04 +00:00 |
|
|
ec1395a304
|
Better flight logging
|
2024-10-10 22:01:57 +00:00 |
|
|
beb0e474ee
|
Use deterministic own brand reduction
|
2024-10-10 22:01:24 +00:00 |
|
|
2b5fdcbbc5
|
New software version
|
2024-10-10 21:59:02 +00:00 |
|
|
295127d456
|
Deterministic homebrew reduction
|
2024-10-10 21:58:26 +00:00 |
|
|
7dcfb13694
|
New software stack
|
2024-10-10 21:57:35 +00:00 |
|
|
9fa8bd6438
|
Configure for AOT on Aurora latest software
|
2024-09-23 11:25:44 +00:00 |
|
|
02c8178f16
|
Almost working on Aurora
|
2024-09-23 09:43:50 +00:00 |
|
|
e637fbacae
|
Verbose remove
|
2024-09-23 09:42:43 +00:00 |
|
|
066544281f
|
Deprecate UVM
|
2024-09-17 13:34:27 +00:00 |
|
|
11be10d2c0
|
Aurora testing
|
2024-09-10 18:11:52 +00:00 |
|
|
160969a758
|
UVM tester, doesn't turn up anything
|
2024-09-10 18:09:42 +00:00 |
|
|
622f78ebea
|
SYCL updates -- operator = giving trouble on Aurora.
SYCL reduction is failing intermittently with SVM interface - returns
zero, expect non-zero.
Think I need to remove ALL dependence on SVM.
|
2024-09-04 13:53:48 +00:00 |
|
Peter Boyle
|
aa67a5b095
|
Rename
|
2024-08-27 19:54:01 +00:00 |
|
Peter Boyle
|
af9ea0864c
|
Blas fix
|
2024-08-27 19:53:09 +00:00 |
|
Peter Boyle
|
4e2a6d87c4
|
Gemm batched fix
|
2024-08-27 19:24:05 +00:00 |
|
Peter Boyle
|
a465ecece9
|
Aurora
|
2024-08-27 19:20:43 +00:00 |
|
Peter Boyle
|
575eb72182
|
Converges on 16^3
|
2024-08-27 19:20:38 +00:00 |
|
Peter Boyle
|
3a973914d6
|
Compile on frontier
|
2024-08-27 14:55:42 -04:00 |
|
Peter Boyle
|
f568c07bbd
|
Improved the BLAS benchmark
|
2024-08-27 14:53:54 -04:00 |
|
Peter Boyle
|
2c9878fc3a
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2024-08-27 12:05:46 -04:00 |
|
Peter Boyle
|
27b1b1b005
|
Checkerboard available for offloading pickCheckerboard
|
2024-08-27 12:04:09 -04:00 |
|
Peter Boyle
|
130d7ab077
|
Verbose changes
|
2024-08-27 12:03:28 -04:00 |
|
Peter Boyle
|
29f6b8a74a
|
Setup
|
2024-08-27 12:02:49 -04:00 |
|
Peter Boyle
|
9779aaea33
|
16^3 optimise
|
2024-08-27 11:38:35 -04:00 |
|
Peter Boyle
|
ec25604a67
|
Fastest solver for mrhs multigrid
|
2024-08-27 11:32:34 -04:00 |
|
Peter Boyle
|
3668e81c5e
|
Extract slice working on checkerboard field for Block Lanczos
|
2024-08-27 11:31:30 -04:00 |
|
Peter Boyle
|
d66b2423cb
|
Move slice operations to GPU for BlockCG
|
2024-08-27 11:28:47 -04:00 |
|
Peter Boyle
|
15cc78f0b6
|
peek/poke local site on checkerboard arrays
|
2024-08-27 11:23:42 -04:00 |
|
Peter Boyle
|
06db4ddea2
|
Fast init on GPU
|
2024-08-27 11:22:33 -04:00 |
|
Peter Boyle
|
6cfb90e99f
|
Support needed for accelerator resident set/pick Checkerboard
|
2024-08-27 11:19:00 -04:00 |
|
Peter Boyle
|
d8be95a2a3
|
Don't early terminate power method to get more accurate top EV
|
2024-08-27 11:17:37 -04:00 |
|
Peter Boyle
|
f82702872d
|
Normal residual
|
2024-08-27 11:16:44 -04:00 |
|
Peter Boyle
|
3752c49ef0
|
Add option to record the CG polynomial
|
2024-08-27 11:14:35 -04:00 |
|
Peter Boyle
|
fe65fa4988
|
MulMatrix
|
2024-08-27 11:13:18 -04:00 |
|
Peter Boyle
|
1fe4c205a3
|
Adef
|
2024-08-27 11:11:47 -04:00 |
|
Peter Boyle
|
d4dc5e0f43
|
BlockCG linalg acceleration with BLAS
|
2024-08-27 11:08:33 -04:00 |
|
Peter Boyle
|
77944437ce
|
Functor initialisation
|
2024-08-27 11:01:02 -04:00 |
|
Peter Boyle
|
c164bff758
|
MMdag
|
2024-08-27 11:00:36 -04:00 |
|
Peter Boyle
|
aa2e3d954a
|
MMdag operator
|
2024-08-27 10:59:29 -04:00 |
|
Peter Boyle
|
de62b04728
|
Block CG linalg acceleration
|
2024-08-27 10:58:54 -04:00 |
|
Peter Boyle
|
d0bdb50f24
|
Analyse power spectrum
|
2024-08-27 10:58:19 -04:00 |
|
Peter Boyle
|
a8fecbc609
|
BlockCG linalg via BLAS
|
2024-08-21 16:08:16 -04:00 |
|
Peter Boyle
|
557fa483ff
|
Blas benchmark committed stand alone
|
2024-08-20 16:18:43 +00:00 |
|
Peter Boyle
|
fc15d55df6
|
Mallinfo
|
2024-08-20 14:33:09 +00:00 |
|