Peter Boyle
|
dc80b08969
|
96^3 test
|
2024-06-10 15:07:29 -04:00 |
|
Peter Boyle
|
0e607a55e7
|
Updated for 8^4 test
|
2024-05-26 20:53:05 +00:00 |
|
Peter Boyle
|
ad14a82742
|
Working as good as possible on 48^3 in double
|
2024-05-16 10:55:45 -04:00 |
|
Peter Boyle
|
5c3ace7c3e
|
Merge branch 'develop' into feature/scidac-wp1
|
2024-04-30 05:26:06 -04:00 |
|
Peter Boyle
|
98cf247f33
|
prepare to switch to mixed precision
|
2024-04-30 05:23:45 -04:00 |
|
Peter Boyle
|
0cf16522d1
|
Refine with HDCG choice
|
2024-04-30 05:22:14 -04:00 |
|
Peter Boyle
|
5147a42818
|
Updated hdcg
|
2024-04-05 01:05:57 -04:00 |
|
Peter Boyle
|
5b79d51c22
|
Improvements
|
2024-04-01 14:18:40 -04:00 |
|
Peter Boyle
|
59b0cc11df
|
Reduce the time in single
|
2024-03-26 00:42:40 +00:00 |
|
Peter Boyle
|
d01e5fa838
|
Improved FlightRecorder
|
2024-03-22 15:42:32 +00:00 |
|
Peter Boyle
|
fab1efb48c
|
More britney logging improvements
|
2024-03-19 14:36:21 +00:00 |
|
|
2704b82084
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2024-03-12 15:16:24 +00:00 |
|
|
cf8632bbac
|
Britney test option
|
2024-03-12 15:15:35 +00:00 |
|
|
2b4399f8b1
|
more HOST_NAME_MAX fix
|
2024-03-07 15:26:01 +09:00 |
|
Peter Boyle
|
cc04dc42dc
|
Merge branch 'develop' into feature/scidac-wp1
|
2024-03-06 14:55:21 -05:00 |
|
Peter Boyle
|
070b61f08f
|
Simplifying the MultiRHS solver to make it do SRHS *and* MRHS
|
2024-03-06 14:04:33 -05:00 |
|
|
9b5f741e85
|
Reproducing CG can be more useful now
|
2024-03-06 00:03:16 +00:00 |
|
Peter Boyle
|
436bf1d9d3
|
Merge pull request #455 from clarkedavida/hisq_fat_links
Hisq fat links
|
2024-02-29 15:29:39 -05:00 |
|
Peter Boyle
|
cd15abe9d1
|
Mrhs prep
|
2024-02-27 11:41:13 -05:00 |
|
Dennis Bollweg
|
b507fe209c
|
Added SpinColourMatrix case to sliceSum Test
|
2024-02-27 11:28:32 -05:00 |
|
david clarke
|
94581e3c7a
|
accelerator_for is broken
|
2024-02-23 15:58:33 -07:00 |
|
Dennis Bollweg
|
15878f7613
|
sliceSumReduction_cub_large now also faster than CPU on Frontier
|
2024-02-16 13:55:21 -05:00 |
|
dbollweg
|
6f3455900e
|
Adding sliceSumReduction_cub_small/large since hipcub cannot deal with arb. large vobjs
|
2024-02-16 13:15:02 -05:00 |
|
dbollweg
|
b5659d106e
|
more test cases
|
2024-02-09 13:37:14 -05:00 |
|
dbollweg
|
9514035b87
|
refactor slicesum: slicesum uses GPU version by default now
|
2024-02-09 13:02:28 -05:00 |
|
dbollweg
|
ab2de131bd
|
work towards sliceSum for sycl backend
|
2024-02-06 13:24:45 -05:00 |
|
Dennis Bollweg
|
b8b9dc952d
|
Async memcpy's and cleanup
|
2024-02-01 17:55:35 -05:00 |
|
Dennis Bollweg
|
79a6ed32d8
|
Use accelerator_for2d and DeviceSegmentedReduce to avoid kernel launch latencies
|
2024-02-01 16:41:03 -05:00 |
|
dbollweg
|
caa5f97723
|
Add sliceSum gpu using cub/hipcub
|
2024-01-31 16:50:06 -05:00 |
|
david clarke
|
4924b3209e
|
projectU3 yields a unitary matrix
|
2024-01-23 14:43:58 -07:00 |
|
Peter Boyle
|
eb702f581b
|
Running on 12 rhs on 18 nodes of frontier
|
2024-01-22 17:44:15 -05:00 |
|
david clarke
|
f5b3d582b0
|
first attempt at U3 projection
|
2024-01-22 02:49:40 -07:00 |
|
david clarke
|
981c93d67a
|
update Test_fatLinks to accept Naik
|
2024-01-21 21:09:19 -07:00 |
|
Peter Boyle
|
d967eb53de
|
Working for first time
|
2024-01-17 16:31:12 -05:00 |
|
Peter Boyle
|
25f71913b7
|
MultiRHS coarse
|
2024-01-04 12:01:17 -05:00 |
|
Peter Boyle
|
d5fd90b2f3
|
Add 48^3 rtest
|
2024-01-04 12:00:01 -05:00 |
|
Peter Boyle
|
22c611bd1a
|
Delete temp file
|
2023-12-21 18:32:31 -05:00 |
|
Peter Boyle
|
c9bb1bf8ea
|
Passing new BLAS based
|
2023-12-21 18:31:17 -05:00 |
|
Peter Boyle
|
9e489887cf
|
General coarse multiRHS move to BLAS implementation
|
2023-12-21 15:24:48 -05:00 |
|
Peter Boyle
|
abcd6b8cb6
|
Faster version
|
2023-12-19 15:17:46 -05:00 |
|
Peter Boyle
|
6835a7f208
|
Better logging, test on 81 point stencil
|
2023-11-29 19:20:47 -05:00 |
|
Peter Boyle
|
f59993b979
|
Nbasis
|
2023-11-29 09:47:36 -05:00 |
|
Peter Boyle
|
e859a199df
|
Reduce volume to interior for coarse stencil -- worth up to 4x gain
|
2023-11-28 10:23:16 -05:00 |
|
Peter Boyle
|
0a3682ad0b
|
MultiRHS work
|
2023-11-28 07:43:37 -05:00 |
|
Peter Boyle
|
59abaeb5cd
|
Time stamp
|
2023-11-24 12:56:45 -05:00 |
|
Peter Boyle
|
b302ad3d49
|
multiRHS test in place, passes Yay!
|
2023-11-23 18:20:15 -05:00 |
|
Peter Boyle
|
09946cf1ba
|
Improved, works on 48^3 moving to multiRHS optimisations
|
2023-11-15 18:03:05 -05:00 |
|
david clarke
|
9cd4128833
|
fix naik bug
|
2023-11-03 14:11:38 -06:00 |
|
david clarke
|
df9b958c40
|
naik now returns separately
|
2023-10-30 17:40:53 -06:00 |
|
david clarke
|
3d3376d1a3
|
LePage works, trying Naik
|
2023-10-27 16:26:31 -06:00 |
|