1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-10-24 01:34:47 +01:00
Commit Graph

1559 Commits

Author SHA1 Message Date
Peter Boyle
dc80b08969 96^3 test 2024-06-10 15:07:29 -04:00
Peter Boyle
0e607a55e7 Updated for 8^4 test 2024-05-26 20:53:05 +00:00
Peter Boyle
ad14a82742 Working aas good as possible on 48^3 in double 2024-05-16 10:55:45 -04:00
Peter Boyle
5c3ace7c3e Merge branch 'develop' into feature/scidac-wp1 2024-04-30 05:26:06 -04:00
Peter Boyle
98cf247f33 prepare to switch to mixed precision 2024-04-30 05:23:45 -04:00
Peter Boyle
0cf16522d1 Refine with HDCG choice 2024-04-30 05:22:14 -04:00
Peter Boyle
5147a42818 Updated hdcg 2024-04-05 01:05:57 -04:00
Peter Boyle
5b79d51c22 Improvements 2024-04-01 14:18:40 -04:00
Peter Boyle
59b0cc11df REduce the time in single 2024-03-26 00:42:40 +00:00
Peter Boyle
d01e5fa838 Improved FlightRecorder 2024-03-22 15:42:32 +00:00
Peter Boyle
fab1efb48c More britney logging improvements 2024-03-19 14:36:21 +00:00
2704b82084 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2024-03-12 15:16:24 +00:00
cf8632bbac Britney test option 2024-03-12 15:15:35 +00:00
2b4399f8b1 more HOST_NAME_MAX fix 2024-03-07 15:26:01 +09:00
Peter Boyle
cc04dc42dc Merge branch 'develop' into feature/scidac-wp1 2024-03-06 14:55:21 -05:00
Peter Boyle
070b61f08f Simplifying the MultiRHS solver to make it do SRHS *and* MRHS 2024-03-06 14:04:33 -05:00
9b5f741e85 Reproducing CG can be more useful now 2024-03-06 00:03:16 +00:00
Peter Boyle
436bf1d9d3 Merge pull request #455 from clarkedavida/hisq_fat_links
Hisq fat links
2024-02-29 15:29:39 -05:00
Peter Boyle
cd15abe9d1 Mrhs prep 2024-02-27 11:41:13 -05:00
Dennis Bollweg
b507fe209c Added SpinColourMatrix case to sliceSum Test 2024-02-27 11:28:32 -05:00
david clarke
94581e3c7a accelerator_for is broken 2024-02-23 15:58:33 -07:00
Dennis Bollweg
15878f7613 sliceSumReduction_cub_large now also faster than CPU on Frontier 2024-02-16 13:55:21 -05:00
dbollweg
6f3455900e Adding sliceSumReduction_cub_small/large since hipcub cannot deal with arb. large vobjs 2024-02-16 13:15:02 -05:00
dbollweg
b5659d106e more test cases 2024-02-09 13:37:14 -05:00
dbollweg
9514035b87 refactor slicesum: slicesum uses GPU version by default now 2024-02-09 13:02:28 -05:00
dbollweg
ab2de131bd work towards sliceSum for sycl backend 2024-02-06 13:24:45 -05:00
Dennis Bollweg
b8b9dc952d Async memcpy's and cleanup 2024-02-01 17:55:35 -05:00
Dennis Bollweg
79a6ed32d8 Use accelerator_for2d and DeviceSegmentedRecude to avoid kernel launch latencies 2024-02-01 16:41:03 -05:00
dbollweg
caa5f97723 Add sliceSum gpu using cub/hipcub 2024-01-31 16:50:06 -05:00
david clarke
4924b3209e projectU3 yields a unitary matrix 2024-01-23 14:43:58 -07:00
Peter Boyle
eb702f581b Running on 12 rhs on 18 nodes of frontier 2024-01-22 17:44:15 -05:00
david clarke
f5b3d582b0 first attempt at U3 projection 2024-01-22 02:49:40 -07:00
david clarke
981c93d67a update Test_fatLinks to accept Naik 2024-01-21 21:09:19 -07:00
Peter Boyle
d967eb53de Working for first time 2024-01-17 16:31:12 -05:00
Peter Boyle
25f71913b7 MultiRHS coarse 2024-01-04 12:01:17 -05:00
Peter Boyle
d5fd90b2f3 Add 48^3 rtest 2024-01-04 12:00:01 -05:00
Peter Boyle
22c611bd1a Delete temp file 2023-12-21 18:32:31 -05:00
Peter Boyle
c9bb1bf8ea Passing new BLAs based 2023-12-21 18:31:17 -05:00
Peter Boyle
9e489887cf General coarse multiRHS move to BLAS implementation 2023-12-21 15:24:48 -05:00
Peter Boyle
abcd6b8cb6 Faster version 2023-12-19 15:17:46 -05:00
Peter Boyle
6835a7f208 Better logging, test on 81 point stencil 2023-11-29 19:20:47 -05:00
Peter Boyle
f59993b979 Nbasis§ 2023-11-29 09:47:36 -05:00
Peter Boyle
e859a199df Reduce volume to interior for coarse stencil -- worth up to 4x gain 2023-11-28 10:23:16 -05:00
Peter Boyle
0a3682ad0b MultiRHS work 2023-11-28 07:43:37 -05:00
Peter Boyle
59abaeb5cd Time stamp 2023-11-24 12:56:45 -05:00
Peter Boyle
b302ad3d49 multiRHS test in place, passes Yay! 2023-11-23 18:20:15 -05:00
Peter Boyle
09946cf1ba Improved, works on 48^3 moving to multiRHS optimisations 2023-11-15 18:03:05 -05:00
david clarke
9cd4128833 fix naik bug 2023-11-03 14:11:38 -06:00
david clarke
df9b958c40 naik now returns separately 2023-10-30 17:40:53 -06:00
david clarke
3d3376d1a3 LePage works, trying Naik 2023-10-27 16:26:31 -06:00