mirror of https://github.com/paboyle/Grid.git synced 2024-09-20 01:05:38 +01:00
Commit Graph

1568 Commits

Author SHA1 Message Date
Peter Boyle
b461184797 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2024-07-23 09:53:58 -04:00
Peter Boyle
486412635a 8^4 test for PETSc 2024-07-22 15:25:17 -04:00
Peter Boyle
8b23a1546a Force compile temporarily 2024-07-22 15:24:56 -04:00
Peter Boyle
a901e4e369 Regressed performance for paper 2024-07-22 15:24:04 -04:00
Peter Boyle
804d9367d4 Regressed performance 2024-07-22 15:23:25 -04:00
Peter Boyle
41d8adca95 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2024-07-11 15:38:45 +00:00
Peter Boyle
7c246606c1 Schur additional case 2024-07-10 22:04:32 +00:00
Peter Boyle
12b8be7cb9 Best so far on 96^3 350 Evecs converged on 4^4 block 2024-06-18 16:31:37 -04:00
Peter Boyle
b5926c1d21 Broadcast time info 2024-06-11 15:16:25 -04:00
Peter Boyle
dc80b08969 96^3 test 2024-06-10 15:07:29 -04:00
Peter Boyle
0e607a55e7 Updated for 8^4 test 2024-05-26 20:53:05 +00:00
Peter Boyle
ad14a82742 Working aas good as possible on 48^3 in double 2024-05-16 10:55:45 -04:00
Peter Boyle
5c3ace7c3e Merge branch 'develop' into feature/scidac-wp1 2024-04-30 05:26:06 -04:00
Peter Boyle
98cf247f33 prepare to switch to mixed precision 2024-04-30 05:23:45 -04:00
Peter Boyle
0cf16522d1 Refine with HDCG choice 2024-04-30 05:22:14 -04:00
Peter Boyle
5147a42818 Updated hdcg 2024-04-05 01:05:57 -04:00
Peter Boyle
5b79d51c22 Improvements 2024-04-01 14:18:40 -04:00
Peter Boyle
59b0cc11df REduce the time in single 2024-03-26 00:42:40 +00:00
Peter Boyle
d01e5fa838 Improved FlightRecorder 2024-03-22 15:42:32 +00:00
Peter Boyle
fab1efb48c More britney logging improvements 2024-03-19 14:36:21 +00:00
2704b82084 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2024-03-12 15:16:24 +00:00
cf8632bbac Britney test option 2024-03-12 15:15:35 +00:00
2b4399f8b1 more HOST_NAME_MAX fix 2024-03-07 15:26:01 +09:00
Peter Boyle
cc04dc42dc Merge branch 'develop' into feature/scidac-wp1 2024-03-06 14:55:21 -05:00
Peter Boyle
070b61f08f Simplifying the MultiRHS solver to make it do SRHS *and* MRHS 2024-03-06 14:04:33 -05:00
9b5f741e85 Reproducing CG can be more useful now 2024-03-06 00:03:16 +00:00
Peter Boyle
436bf1d9d3 Merge pull request #455 from clarkedavida/hisq_fat_links (Hisq fat links) 2024-02-29 15:29:39 -05:00
Peter Boyle
cd15abe9d1 Mrhs prep 2024-02-27 11:41:13 -05:00
Dennis Bollweg
b507fe209c Added SpinColourMatrix case to sliceSum Test 2024-02-27 11:28:32 -05:00
david clarke
94581e3c7a accelerator_for is broken 2024-02-23 15:58:33 -07:00
Dennis Bollweg
15878f7613 sliceSumReduction_cub_large now also faster than CPU on Frontier 2024-02-16 13:55:21 -05:00
dbollweg
6f3455900e Adding sliceSumReduction_cub_small/large since hipcub cannot deal with arb. large vobjs 2024-02-16 13:15:02 -05:00
dbollweg
b5659d106e more test cases 2024-02-09 13:37:14 -05:00
dbollweg
9514035b87 refactor slicesum: slicesum uses GPU version by default now 2024-02-09 13:02:28 -05:00
dbollweg
ab2de131bd work towards sliceSum for sycl backend 2024-02-06 13:24:45 -05:00
Dennis Bollweg
b8b9dc952d Async memcpy's and cleanup 2024-02-01 17:55:35 -05:00
Dennis Bollweg
79a6ed32d8 Use accelerator_for2d and DeviceSegmentedRecude to avoid kernel launch latencies 2024-02-01 16:41:03 -05:00
dbollweg
caa5f97723 Add sliceSum gpu using cub/hipcub 2024-01-31 16:50:06 -05:00
david clarke
4924b3209e projectU3 yields a unitary matrix 2024-01-23 14:43:58 -07:00
Peter Boyle
eb702f581b Running on 12 rhs on 18 nodes of frontier 2024-01-22 17:44:15 -05:00
david clarke
f5b3d582b0 first attempt at U3 projection 2024-01-22 02:49:40 -07:00
david clarke
981c93d67a update Test_fatLinks to accept Naik 2024-01-21 21:09:19 -07:00
Peter Boyle
d967eb53de Working for first time 2024-01-17 16:31:12 -05:00
Peter Boyle
25f71913b7 MultiRHS coarse 2024-01-04 12:01:17 -05:00
Peter Boyle
d5fd90b2f3 Add 48^3 rtest 2024-01-04 12:00:01 -05:00
Peter Boyle
22c611bd1a Delete temp file 2023-12-21 18:32:31 -05:00
Peter Boyle
c9bb1bf8ea Passing new BLAs based 2023-12-21 18:31:17 -05:00
Peter Boyle
9e489887cf General coarse multiRHS move to BLAS implementation 2023-12-21 15:24:48 -05:00
Peter Boyle
abcd6b8cb6 Faster version 2023-12-19 15:17:46 -05:00
Peter Boyle
6835a7f208 Better logging, test on 81 point stencil 2023-11-29 19:20:47 -05:00