mirror of https://github.com/paboyle/Grid.git synced 2025-10-25 10:09:34 +01:00
Commit Graph

1584 Commits

Author SHA1 Message Date
Dennis Bollweg
b8b9dc952d Async memcpy's and cleanup 2024-02-01 17:55:35 -05:00
Dennis Bollweg
79a6ed32d8 Use accelerator_for2d and DeviceSegmentedReduce to avoid kernel launch latencies 2024-02-01 16:41:03 -05:00
dbollweg
caa5f97723 Add sliceSum gpu using cub/hipcub 2024-01-31 16:50:06 -05:00
david clarke
4924b3209e projectU3 yields a unitary matrix 2024-01-23 14:43:58 -07:00
Peter Boyle
eb702f581b Running on 12 rhs on 18 nodes of frontier 2024-01-22 17:44:15 -05:00
david clarke
f5b3d582b0 first attempt at U3 projection 2024-01-22 02:49:40 -07:00
david clarke
981c93d67a update Test_fatLinks to accept Naik 2024-01-21 21:09:19 -07:00
Peter Boyle
d967eb53de Working for first time 2024-01-17 16:31:12 -05:00
Peter Boyle
25f71913b7 MultiRHS coarse 2024-01-04 12:01:17 -05:00
Peter Boyle
d5fd90b2f3 Add 48^3 rtest 2024-01-04 12:00:01 -05:00
Peter Boyle
22c611bd1a Delete temp file 2023-12-21 18:32:31 -05:00
Peter Boyle
c9bb1bf8ea Passing new BLAS based 2023-12-21 18:31:17 -05:00
Peter Boyle
9e489887cf General coarse multiRHS move to BLAS implementation 2023-12-21 15:24:48 -05:00
Peter Boyle
abcd6b8cb6 Faster version 2023-12-19 15:17:46 -05:00
Peter Boyle
6835a7f208 Better logging, test on 81 point stencil 2023-11-29 19:20:47 -05:00
Peter Boyle
f59993b979 Nbasis 2023-11-29 09:47:36 -05:00
Peter Boyle
e859a199df Reduce volume to interior for coarse stencil -- worth up to 4x gain 2023-11-28 10:23:16 -05:00
Peter Boyle
0a3682ad0b MultiRHS work 2023-11-28 07:43:37 -05:00
Peter Boyle
59abaeb5cd Time stamp 2023-11-24 12:56:45 -05:00
Peter Boyle
b302ad3d49 multiRHS test in place, passes Yay! 2023-11-23 18:20:15 -05:00
Peter Boyle
09946cf1ba Improved, works on 48^3 moving to multiRHS optimisations 2023-11-15 18:03:05 -05:00
david clarke
9cd4128833 fix naik bug 2023-11-03 14:11:38 -06:00
david clarke
df9b958c40 naik now returns separately 2023-10-30 17:40:53 -06:00
david clarke
3d3376d1a3 LePage works, trying Naik 2023-10-27 16:26:31 -06:00
Peter Boyle
9c9c42d0df Tests on frontier with real speed up . 3.5x on 16^3 at mq=0.01 2023-10-20 19:27:13 -04:00
Peter Boyle
0ae4478cd9 Checkpoint the subspace and ldop 2023-10-20 19:27:13 -04:00
Peter Boyle
ae4e705e09 Use random vec as easier for debug 2023-10-20 19:27:13 -04:00
david clarke
21ed6ac0f4 added floating-point support 2023-10-20 13:54:26 -06:00
david clarke
7bb8ab7000 improve smearing templating 2023-10-20 08:41:02 -06:00
david clarke
391fd9cc6a try lepage term 2023-10-17 14:57:15 -06:00
david clarke
36600899e2 working 7-link; Grid_log; generalShift 2023-10-12 11:11:39 -06:00
david clarke
b9c70d156b Merge branch 'develop' into hisq_fat_links 2023-10-10 22:44:17 -06:00
david clarke
eb89579fe7 Merge remote-tracking branch 'origin/develop' into develop 2023-10-10 22:43:51 -06:00
david clarke
0cfd13d18b 7-link working 2023-10-10 22:41:52 -06:00
Peter Boyle
2111e7ab5f Run at physical mass 2023-10-06 21:20:21 -04:00
Peter Boyle
a751c42cc5 Checkpoint restore the setup 2023-10-06 21:03:08 -04:00
Peter Boyle
b58fd80379 I/O for coarse op and reorganise multigrid headers 2023-10-06 13:43:46 -04:00
Peter Boyle
3bc2da5321 Merge branch 'feature/scidac-wp1' of https://github.com/paboyle/Grid into feature/scidac-wp1 2023-10-05 16:57:59 -04:00
Peter Boyle
2d710d6bfd Optimised parameters for 16^3 2023-10-05 16:56:55 -04:00
Peter Boyle
6532b7f32b Eliminate older inefficient coarsening implementation 2023-10-05 16:56:15 -04:00
Peter Boyle
fcf5023845 Running on Frontier 2023-10-05 16:50:59 -04:00
Peter Boyle
737d3ffb98 ADEF1 and 1 hop projection 2023-10-03 14:22:18 -04:00
Peter Boyle
8a70314f54 Merge branch 'develop' into feature/scidac-wp1 2023-10-02 17:24:55 -04:00
Peter Boyle
c5f1420dea Merge remote-tracking branch 'LupoA/develop' into LupoA-develop 2023-10-02 16:22:35 -04:00
Peter Boyle
018e6da872 Merge pull request #440 from giltirn/feature/paddedcellgauge: Feature/paddedcellgauge 2023-10-02 10:00:42 -04:00
Peter Boyle
e187bcb85c Updating 2023-09-29 17:10:17 -04:00
Peter Boyle
be18ffe3b4 Further tuning and lanczos 2023-09-27 16:21:58 -04:00
Peter Boyle
3a86cce8c1 Compile 2023-09-27 16:19:18 -04:00
Peter Boyle
37884d369f Coarse space is expensive, but gives a speed up in fine matrix multiplies now. Down to optimisation 2023-09-25 17:24:19 -04:00
Peter Boyle
9246e653cd Basic non-local coarsening of operator test 2023-09-25 17:20:58 -04:00