mirror of https://github.com/paboyle/Grid.git synced 2025-06-22 01:32:03 +01:00
Commit Graph

213 Commits

Author SHA1 Message Date
dd13937bb6 Better opt face gather scatter 2023-12-22 18:03:38 -05:00
9feb801bb9 Much simpler GPU implementation 2023-12-21 15:24:06 -05:00
48d1f0df89 Optimised partially, working 2023-12-21 12:33:47 -05:00
332563e037 Debugged, reducing verbose 2023-12-21 12:30:57 -05:00
2c54be651c Further updates 2023-11-29 09:43:29 -05:00
0a3682ad0b MultiRHS work 2023-11-28 07:43:37 -05:00
031f85247c multRHS initial support -- needs optimisation for multi project/promote.
Bug fix in freeing intermediate grids to stop double free
2023-11-23 18:18:35 -05:00
100e29e35e Allow expression as argument to norm2 2023-11-15 18:00:44 -05:00
0e6fa6f6b8 Don't need the Cshift for the period optimisation 2023-10-24 10:56:31 -04:00
aa5047a9e4 Faster blockProject blockPromote 2023-10-24 10:49:55 -04:00
1e79cc9cbe Avoid compiler error 2023-10-24 10:36:09 -04:00
9ab54c5565 Overlap comms & data copy/buffer assembly in Ghost zone exchange 2023-10-20 19:27:13 -04:00
5fac47a26d Faster halo exchange 2023-10-20 19:27:13 -04:00
f2b98d0dcc Const safety 2023-10-20 19:27:13 -04:00
80471bf762 Alternate implementation involving face operations 2023-10-20 19:27:13 -04:00
4d5f7e4377 Verbose change 2023-10-06 21:01:37 -04:00
3bc2da5321 Merge branch 'feature/scidac-wp1' of https://github.com/paboyle/Grid into feature/scidac-wp1 2023-10-05 16:57:59 -04:00
7b41b92d99 Only need to pad non-local dimensions 2023-10-05 16:55:48 -04:00
59b9d0e030 coalesceRead the blockSum 2023-10-05 16:54:48 -04:00
6a87487544 Running on Frontier, fix RNG big volume y2k, affecting 5D RNG 2023-10-05 16:50:59 -04:00
8a70314f54 Merge branch 'develop' into feature/scidac-wp1 2023-10-02 17:24:55 -04:00
c5f1420dea Merge remote-tracking branch 'LupoA/develop' into LupoA-develop 2023-10-02 16:22:35 -04:00
993f43ef4a Even odd use case 2023-09-07 10:53:06 -04:00
3e94838204 Spread out improvement 2023-08-25 17:31:28 -04:00
f44dce390f Implemented accelerator-optimized versions of localCopyRegion and insertSliceLocal to speed up padding
Fixed const correctness on PaddedCell methods
Fixed compile issues on Crusher
Added timing breakdowns for PaddedCell::Expand and the padded implementations of the staples, visible under --log Performance
Optimized kernel for StaplePadded
Test_iwasaki_action_newstaple now repeats the calculation 10 times and reports average timings
2023-06-27 14:58:10 -04:00
bb71e9a96a Added PaddedCell and GeneralisedLocalStencil header includes to standard base headers
Moved versions of the padded-cell implementations of staple and rect-staple from test code to WilsonLoops header
Added StapleAndRectStapleAll which is now called by the plaq+rectangle action class. Under the hood it uses the padded cell implementations with maximal reuse of the padded gauge links
2023-06-27 11:23:30 -04:00
7b11075102 The user can now specify the implementation of Cshift used by the PaddedCell class through a virtual base class API. Implementations for default (regular Cshift) and for gauge links (which respects the gauge BCs)
Fixed const-correctness for PaddedCell and ConjugateGimpl::setDirections
Modified test code for padded-cell implementation of staple, rect-staple to use cconj BCs
2023-06-20 17:09:56 -04:00
e09dfbf1c2 definitely the right merge upstream/develop 2023-06-16 14:19:46 +01:00
9c8750f261 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2023-05-11 12:29:09 -04:00
e8c60c355b Padded cell code 2023-05-11 12:25:50 -04:00
f534523ede Debug 2023-05-11 12:23:11 -04:00
778291230a expand ProjecOnGaugeGroup, change ProjectOnSp2nAlgebra into SpTa, fixing some of its issues 2023-04-04 17:48:13 +01:00
026e736dfa Projection on algebra can now be templated. Fix #12 2023-04-03 16:31:19 +01:00
4a261fab30 Changes premerge to develop 2023-03-28 20:04:21 -07:00
5068413cdb Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2023-03-28 08:35:38 -07:00
71c6960eea Comment 2023-03-28 08:34:24 -07:00
ddf6d5c9e3 Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2023-03-28 11:33:05 -04:00
2376156fbc Merge branch 'develop' into feature/dirichlet 2023-03-27 21:33:50 -07:00
4ea48ef0c4 Merge pull request #419 from lehner/feature/gpt
Separate rankSum from sum
2023-03-24 15:42:16 -04:00
5c85774ee3 Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2023-03-24 15:40:57 -04:00
d8a9a745d8 stream synchronise 2023-03-24 15:40:30 -04:00
d57ed25071 Merge branch 'feature/dirichlet' into feature/block_lanczos22 2023-03-24 12:08:09 -04:00
281488611a WriteDiscard on construct 2023-03-23 10:28:50 -04:00
23298acb81 Merge pull request #424 from giltirn/feature/dirichlet-precchange
Precision change implementation
2023-03-22 23:04:52 -04:00
52384e34cf Discard on construct 2023-03-22 19:40:32 -04:00
d0bb033ea2 Device-resident GPU block buffer instead of UVM, as this likely hit a UVM bug.
Code worked on CUDA 11.4 but fails on later drivers (certainly 530.30.02, but need to
find the Perlmutter driver version).
2023-03-22 19:07:32 -04:00
b5b759df73 Merge branch 'develop' into feature/dirichlet 2023-03-21 16:05:46 -04:00
bbbcd36ae5 Merge pull request #426 from rrhodgson/feature/LCDeflation
Batched Local Coherence Tools
2023-03-21 08:58:40 -04:00
39c0815d9e WriteDiscard 2023-03-21 08:57:29 -04:00
cbc053c3db Revert "projection on Sp2n algebra, to be used instead of Ta"
This reverts commit ba7f9d7b70.
2023-03-17 11:36:58 +00:00