dd13937bb6
Better opt face gather scatter
2023-12-22 18:03:38 -05:00
9feb801bb9
Much simpler GPU implementation
2023-12-21 15:24:06 -05:00
48d1f0df89
Optimised partially, working
2023-12-21 12:33:47 -05:00
332563e037
Debugged, reducing verbose
2023-12-21 12:30:57 -05:00
2c54be651c
Further updates
2023-11-29 09:43:29 -05:00
0a3682ad0b
MultiRHS work
2023-11-28 07:43:37 -05:00
031f85247c
multiRHS initial support -- needs optimisation for multi project/promote.
...
Bug fix in freeing intermediate grids to stop double free
2023-11-23 18:18:35 -05:00
100e29e35e
Allow expression as argument to norm2
2023-11-15 18:00:44 -05:00
0e6fa6f6b8
Don't need the Cshift for the period optimisation
2023-10-24 10:56:31 -04:00
aa5047a9e4
Faster blockProject blockPromote
2023-10-24 10:49:55 -04:00
1e79cc9cbe
Avoid compiler error
2023-10-24 10:36:09 -04:00
9ab54c5565
Overlap comms & data copy/buffer assembly in Ghost zone exchange
2023-10-20 19:27:13 -04:00
5fac47a26d
Faster halo exchange
2023-10-20 19:27:13 -04:00
f2b98d0dcc
Const safety
2023-10-20 19:27:13 -04:00
80471bf762
Alternate implementation involving face operations
2023-10-20 19:27:13 -04:00
4d5f7e4377
Verbose change
2023-10-06 21:01:37 -04:00
3bc2da5321
Merge branch 'feature/scidac-wp1' of https://github.com/paboyle/Grid into feature/scidac-wp1
2023-10-05 16:57:59 -04:00
7b41b92d99
Only need to pad non-local dimensions
2023-10-05 16:55:48 -04:00
59b9d0e030
coalesceRead the blockSum
2023-10-05 16:54:48 -04:00
6a87487544
Running on Frontier, fix RNG big volume y2k, affecting 5D RNG
2023-10-05 16:50:59 -04:00
8a70314f54
Merge branch 'develop' into feature/scidac-wp1
2023-10-02 17:24:55 -04:00
c5f1420dea
Merge remote-tracking branch 'LupoA/develop' into LupoA-develop
2023-10-02 16:22:35 -04:00
993f43ef4a
Even odd use case
2023-09-07 10:53:06 -04:00
3e94838204
Spread out improvement
2023-08-25 17:31:28 -04:00
f44dce390f
Implemented accelerator-optimized versions of localCopyRegion and insertSliceLocal to speed up padding
...
Fixed const correctness on PaddedCell methods
Fixed compile issues on Crusher
Added timing breakdowns for PaddedCell::Expand and the padded implementations of the staples, visible under --log Performance
Optimized kernel for StaplePadded
Test_iwasaki_action_newstaple now repeats the calculation 10 times and reports average timings
2023-06-27 14:58:10 -04:00
bb71e9a96a
Added PaddedCell and GeneralisedLocalStencil header includes to standard base headers
...
Moved versions of the padded-cell implementations of staple and rect-staple from test code to WilsonLoops header
Added StapleAndRectStapleAll which is now called by the plaq+rectangle action class. Under the hood it uses the padded cell implementations with maximal reuse of the padded gauge links
2023-06-27 11:23:30 -04:00
7b11075102
The user can now specify the implementation of Cshift used by the PaddedCell class through a virtual base class API. Implementations for default (regular Cshift) and for gauge links (which respects the gauge BCs)
...
Fixed const-correctness for PaddedCell and ConjugateGimpl::setDirections
Modified test code for padded-cell implementation of staple, rect-staple to use cconj BCs
2023-06-20 17:09:56 -04:00
e09dfbf1c2
definitely the right merge upstream/develop
2023-06-16 14:19:46 +01:00
9c8750f261
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2023-05-11 12:29:09 -04:00
e8c60c355b
Padded cell code
2023-05-11 12:25:50 -04:00
f534523ede
Debug
2023-05-11 12:23:11 -04:00
778291230a
expand ProjectOnGaugeGroup, change ProjectOnSp2nAlgebra into SpTa, fixing some of its issues
2023-04-04 17:48:13 +01:00
026e736dfa
Projection on algebra can now be templated. Fix #12
2023-04-03 16:31:19 +01:00
4a261fab30
Changes premerge to develop
2023-03-28 20:04:21 -07:00
5068413cdb
Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet
2023-03-28 08:35:38 -07:00
71c6960eea
Comment
2023-03-28 08:34:24 -07:00
ddf6d5c9e3
Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet
2023-03-28 11:33:05 -04:00
2376156fbc
Merge branch 'develop' into feature/dirichlet
2023-03-27 21:33:50 -07:00
4ea48ef0c4
Merge pull request #419 from lehner/feature/gpt
...
Separate rankSum from sum
2023-03-24 15:42:16 -04:00
5c85774ee3
Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet
2023-03-24 15:40:57 -04:00
d8a9a745d8
stream synchronise
2023-03-24 15:40:30 -04:00
d57ed25071
Merge branch 'feature/dirichlet' into feature/block_lanczos22
2023-03-24 12:08:09 -04:00
281488611a
WriteDiscard on construct
2023-03-23 10:28:50 -04:00
23298acb81
Merge pull request #424 from giltirn/feature/dirichlet-precchange
...
Precision change implementation
2023-03-22 23:04:52 -04:00
52384e34cf
Discard on construct
2023-03-22 19:40:32 -04:00
d0bb033ea2
Device resident GPU block buffer instead of UVM as hit likely UVM
...
bug. Code worked on CUDA 11.4 but fails on later drivers (certainly 530.30.02, but need to
find the Perlmutter driver version).
2023-03-22 19:07:32 -04:00
b5b759df73
Merge branch 'develop' into feature/dirichlet
2023-03-21 16:05:46 -04:00
bbbcd36ae5
Merge pull request #426 from rrhodgson/feature/LCDeflation
...
Batched Local Coherence Tools
2023-03-21 08:58:40 -04:00
39c0815d9e
WriteDiscard
2023-03-21 08:57:29 -04:00
cbc053c3db
Revert "projection on Sp2n algebra, to be used instead of Ta"
...
This reverts commit ba7f9d7b70.
2023-03-17 11:36:58 +00:00