mirror of https://github.com/paboyle/Grid.git synced 2026-04-20 18:51:02 +01:00

8168 Commits

Author SHA1 Message Date
Christopher Kelly 1db58a8acc Precision change improvements
Added a new, much faster implementation of precision change that uses (optionally) a precomputed workspace containing pointer offsets that is device resident, such that all lattice copying occurs only on the device and no host<->device transfer is required, other than the pointer table. It also avoids the need to unpack and repack the fields using explicit lane copying. When this new precisionChange is called without a workspace, one will be computed on-the-fly; however it is still considerably faster than the original implementation.

In the special case of using double2 and when the Grids are the same, calls to the new precisionChange will automatically use precisionChangeFast, such that there is a single API call for all precision changes.

Reliable update and mixed-prec multishift have been modified to precompute precision change workspaces

Renamed the original precisionChange as precisionChangeOrig

Fixed incorrect pointer offset bug in copyLane

Added a test and a benchmark for precisionChange

Added a test for reliable update CG
2023-02-21 10:52:42 -05:00
rhodgson 920a51438d Added batched Mixed precision CG 2023-02-14 17:04:13 +00:00
rhodgson be528b6d27 Add batched block project/promote functions 2023-02-14 14:37:10 +00:00
Alessandro Lupo f73691ec47 Merge pull request #18 from nickforce989/sp2n/newbranch
Sp2n/newbranch
2023-02-13 10:22:27 +01:00
Peter Boyle ccd21f96ff Plaquette agreeing and moving to final form (slowly) need to optimise 2023-02-01 22:57:44 -05:00
Peter Boyle 4b90cb8888 First cut passes combining padded cell with general stencil towards fast plaquette and staggered force 2023-02-01 22:14:10 -05:00
Niccolo Forzano 7ebda3e9ec Merge commit 'b10e1b7bc8bec809f874e9e48a3ccc7b2619c9d1' into sp2n/newbranch 2023-01-19 12:10:18 +00:00
Niccolo Forzano b10e1b7bc8 Fixed files giving zero force computation on GPU, issue #8 2023-01-18 18:04:47 +00:00
Peter Boyle 796abfad80 Merge pull request #422 from fjosw/fix/NVCC_DIAG_PRAGMA_SUPPORT
Disable diagnostic pragma warnings for CUDA 12+
2023-01-17 09:34:49 -05:00
fjosw ad0270ac8c fix: diagnostic pragma warnings fixed for CUDA 12+ 2023-01-12 12:36:30 +00:00
Makis Kappas 7d62f1d6d2 Populate the Cshift_table in the GPU
Cshift is allocated in Unified memory and used
in the LambdaApply kernels but also populated
from the host. This creates a lot of Unified HtoD
and DtoH mem operations and has a negative effect
on performance. With this commit we populate the
Cshift table in the device with the
populate_Cshift_table() kernel.
2023-01-11 21:26:25 +00:00
Christoph Lehner 458c943987 merged upstream 2022-12-31 11:16:21 +02:00
Christoph Lehner 88015b0858 Split sum in rankSum and GlobalSum 2022-12-26 10:01:32 +01:00
Peter Boyle 4ca1bf7cca Added gauge invariance test 2022-12-21 07:23:16 -05:00
Peter Boyle 2ff868f7a5 CPU open doesn't need to free space 2022-12-20 05:10:23 -05:00
Peter Boyle ede02b6883 Memory manager debug Felix case 2022-12-20 05:10:23 -05:00
Peter Boyle 1822ced302 Bug fix 2022-12-20 05:10:23 -05:00
Peter Boyle 37ba32776f More logging 2022-12-20 05:10:23 -05:00
Peter Boyle 99b3697b03 More logging 2022-12-20 05:10:23 -05:00
Peter Boyle 43a45ec97b SSC_START 2022-12-20 05:10:23 -05:00
Peter Boyle b00a4142e5 A=A fix 2022-12-20 05:10:23 -05:00
Peter Boyle 3791bc527b Logging pulled in from dirichlet branch 2022-12-20 05:10:23 -05:00
Alessandro Lupo d7dea44ce7 Merge pull request #17 from chillenzer/unify_gauge_groups
Fix compilation error in nvcc (closes #15)
2022-12-19 16:24:03 +00:00
Peter Boyle d8c29f5fcf Updated FFT test for PETSc 2022-12-18 12:05:00 -05:00
Julian Lenz 37b6b82869 Fix file extensions 2022-12-18 16:12:56 +00:00
Julian Lenz 92ad5b8f74 Compiler error fix: NVCC requires names for templ. par. 2022-12-18 15:50:19 +00:00
Peter Boyle 281f8101fe Matt FFT test 2022-12-17 20:35:33 -05:00
Peter Boyle 472ed2dd5c Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-12-17 20:17:09 -05:00
Peter Boyle 4f85672674 Simpler test for PETSc 2022-12-17 20:16:11 -05:00
Peter Boyle dc747c54be Merge branch 'develop' into feature/dirichlet
Conflicts:
	Grid/qcd/action/fermion/WilsonCompressor.h
	Grid/stencil/Stencil.h
2022-12-13 08:24:58 -05:00
Peter Boyle 140684d706 Head to head vs HMC 2022-12-13 08:15:38 -05:00
Peter Boyle 5bb7ba92fa Test for DDHMC force term 2022-12-13 08:15:11 -05:00
Peter Boyle b54d0f3c73 Smaller deltaH down to 7000s on t=0.5 trajectory 2022-12-13 08:14:27 -05:00
Peter Boyle ff6777a98d Variable depth experiments 2022-12-13 08:13:51 -05:00
Peter Boyle 07acfe89f2 Merge pull request #417 from rrhodgson/feature/fermtoprop
Feature/fermtoprop
2022-12-06 12:45:03 -05:00
rhodgson 40234f531f FermToProp accelerator_for -> thread_for 2022-12-06 17:34:51 +00:00
rhodgson d49694f38f PropToFerm fix 2022-12-06 15:48:54 +00:00
Alessandro Lupo 8c80f1c168 Merge pull request #14 from chillenzer/unify_gauge_groups
Unify gauge groups (closes #5)
2022-12-01 17:35:46 +00:00
Chulwoo Jung dc6a38f177 Minor cleanup 2022-11-30 17:13:12 -05:00
Chulwoo Jung 82c1ecf60f Block lanczos added 2022-11-30 16:08:40 -05:00
Peter Boyle 67f569354e Partial dirichlet changes 2022-11-30 15:51:13 -05:00
Peter Boyle 97a098636d FermToProp 2022-11-30 15:36:35 -05:00
Peter Boyle e13930c8b2 Faster fermtoprop case 2022-11-30 15:11:29 -05:00
Julian Lenz 0af7d5a793 Rename Grid/qcd/utils/<Group>_impl.h -> Grid/qcd/utils/<Group>.h 2022-11-30 17:12:00 +00:00
Julian Lenz 505fa49983 Renamed SUn.h -> GaugeGroup.h 2022-11-30 17:09:48 +00:00
Julian Lenz 7bcf33def9 Removed Sp2n.h 2022-11-30 16:59:46 +00:00
Julian Lenz a13820656a Removed iSUnMatrix, etc. 2022-11-30 15:09:03 +00:00
Julian Lenz fa71b46a41 Hide nsp 2022-11-30 14:44:23 +00:00
Julian Lenz b8b3ae6ac1 Make helper functions private 2022-11-30 13:29:14 +00:00
Julian Lenz 55c008da21 Removed forward declaration 2022-11-30 13:12:21 +00:00