mirror of https://github.com/paboyle/Grid.git synced 2026-04-20 10:41:01 +01:00
Commit Graph

7143 Commits

Author SHA1 Message Date
Peter Boyle 6252ffaf76 No unified 2023-04-03 18:25:22 -04:00
Peter Boyle dcf172da3b Merge pull request #415 from paboyle/feature/block_lanczos22
Feature/block lanczos22
2023-03-24 12:08:16 -04:00
Peter Boyle d57ed25071 Merge branch 'feature/dirichlet' into feature/block_lanczos22 2023-03-24 12:08:09 -04:00
Peter Boyle 8a1b9073f9 Mshift update 2023-03-23 15:39:30 -04:00
Peter Boyle 1a7114d4b9 Temporary algorithm while sorting out mixed prec 2023-03-23 15:38:35 -04:00
Peter Boyle 3f385f717c Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet
Conflicts:
	systems/PVC/benchmarks/run-2tile-mpi.sh
	systems/PVC/config-command
2023-03-23 14:52:53 -04:00
Peter Boyle c180a52518 Merge branch 'feature/dirichlet' of https://www.github.com/paboyle/Grid into feature/dirichlet 2023-03-23 10:28:01 -04:00
Peter Boyle 90130e25e9 TODO list 2023-03-23 10:27:02 -04:00
Peter Boyle 23298acb81 Merge pull request #424 from giltirn/feature/dirichlet-precchange
Precision change implementation
2023-03-22 23:04:52 -04:00
Peter Boyle 52384e34cf Discard on construct 2023-03-22 19:40:32 -04:00
Peter Boyle d0bb033ea2 Device resident GPU block buffer instead of UVM as hit likely UVM
bug. Code worked on CUDA 11.4 but fails on later drivers (certainly 530.30.02, but need to
find the perlmutter driver version).
2023-03-22 19:07:32 -04:00
Peter Boyle c6621806ca Compiling on laptop and running 2023-03-21 17:27:09 -04:00
Peter Boyle 0b6f0f6d2f Merge branch 'feature/dirichlet' of https://www.github.com/paboyle/Grid into feature/dirichlet 2023-03-21 16:06:55 -04:00
Peter Boyle b5b759df73 Merge branch 'develop' into feature/dirichlet 2023-03-21 16:05:46 -04:00
Peter Boyle 7db8dd7a95 Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2023-03-21 16:04:27 -04:00
Peter Boyle 8b43be39c0 Config command 2023-03-21 16:00:52 -04:00
Peter Boyle f17f879206 Test update 2023-03-21 15:59:29 -04:00
Peter Boyle 68428fceab Integrator update 2023-03-21 15:58:49 -04:00
Peter Boyle 4135f2dcd1 Compressor 2023-03-21 15:41:41 -04:00
Peter Boyle c5bdf61215 AUdit fix 2023-03-21 15:38:39 -04:00
Peter Boyle 88e218e8ee Stencil updates 2023-03-21 15:37:58 -04:00
Peter Boyle 0f2b786436 Vector -> vector 2023-03-21 15:36:11 -04:00
Peter Boyle e1c326558a COmms improvements 2023-03-21 08:53:56 -07:00
Peter Boyle 39c0815d9e WriteDiscard 2023-03-21 08:57:29 -04:00
Peter Boyle a997d24743 Remove nofma 2023-03-14 12:10:31 -07:00
Peter Boyle 861e5d7f4c SYCL version update. Why do they keep making incompatible changes 2023-03-14 12:10:02 -07:00
Peter Boyle 14cc142a14 Warning remove 2023-03-14 12:09:26 -07:00
Peter Boyle f36b87deb5 syscall fix 2023-03-14 12:09:00 -07:00
Peter Boyle eeb6e0a6e3 Renable cache blocking and efficient UPI type SHM comms 2023-03-14 09:10:27 -07:00
Peter Boyle cad5b187dd Cleanup 2023-03-14 09:08:16 -07:00
Peter Boyle 87697eb07e SHared compile 2023-03-14 09:07:36 -07:00
Christopher Kelly 83d86943db Fixed compile bug in MemoryManagerShared caused by Audit function not being passed a string 2023-02-23 13:09:45 -05:00
Christopher Kelly e82cf1d311 Further prec-change improvements
Mixed prec CG algorithm has been modified to precompute precision change workspaces

As the original Test_dwf_mixedcg_prec has been coopted to do a performance stability and reproducibility test, requiring the single-prec CG to be run 200 times, I have created a new version of Test_dwf_mixedcg_prec in the solver subdirectory that just does the mixed vs double CG test
2023-02-23 09:45:29 -05:00
Christopher Kelly 1db58a8acc Precision change improvements
Added a new, much faster implementation of precision change that uses (optionally) a precomputed workspace containing pointer offsets that is device resident, such that all lattice copying occurs only on the device and no host<->device transfer is required, other than the pointer table. It also avoids the need to unpack and repack the fields using explicit lane copying. When this new precisionChange is called without a workspace, one will be computed on-the-fly; however it is still considerably faster than the original implementation.

In the special case of using double2 and when the Grids are the same, calls to the new precisionChange will automatically use precisionChangeFast, such that there is a single API call for all precision changes.

Reliable update and mixed-prec multishift have been modified to precompute precision change workspaces

Renamed the original precisionChange as precisionChangeOrig

Fixed incorrect pointer offset bug in copyLane

Added a test and a benchmark for precisionChange

Added a test for reliable update CG
2023-02-21 10:52:42 -05:00
Peter Boyle 796abfad80 Merge pull request #422 from fjosw/fix/NVCC_DIAG_PRAGMA_SUPPORT
Disable diagnostic pragma warnings for CUDA 12+
2023-01-17 09:34:49 -05:00
fjosw ad0270ac8c fix: diagnostic pragma warnings fixed for CUDA 12+ 2023-01-12 12:36:30 +00:00
Peter Boyle 4ca1bf7cca Added gauge invariance test 2022-12-21 07:23:16 -05:00
Peter Boyle 2ff868f7a5 CPU open doesn't need to free space 2022-12-20 05:10:23 -05:00
Peter Boyle ede02b6883 Memory manager debug Felix case 2022-12-20 05:10:23 -05:00
Peter Boyle 1822ced302 Bug fix 2022-12-20 05:10:23 -05:00
Peter Boyle 37ba32776f More logging 2022-12-20 05:10:23 -05:00
Peter Boyle 99b3697b03 More loggin 2022-12-20 05:10:23 -05:00
Peter Boyle 43a45ec97b SSC_START 2022-12-20 05:10:23 -05:00
Peter Boyle b00a4142e5 A=A fix 2022-12-20 05:10:23 -05:00
Peter Boyle 3791bc527b Logging pulled in from dirichlet branch 2022-12-20 05:10:23 -05:00
Peter Boyle d8c29f5fcf Updated FFT test for PETSc 2022-12-18 12:05:00 -05:00
Peter Boyle 281f8101fe Matt FFT test 2022-12-17 20:35:33 -05:00
Peter Boyle 472ed2dd5c Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-12-17 20:17:09 -05:00
Peter Boyle 4f85672674 Simpler test for PETSc 2022-12-17 20:16:11 -05:00
Peter Boyle dc747c54be Merge branch 'develop' into feature/dirichlet
Conflicts:
	Grid/qcd/action/fermion/WilsonCompressor.h
	Grid/stencil/Stencil.h
2022-12-13 08:24:58 -05:00