1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-09-20 01:05:38 +01:00
Commit Graph

1399 Commits

Author SHA1 Message Date
Peter Boyle
018e6da872
Merge pull request #440 from giltirn/feature/paddedcellgauge
Feature/paddedcellgauge
2023-10-02 10:00:42 -04:00
Peter Boyle
b8a7004365 Partial fraction test 2023-08-14 15:17:03 -04:00
Christopher Kelly
f44dce390f Implemented acclerator-optimized versions of localCopyRegion and insertSliceLocal to speed up padding
Fixed const correctness on PaddedCell methods
Fixed compile issues on Crusher
Added timing breakdowns for PaddedCell::Expand and the padded implementations of the staples, visible under --log Performance
Optimized kernel for StaplePadded
Test_iwasaki_action_newstaple now repeats the calculation 10 times and reports average timings
2023-06-27 14:58:10 -04:00
Christopher Kelly
6f6844ccf1 Added new StapleAll and RectStapleAll functions that return the staples for all mu as an array
Modified plaq+rectangle gauge actions to use the above
Added a test code to confirm the above changes
2023-06-26 15:48:47 -04:00
Christopher Kelly
4c6613d72c Modified RectStapleDouble and RectStapleOptimised to use Gauge-BC respecting CshiftLink
Added test code tests/debug/Test_optimized_staple_gaugebc demonstrating equivalence of above to RectStapleUnoptimised for cconj gauge BCs
Removed optimized staple only being used for periodic gauge BCs; it is now always used
2023-06-26 10:20:23 -04:00
Christopher Kelly
4241c7d4a3 Imported coalescedReadGeneralPermute GPU implementation from Christoph
Fixed bug in padded staple code where extract was being called on the result before the GPU view was closed
Fixed compile issue with pointer cast in padded staple code
Added timing summaries of padded staple code and timing breakdown of staple implementation to Test_padded_cell_staple
2023-06-21 16:01:01 -04:00
Christopher Kelly
7b11075102 The user can now specify the implementation of Cshift used by the PaddedCell class through a virtual base class API. Implementations for default (regular Cshift) and for gauge links (which respects the gauge BCs)
Fixed const-correctness for PaddedCell and ConjugateGimpl::setDirections
Modified test code for padded-cell implementation of staple, rect-staple to use cconj BCs
2023-06-20 17:09:56 -04:00
Christopher Kelly
abc658dca5 Added coalescedReadGeneralPermute CPU implementation based on Christoph's GPT code
In a test code, implemented a padded-cell version of the staple and rectangular-staple calculation
2023-06-20 16:14:25 -04:00
Peter Boyle
f1c358b596 Additional tests 2023-06-15 10:43:04 -04:00
Peter Boyle
5465961e30 New test for FTHMC portion 2023-06-01 06:14:04 -04:00
Peter Boyle
9c8750f261 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2023-05-11 12:29:09 -04:00
Peter Boyle
91efd08179 Option for Qlat generator basis 2023-05-11 12:27:45 -04:00
Peter Boyle
1b8a834beb Debug 2023-05-11 12:22:24 -04:00
Peter Boyle
bd891fb3f5 tests to compile 2023-04-12 18:32:44 -04:00
Peter Boyle
866f48391a Temporary fix for develop incorrect results 2023-03-30 17:10:13 -04:00
Peter Boyle
c42e25e5b8 Dirichlet remove 2023-03-29 16:25:52 -04:00
Peter Boyle
d57ed25071
Merge branch 'feature/dirichlet' into feature/block_lanczos22 2023-03-24 12:08:09 -04:00
Peter Boyle
8a1b9073f9 Mshift update 2023-03-23 15:39:30 -04:00
Peter Boyle
3f385f717c Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet
Conflicts:
	systems/PVC/benchmarks/run-2tile-mpi.sh
	systems/PVC/config-command
2023-03-23 14:52:53 -04:00
Peter Boyle
23298acb81
Merge pull request #424 from giltirn/feature/dirichlet-precchange
Precision change implementation
2023-03-22 23:04:52 -04:00
Peter Boyle
c6621806ca Compiling on laptop and running 2023-03-21 17:27:09 -04:00
Peter Boyle
b5b759df73 Merge branch 'develop' into feature/dirichlet 2023-03-21 16:05:46 -04:00
Peter Boyle
7db8dd7a95 Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2023-03-21 16:04:27 -04:00
Peter Boyle
f17f879206 Test update 2023-03-21 15:59:29 -04:00
Christopher Kelly
e82cf1d311 Further prec-change improvements
Mixed prec CG algorithm has been modified to precompute precision change workspaces

As the original Test_dwf_mixedcg_prec has been coopted to do a performance stability and reproducibility test, requiring the single-prec CG to be run 200 times, I have created a new version of Test_dwf_mixedcg_prec in the solver subdirectory that just does the mixed vs double CG test
2023-02-23 09:45:29 -05:00
Christopher Kelly
1db58a8acc Precision change improvements
Added a new, much faster implementation of precision change that uses (optionally) a precomputed workspace containing pointer offsets that is device resident, such that all lattice copying occurs only on the device and no host<->device transfer is required, other than the pointer table. It also avoids the need to unpack and repack the fields using explicit lane copying. When this new precisionChange is called without a workspace, one will be computed on-the-fly; however it is still considerably faster than the original implementation.

In the special case of using double2 and when the Grids are the same, calls to the new precisionChange will automatically use precisionChangeFast, such that there is a single API call for all precision changes.

Reliable update and mixed-prec multishift have been modified to precompute precision change workspaces

Renamed the original precisionChange as precisionChangeOrig

Fixed incorrect pointer offset bug in copyLane

Added a test and a benchmark for precisionChange

Added a test for reliable update CG
2023-02-21 10:52:42 -05:00
Peter Boyle
ccd21f96ff Plaquette agreeing and moving to final form (slowly) need to optimise 2023-02-01 22:57:44 -05:00
Peter Boyle
4b90cb8888 First cut passes combining padded cell with general stencil towards fast plaquette and staggered force 2023-02-01 22:14:10 -05:00
Peter Boyle
4ca1bf7cca Added gauge invariance test 2022-12-21 07:23:16 -05:00
Peter Boyle
ede02b6883 Memory manager debug Felix case 2022-12-20 05:10:23 -05:00
Peter Boyle
d8c29f5fcf Updated FFT test for PETSc 2022-12-18 12:05:00 -05:00
Peter Boyle
281f8101fe Matt FFT test 2022-12-17 20:35:33 -05:00
Peter Boyle
472ed2dd5c Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-12-17 20:17:09 -05:00
Peter Boyle
4f85672674 Simpler test for PETSc 2022-12-17 20:16:11 -05:00
Peter Boyle
5bb7ba92fa Test for DDHMC force term 2022-12-13 08:15:11 -05:00
Chulwoo Jung
dc6a38f177 Minor cleanup 2022-11-30 17:13:12 -05:00
Chulwoo Jung
82c1ecf60f Block lanczos added 2022-11-30 16:08:40 -05:00
Peter Boyle
3dbfce5223 Tests clean build on HIP 2022-11-16 20:15:51 -05:00
Peter Boyle
e51eaedc56 Making tests compile 2022-11-15 22:58:30 -05:00
Peter Boyle
a3927a8a27 Dirichlet 2022-11-02 20:22:27 -04:00
Peter Boyle
c82b164f6b Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-10-04 17:41:48 -04:00
Christopher Kelly
66d001ec9e Refactored Wilson flow class; previously the class implemented both iterative and adaptive smearing, but only the iterative method was accessible through the Smearing base class. The implementation of Smearing also forced a clunky need to pass iterative smearing parameters through the constructor but adaptive smearing parameters through the function call. Now there is a WilsonFlowBase class that implements common functionality, and separate WilsonFlow (iterative) and WilsonFlowAdaptive (adaptive) classes, both of which implement Smearing virtual functions.
Modified the Wilson flow adaptive smearing step size update to implement the original Ramos definition of the distance, where previously it used the norm of a difference which scales with the volume and so would choose too coarse or too fine steps depending on the volume. This is based on Chulwoo's code.

Added a test comparing adaptive (with tuneable tolerance) to iterative Wilson flow smearing on a random gauge configuration.
2022-10-03 10:59:38 -04:00
Christopher Kelly
19da647e3c Added support for non-periodic gauge field implementations in the random gauge shift performed at the start of the HMC trajectory
(The above required exposing the gauge implementation to the HMC class through the Integrator class)
Made the random shift optional (default on) through a parameter in HMCparameters
Modified ConjugateBC::CshiftLink such that it supports any shift in  -L < shift < L rather than just +-1
Added a tester for the BC-respecting Cshift
Fixed a missing system header include in SSE4 intrinsics wrapper
Fixed sumD_cpu for single-prec types performing an incorrect conversion to a single-prec data type at the end, that fails to compile on some systems
2022-09-09 12:47:09 -04:00
Peter Boyle
1177b8f661 Merge branch 'develop' into feature/dirichlet 2022-08-31 19:05:57 -04:00
Peter Boyle
3c1c51f9aa Merge branch 'feature/dirichlet-gparity' into feature/dirichlet 2022-08-31 18:25:34 -04:00
Peter Boyle
8cc3c522c3
Merge pull request #409 from giltirn/feature/dirichlet-gparity-stage
Import round 5
2022-08-31 18:22:50 -04:00
Gurtej Kanwar
60dfb49afa Remove FP16 tests when FP16 is disabled 2022-08-21 17:29:55 +02:00
Peter Boyle
9b20f1449c Better timing 2022-07-28 11:37:12 -04:00
Christopher Kelly
33e4a0caee Imported changes from feature/gparity_HMC branch:
Rework of WilsonFlow class
		Fixed logic error in smear method where the step index was initialized to 1 rather than 0, resulting in the logged output value of tau being too large by epsilon
		Previously smear_adaptive would maintain the current value of tau as a class member variable whereas smear would compute it separately; now both methods maintain the current value internally and it is updated by the evolve_step routines. Both evolve methods are now const.
		smear_adaptive now also maintains the current value of epsilon internally, allowing it to be a const method and also allowing the same class instance to be reused without needing to be reset
		Replaced the fixed evaluation of the plaquette energy density and plaquette topological charge during the smearing with a highly flexible general strategy where the user can add arbitrary measurements as functional objects that are evaluated at an arbitrary frequency
	        By default the same plaquette-based measurements are performed, but additional example functions are provided where the smearing is performed with different choices of measurement that are returned as an array for further processing
		Added a method to compute the energy density using the Cloverleaf approach which has smaller discretization errors
	Added a new tensor utility operation, copyLane, which allows for the copying of a single SIMD lane between two instances of the same tensor type but potentially different precisions
	To LocalCoherenceLanczos, added the option to compute the high/low eval of the fine operator on every restart to aid in tuning the Chebyshev
	Added Test_field_array_io which demonstrates and tests a single-file write of an arbitrary array of fields
	Added Test_evec_compression which generates evecs using Lanczos and attempts to compress them using the local coherence technique
	Added Test_compressed_lanczos_gparity which demonstrates the local coherence Lanczos for G-parity BCs
	Added HMC main programs for the 40ID and 48ID G-parity lattices
2022-07-01 14:12:12 -04:00
Peter Boyle
1f903d9296 Merge branch 'feature/dirichlet' into feature/dirichlet-gparity 2022-07-01 12:12:50 -04:00