portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-07-07 19:03:30 +01:00

Author	SHA1	Message	Date
Christopher Kelly	d161c2dc35	Improved formatting of timing output in mixed-prec multishift. In test of mixed-prec multishift, added comparison against full double precision multishift, both for timing and to cross-check the results.	2021-01-20 15:42:06 -05:00
Christopher Kelly	7a06826cf1	Added option to NerscIO to disable exit on failing plaquette check, allowing for circumvention of factor of 2 error in CPS-generated G-parity config headers. Adapted mixed-prec multi-shift test to new way to pass gauge BC directions and added cmdline option to perform the G-parity plaquette comparison with the corrected plaquette when loading config.	2021-01-20 13:31:50 -05:00
Christopher Kelly	c3712b8e06	Merge branch 'develop' into feature/gparity_HMC	2021-01-20 11:48:52 -05:00
Christopher Kelly	901ee77b84	Mixed precision multishift test can now be performed with/without G-parity using cmdline check and can load a pregenerated configuration	2021-01-20 11:45:44 -05:00
Peter Boyle	b0339bc5a4	Merge branch 'feature/conjugate-bc-dirs' into develop	2021-01-15 09:28:39 -05:00
Peter Boyle	3c23a947cc	Fixed test for very much non-unit det	2021-01-15 09:16:02 -05:00
Peter Boyle	56111bb823	Merge branch 'develop' into feature/conjugate-bc-dirs	2021-01-14 21:01:22 -05:00
Peter Boyle	99445673f6	Gparity fix, and plaquette IO	2021-01-14 21:00:36 -05:00
Peter Boyle	97a59643f7	Red black coarse space	2021-01-14 20:49:13 -05:00
Peter Boyle	579595f547	Red black on coarse space	2021-01-14 20:48:35 -05:00
Peter Boyle	281ac5fc12	Red black support on coarse	2021-01-14 20:48:08 -05:00
Peter Boyle	d8fa903b02	G5 on coarse spaces	2021-01-14 20:47:28 -05:00
Peter Boyle	eaff0f3aeb	Gamma5 on coarse spaces	2021-01-14 20:46:58 -05:00
Peter Boyle	e8e20c01b2	Coarsened vector test	2021-01-14 20:46:21 -05:00
Peter Boyle	a4afc3ea2a	Red black coarse space	2021-01-14 20:44:16 -05:00
Christopher Kelly	1b84f59273	Added a mixed precision multishift algorithm for which the matrix multiplies are performed in single precision but the search directions are accumulated in double precision. A reliable update step is performed at a tunable frequency to correct the residual. A final mixed-prec single-shift solve is performed on each pole to perform cleanup if necessary. A test is provided to demonstrate the algorithm.	2021-01-06 12:24:44 -05:00
Christopher Kelly	1fb41a4300	Added copyLane function to Tensor_extract_merge.h which copies one lane of data from an input tensor object to a different lane of an output tensor object of potentially different precision. precisionChange lattice function now uses copyLane to remove need for temporary scalar objects, reducing register footprint and significantly improving performance.	2021-01-06 11:50:56 -05:00
Christopher Kelly	287bac946f	ConjugateGradientMixedPrec now stores final true residual and uses the precisionChange workspaces for improved efficiency	2021-01-06 09:50:41 -05:00
Christopher Kelly	80c14be65e	Added core test to check precision change	2021-01-06 09:34:44 -05:00
Christopher Kelly	d7a2a4852d	Reimplemented precisionChange to run on GPUs. A workspace containing the mapping table can be optionally precomputed and reused for improved performance.	2021-01-06 09:30:49 -05:00
Christopher Kelly	d185f2eaa7	OneFlavourEvenOddRatioRationalPseudoFermionAction now derives from GeneralEvenOddRatioRationalPseudoFermionAction, simply performs transcription of parameters	2020-12-23 16:26:10 -05:00
Christopher Kelly	813d4cd900	Added test program that ensures the generic checkerboarded RHMC (with parameters set appropriately) gives the same answer as the existing 1f code	2020-12-23 16:01:42 -05:00
Christopher Kelly	75c6c6b173	General RHMC pseudofermion action now allows for different rational approximations to be used in the MD and action evaluation	2020-12-23 11:19:26 -05:00
Christopher Kelly	220ad5e3ee	Added more verbose log output to GeneralEvenOddRatioRationalPseudoFermionAction. In GeneralEvenOddRatioRationalPseudoFermionAction, setting the bounds check frequency to 0 now disables the check.	2020-12-22 11:08:22 -05:00
Christopher Kelly	ba5dc670a5	Reimplemented GparityWilsonImpl::InsertForce5D to run efficiently on GPUs. Swapped order of templated tensor code and c-number specializations in Tensor_outer.h to fix compile issue with type deduction on Summit.	2020-12-22 10:10:07 -05:00
Peter Boyle	3fe75bc7cb	Merge pull request #329 from nmeyer-ur/feature/a64fx-3: Revised dslash/dwf kernels for A64FX	2020-12-20 08:17:15 -05:00
Nils Meyer	45d49d8648	clean up	2020-12-19 03:35:18 +01:00
Nils Meyer	6013183361	removed Asm impls	2020-12-19 03:25:01 +01:00
Nils Meyer	4b882e8056	fixed lost bracket	2020-12-19 03:09:20 +01:00
Nils Meyer	3f9ae6e7e7	Merge branch 'develop' into feature/a64fx-3	2020-12-19 02:37:11 +01:00
Nils Meyer	909acd55cd	vnum variant for prefetches	2020-12-19 02:00:22 +01:00
Nils Meyer	4dd9e39e0d	up to +36% performance gain for dslash/dwf on QPACE 4 using GCC 10.1.1	2020-12-19 00:54:31 +01:00
Christopher Kelly	a0ca362690	Added an RHMC pseudofermion action, GeneralEvenOddRatioRationalPseudoFermionAction, that works for an arbitrary fractional power, not just a square root. Added a test evolution for the above, Test_rhmc_EOWilsonRatioPowQuarter, demonstrating conservation of Hamiltonian. Fixed HMC ignoring the MetropolisTest parameter of HMCparameters.	2020-12-17 16:21:58 -05:00
Christopher Kelly	249b6e61ec	For G-parity BCs the Nd-1 direction is now assumed to be the time direction, and setting a twist in this direction will apply antiperiodic BCs. Added option to run Test_gparity with antiperiodic time BCs.	2020-12-17 14:09:00 -05:00
Peter Boyle	7adb253e25	Merge pull request #328 from mmphys/feature/mrespatch: Enable existing conserved current code for CUDA	2020-12-17 11:10:29 -05:00
Michael Marshall	873519e960	Enable existing conserved current code for CUDA (compiles OK for CUDA 10.1). Add option to Test_cayley_mres to load a configuration	2020-12-14 16:06:10 +00:00
Peter Boyle	9aec4a3c26	SYCL	2020-12-10 02:11:17 -08:00
Peter Boyle	70510d151b	Merge pull request #327 from paboyle/feature/gparity_twist_GPU: Feature/gparity twist gpu	2020-12-07 12:02:20 -05:00
Christopher Kelly	9e7bacb5a4	Merge branch 'develop' into feature/gparity_twist_GPU	2020-12-07 11:55:39 -05:00
Christopher Kelly	2ef1fa66a8	Improved performance of G-parity kernel for GPUs by simplifying multLink implementation	2020-12-07 11:53:35 -05:00
Peter Boyle	cf76741ec6	Intel DPCPP Gold happy now (compiles all, runs Benchmark_dwf_fp32 )	2020-12-03 03:47:11 -08:00
Peter Boyle	497e7c1c40	Duplicate code	2020-12-02 17:55:30 -08:00
Peter Boyle	888eacd3b8	Merge branch 'develop' of https://github.com/paboyle/Grid into develop	2020-11-24 21:46:33 -05:00
Peter Boyle	321f0f51b5	Project to SU(N)	2020-11-24 21:46:10 -05:00
Peter Boyle	30ad9578a2	Merge branch 'lehner-feature/gpt' into develop	2020-11-24 06:10:24 -05:00
Peter Boyle	9dce101586	Merge branch 'feature/gpt' of https://github.com/lehner/Grid into lehner-feature/gpt	2020-11-24 06:10:16 -05:00
Peter Boyle	97e264d0ff	Christoph's changes	2020-11-23 15:46:11 +00:00
Peter Boyle	683a5e5bf5	Stencil use host vector for integer table on enable-shared=no and mirror it on device	2020-11-23 15:39:51 +00:00
Peter Boyle	d4861a362c	Stencil use non-UVM memory for look up table on enable-shared=no	2020-11-23 15:38:49 +00:00
Peter Boyle	5ff3eae027	Merge branch 'develop' of https://github.com/paboyle/Grid into develop	2020-11-20 13:14:44 -05:00


6434 Commits