portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2025-06-18 07:47:06 +01:00

Author	SHA1	Message	Date
paboyle	3e947527cb	Move looping over "s" and "site" into kernels for GPU optimisation	2018-06-27 21:29:43 +01:00
paboyle	b710fec6ea	Gpu code first version of specialised kernel	2018-06-13 20:34:39 +01:00
Peter Boyle	eb7d34a4cc	GPU version	2018-05-14 19:41:47 -04:00
Peter Boyle	b15db11c60	Kernels -> pure static object to enable device execution	2018-03-24 19:35:20 -04:00
Peter Boyle	4e1272fabf	Kernels need to be static to work on GPU. No reference to host resident data	2018-03-22 18:44:53 -04:00
Peter Boyle	8a1d303ab9	GPU friendly stencil improvements	2018-03-19 07:11:03 -04:00
paboyle	3277bda130	View introduction to prepare for accelerator offload. Probably same problem exists for stencil object	2018-03-04 16:38:08 +00:00
paboyle	dcf6517a93	Accelerator offload and copy Opt into the kernel for GPU host var safety	2018-02-02 11:35:35 +00:00
paboyle	e4df025d01	Accelerator related	2018-02-01 23:20:05 +00:00
paboyle	87ee592176	Pragma changes and layout and warning elimination for nvcc	2018-01-24 13:14:09 +00:00
paboyle	a97ad1a51d	Namespace	2018-01-14 23:01:01 +00:00
Lanny91	1bd311ba9c	Faster sequential conserved current implementation, now compatible with 5D vectorisation & G-parity.	2017-06-16 16:43:15 +01:00
Lanny91	41af8c12d7	Code cleaning for conserved current contractions. Will now be easier to implement mobius conserved current.	2017-06-16 16:38:59 +01:00
Lanny91	5633a2db20	Faster implementation of conserved current site contraction. Added 5D vectorised support, but not G-parity.	2017-06-12 10:41:02 +01:00
Lanny91	ca1077c560	Merge branch 'develop' of https://github.com/paboyle/Grid into feature/rare_kaon # Conflicts: # lib/qcd/action/fermion/WilsonFermion5D.cc # tests/hadrons/Test_hadrons_rarekaon.cc	2017-05-04 16:22:33 +01:00
Peter Boyle	2ce898efa3	Pretty code	2017-04-26 02:34:25 -04:00
Lanny91	44260643f6	First conserved current implementation for Wilson fermions only. Not implemented for Gparity or 5D-vectorised Wilson fermions.	2017-04-25 18:00:24 +01:00
paboyle	abba44a837	Hand unrolled for overlapped comms	2017-04-22 17:45:17 +01:00
Peter Boyle	1d1b225497	Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node).	2017-04-22 09:05:28 -04:00
paboyle	736bf3c866	Major rework of stencil. Half precision and MPI3 now working.	2017-04-22 11:33:50 +01:00
paboyle	2c246551d0	Overlap comms and compute options in wilson kernels	2017-02-07 01:37:10 -05:00
Peter Boyle	caba0d42a5	L1p controls	2016-12-22 17:52:55 +00:00
azusayamaguchi	b7d55f7dfb	Fix a typo in reorg of the --dslash-asm	2016-11-04 11:35:08 +00:00
paboyle	bb94ddd0eb	Tidy up of mpi3; also some cleaning of the dslash controls.	2016-11-02 08:07:09 +00:00
azusayamaguchi	c190221fd3	Internal SHM comms in non-simd directions working. Need to fix simd directions.	2016-10-22 18:14:27 +01:00
paboyle	b58adc6a4b	commVector	2016-10-20 17:00:15 +01:00
Guido Cossu	c78bbd0f8c	Fix ASM compilation	2016-10-04 15:37:32 +01:00
Guido Cossu	b9c80318a2	Merge branch 'develop' into feature/hirep	2016-09-13 10:01:51 +01:00
Guido Cossu	f76f281e58	Cleaning files after fix	2016-09-09 11:34:25 +01:00
Guido Cossu	aa20cc8b52	Fixing compilation error with AVX512 flag	2016-09-09 02:58:52 -07:00
Guido Cossu	0fd179fb33	Merge branch 'develop' into feature/hirep	2016-09-01 12:59:53 +01:00
paboyle	90e70790f3	Feature for z-Mobius prep	2016-08-15 22:31:29 +01:00
Guido Cossu	089f0ab582	Debugged HMC for Creutz relation	2016-07-28 16:44:41 +01:00
Guido Cossu	b93e18ed50	Modified the Dirac Kernel class to compile with different number of colours. Added the general push_back functionality to accommodate all defined representations. Compiles, not tested.	2016-07-18 16:36:28 +01:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
paboyle	53d06046b0	Compiling updates for KNL	2016-06-03 03:47:54 -07:00
paboyle	139cc5f1ae	Large change with KNL preparation	2016-06-03 03:24:26 -07:00
paboyle	165bffc2e7	Avx512 changes for assembler kernels	2016-03-26 22:25:45 -06:00
paboyle	fc6ad65751	Pushed the overlap comms tweaks	2016-01-11 06:34:22 -08:00
paboyle	331768dcff	Added overlap comms compute mode	2016-01-03 01:38:11 +00:00
paboyle	aae8bf31a7	Global edit adding copyright and license info to every source file.	2016-01-02 14:51:32 +00:00
paboyle	34a0fde2ad	Fixes to fermion force terms after sign of gamma_mu (0...3) change. Thought I had already committed these. Believe I have got the Gparity fermion force working. * tests/Test_gpdwf_force.cc -- correctly predicts dS for two flavour pseudofermion based on a small dt update of U field. * tests/Test_hmc_EODWFRatio_Gparity.cc -- ran 1 trajectory on 8^4 with dH=0.21. Need to accumulate a full plaquette log to believe fully which will take some hours of run time.	2015-12-15 23:14:12 +00:00
paboyle	3ce10aa975	Fix a regression failure on Mobius; chroma regression added	2015-12-10 22:55:00 +00:00
paboyle	05a7029600	Stencil change	2015-11-07 00:06:31 -08:00
paboyle	899ca41cb8	Merge branch 'master' of github.com:paboyle/Grid Conflicts: lib/qcd/action/fermion/WilsonFermion5D.cc	2015-11-06 03:50:04 -08:00
paboyle	17af18dcab	Changes for AVX512 assembler	2015-11-06 03:45:51 -08:00
Peter Boyle	28022755ae	Stencil class name global change to StencilImpl typedef	2015-11-06 05:30:17 -06:00
Peter Boyle	2f38ebc446	Reintroducing the hand unrolled loops	2015-09-08 17:45:30 +01:00
Peter Boyle	55cfc89459	* Finished the template/policy style introduction of gparity, except the gparity force terms. So valence sector looks ok. FermionOperatorImpl.h provides the policy classes. Expect HMC will introduce a smearing policy and a fermion representation change policy template param. Will also probably need multi-precision work. * HMC is running even-odd and non-checkerboarded (checked 4^4 wilson fermion/wilson gauge). There appears to be a bug in the multi-level integrator -- <e-dH> passes with single level but not with multi-level. In any case there looks to be quite a bit to clean up. This is the "const det" style implementation that is not appropriate yet for clover since it assumes that Mee is indept of the gauge fields. Easily fixed in future.	2015-08-15 23:25:49 +01:00

55 Commits