1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-15 02:05:37 +00:00
Commit Graph

57 Commits

Author SHA1 Message Date
Peter Boyle
b2b5137d28 Finally starting to get decent performance on Volta 2018-07-13 12:06:18 -04:00
Peter Boyle
b1265ae867 Prettify code 2018-07-05 07:08:06 -04:00
paboyle
3e947527cb Move looping over "s" and "site" into kernels for GPU optimisatoin 2018-06-27 21:29:43 +01:00
paboyle
b710fec6ea Gpu code first version of specialised kernel 2018-06-13 20:34:39 +01:00
Peter Boyle
eb7d34a4cc GPU version 2018-05-14 19:41:47 -04:00
Peter Boyle
b15db11c60 Kernels -> pure static object to enable device execution 2018-03-24 19:35:20 -04:00
Peter Boyle
4e1272fabf Kernels need to be static to work on GPU. No reference to host resident data 2018-03-22 18:44:53 -04:00
Peter Boyle
8a1d303ab9 GPU friendly stencil improvements 2018-03-19 07:11:03 -04:00
paboyle
3277bda130 View introduction to prepare for accelerator offload.
Probably same problem exists for stencil object
2018-03-04 16:38:08 +00:00
paboyle
dcf6517a93 Accelerator offload and copy Opt into the kernel for GPU host var safety 2018-02-02 11:35:35 +00:00
paboyle
e4df025d01 Accelerator related 2018-02-01 23:20:05 +00:00
paboyle
87ee592176 Pragma changes and layout and warning elimination for nvcc 2018-01-24 13:14:09 +00:00
paboyle
a97ad1a51d Namespce 2018-01-14 23:01:01 +00:00
Lanny91
1bd311ba9c Faster sequential conserved current implementation, now compatible with 5D vectorisation & G-parity. 2017-06-16 16:43:15 +01:00
Lanny91
41af8c12d7 Code cleaning for conserved current contractions. Will now be easier to implement mobius conserved current. 2017-06-16 16:38:59 +01:00
Lanny91
5633a2db20 Faster implementation of conserved current site contraction. Added 5D vectorised support, but not G-parity. 2017-06-12 10:41:02 +01:00
Lanny91
ca1077c560 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/rare_kaon
# Conflicts:
#	lib/qcd/action/fermion/WilsonFermion5D.cc
#	tests/hadrons/Test_hadrons_rarekaon.cc
2017-05-04 16:22:33 +01:00
Peter Boyle
2ce898efa3 Pretty code 2017-04-26 02:34:25 -04:00
Lanny91
44260643f6 First conserved current implementation for Wilson fermions only. Not implemented for Gparity or 5D-vectorised Wilson fermions. 2017-04-25 18:00:24 +01:00
paboyle
abba44a837 Hand unrolled for overlapped comms 2017-04-22 17:45:17 +01:00
Peter Boyle
1d1b225497 Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node). 2017-04-22 09:05:28 -04:00
paboyle
736bf3c866 Major rework of stencil. Half precision and MPI3 now working. 2017-04-22 11:33:50 +01:00
paboyle
2c246551d0 Overlap comms and compute options in wilson kernels 2017-02-07 01:37:10 -05:00
Peter Boyle
caba0d42a5 L1p controls 2016-12-22 17:52:55 +00:00
azusayamaguchi
b7d55f7dfb Fix a typo in reorg of the --dslash-asm 2016-11-04 11:35:08 +00:00
paboyle
bb94ddd0eb Tidy up of mpi3; also some cleaning of the dslash controls. 2016-11-02 08:07:09 +00:00
azusayamaguchi
c190221fd3 Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
paboyle
b58adc6a4b commVector 2016-10-20 17:00:15 +01:00
Guido Cossu
c78bbd0f8c Fix ASM compilation 2016-10-04 15:37:32 +01:00
Guido Cossu
b9c80318a2 Merge branch 'develop' into feature/hirep 2016-09-13 10:01:51 +01:00
Guido Cossu
f76f281e58 Cleaning files after fix 2016-09-09 11:34:25 +01:00
Guido Cossu
aa20cc8b52 Fixing compilation error with AVX512 flag 2016-09-09 02:58:52 -07:00
Guido Cossu
0fd179fb33 Merge branch 'develop' into feature/hirep 2016-09-01 12:59:53 +01:00
paboyle
90e70790f3 Feature for z-Mobius prep 2016-08-15 22:31:29 +01:00
Guido Cossu
089f0ab582 Debugged HMC for Creutz relation 2016-07-28 16:44:41 +01:00
Guido Cossu
b93e18ed50 Modified the Dirac Kernel class to compile with different number of colours
Added the general push_back functionality to accomodate for all defined representations

Compiles, not tested
2016-07-18 16:36:28 +01:00
paboyle
6d58cb2a68 Enable reordering of the loops in the assembler for cache friendly.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
paboyle
55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
paboyle
53d06046b0 Compiling updates for KNL 2016-06-03 03:47:54 -07:00
paboyle
139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
paboyle
165bffc2e7 Avx512 changes for assembler kernels 2016-03-26 22:25:45 -06:00
paboyle
fc6ad65751 Pushed the overlap comms tweaks 2016-01-11 06:34:22 -08:00
paboyle
331768dcff Added overlap comms compute mode 2016-01-03 01:38:11 +00:00
paboyle
aae8bf31a7 Global edit adding copyright and license info to every source file. 2016-01-02 14:51:32 +00:00
paboyle
34a0fde2ad Fixes to fermion force terms after sign of gamma_mu (0...3) change.
Thought I had already committed these.

Believe I have got the Gparity fermion force working.

* tests/Test_gpdwf_force.cc     -- correctly predicts dS for two flavour pseudofermion
                                   based on a small dt update of U field.

* tests/Test_hmc_EODWFRatio_Gparity.cc -- ran 1 trajectory on 8^4 with dH=0.21.

Need to accumulate a full plaquette log to believe fully which will take some hours of run time.
2015-12-15 23:14:12 +00:00
paboyle
3ce10aa975 Fix a regression failure on Mobius; chroma regression added 2015-12-10 22:55:00 +00:00
paboyle
05a7029600 Stencil change 2015-11-07 00:06:31 -08:00
paboyle
899ca41cb8 Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
	lib/qcd/action/fermion/WilsonFermion5D.cc
2015-11-06 03:50:04 -08:00
paboyle
17af18dcab Changes for AVX512 assembler 2015-11-06 03:45:51 -08:00
Peter Boyle
28022755ae Stencil class name global change to StencilImpl typedef 2015-11-06 05:30:17 -06:00