b2b5137d28
Finally starting to get decent performance on Volta
2018-07-13 12:06:18 -04:00
b1265ae867
Prettify code
2018-07-05 07:08:06 -04:00
3e947527cb
Move looping over "s" and "site" into kernels for GPU optimisation
2018-06-27 21:29:43 +01:00
b710fec6ea
GPU code: first version of specialised kernel
2018-06-13 20:34:39 +01:00
eb7d34a4cc
GPU version
2018-05-14 19:41:47 -04:00
b15db11c60
Kernels -> pure static object to enable device execution
2018-03-24 19:35:20 -04:00
4e1272fabf
Kernels need to be static to work on GPU. No reference to host resident data
2018-03-22 18:44:53 -04:00
8a1d303ab9
GPU friendly stencil improvements
2018-03-19 07:11:03 -04:00
3277bda130
View introduction to prepare for accelerator offload.
Probably same problem exists for stencil object
2018-03-04 16:38:08 +00:00
dcf6517a93
Accelerator offload and copy Opt into the kernel for GPU host var safety
2018-02-02 11:35:35 +00:00
e4df025d01
Accelerator related
2018-02-01 23:20:05 +00:00
87ee592176
Pragma changes and layout and warning elimination for nvcc
2018-01-24 13:14:09 +00:00
a97ad1a51d
Namespace
2018-01-14 23:01:01 +00:00
1bd311ba9c
Faster sequential conserved current implementation, now compatible with 5D vectorisation & G-parity.
2017-06-16 16:43:15 +01:00
41af8c12d7
Code cleaning for conserved current contractions. Will now be easier to implement Mobius conserved current.
2017-06-16 16:38:59 +01:00
5633a2db20
Faster implementation of conserved current site contraction. Added 5D vectorised support, but not G-parity.
2017-06-12 10:41:02 +01:00
ca1077c560
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/rare_kaon
# Conflicts:
# lib/qcd/action/fermion/WilsonFermion5D.cc
# tests/hadrons/Test_hadrons_rarekaon.cc
2017-05-04 16:22:33 +01:00
2ce898efa3
Pretty code
2017-04-26 02:34:25 -04:00
44260643f6
First conserved current implementation for Wilson fermions only. Not implemented for Gparity or 5D-vectorised Wilson fermions.
2017-04-25 18:00:24 +01:00
abba44a837
Hand unrolled for overlapped comms
2017-04-22 17:45:17 +01:00
1d1b225497
Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node).
2017-04-22 09:05:28 -04:00
736bf3c866
Major rework of stencil. Half precision and MPI3 now working.
2017-04-22 11:33:50 +01:00
2c246551d0
Overlap comms and compute options in wilson kernels
2017-02-07 01:37:10 -05:00
caba0d42a5
L1p controls
2016-12-22 17:52:55 +00:00
b7d55f7dfb
Fix a typo in reorg of the --dslash-asm
2016-11-04 11:35:08 +00:00
bb94ddd0eb
Tidy up of mpi3; also some cleaning of the dslash controls.
2016-11-02 08:07:09 +00:00
c190221fd3
Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
b58adc6a4b
commVector
2016-10-20 17:00:15 +01:00
c78bbd0f8c
Fix ASM compilation
2016-10-04 15:37:32 +01:00
b9c80318a2
Merge branch 'develop' into feature/hirep
2016-09-13 10:01:51 +01:00
f76f281e58
Cleaning files after fix
2016-09-09 11:34:25 +01:00
aa20cc8b52
Fixing compilation error with AVX512 flag
2016-09-09 02:58:52 -07:00
0fd179fb33
Merge branch 'develop' into feature/hirep
2016-09-01 12:59:53 +01:00
90e70790f3
Feature for z-Mobius prep
2016-08-15 22:31:29 +01:00
089f0ab582
Debugged HMC for Creutz relation
2016-07-28 16:44:41 +01:00
b93e18ed50
Modified the Dirac Kernel class to compile with a different number of colours
Added the general push_back functionality to accommodate all defined representations
Compiles, not tested
2016-07-18 16:36:28 +01:00
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
55f65b81b5
Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
53d06046b0
Compiling updates for KNL
2016-06-03 03:47:54 -07:00
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
165bffc2e7
AVX512 changes for assembler kernels
2016-03-26 22:25:45 -06:00
fc6ad65751
Pushed the overlap comms tweaks
2016-01-11 06:34:22 -08:00
331768dcff
Added overlap comms compute mode
2016-01-03 01:38:11 +00:00
aae8bf31a7
Global edit adding copyright and license info to every source file.
2016-01-02 14:51:32 +00:00
34a0fde2ad
Fixes to fermion force terms after sign of gamma_mu (0...3) change.
Thought I had already committed these.
Believe I have got the Gparity fermion force working.
* tests/Test_gpdwf_force.cc -- correctly predicts dS for two flavour pseudofermion
based on a small dt update of U field.
* tests/Test_hmc_EODWFRatio_Gparity.cc -- ran 1 trajectory on 8^4 with dH=0.21.
Need to accumulate a full plaquette log to believe it fully, which will take some hours of run time.
2015-12-15 23:14:12 +00:00
3ce10aa975
Fix a regression failure on Mobius; chroma regression added
2015-12-10 22:55:00 +00:00
05a7029600
Stencil change
2015-11-07 00:06:31 -08:00
899ca41cb8
Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
lib/qcd/action/fermion/WilsonFermion5D.cc
2015-11-06 03:50:04 -08:00
17af18dcab
Changes for AVX512 assembler
2015-11-06 03:45:51 -08:00
28022755ae
Stencil class name global change to StencilImpl typedef
2015-11-06 05:30:17 -06:00