Peter Boyle
cedeaae7db
Lebesge -> StencilView if necessary
2018-03-24 19:32:41 -04:00
Peter Boyle
8a1d303ab9
GPU friendly stencil improvements
2018-03-19 07:11:03 -04:00
paboyle
3277bda130
View introduction to prepare for accelerator offload.
...
Probably same problem exists for stencil object
2018-03-04 16:38:08 +00:00
paboyle
078901278c
Coordinate handling gpu friendly
2018-02-24 22:22:02 +00:00
paboyle
14ba20898a
Accelerator loop the key kernel call
2018-02-02 11:30:07 +00:00
paboyle
a53d3ee19a
Add Opt to the lambda capture to get it into the GPU
2018-02-02 11:28:39 +00:00
paboyle
70e276e1ab
parallel_for elimination -> thread_loop
2018-01-28 01:01:14 +00:00
paboyle
2d0bcc2606
Zero changes, acceleartor on kernels and some thread loop changes
2018-01-27 23:47:38 +00:00
paboyle
c4f82e072b
_grid becomes private ; use Grid()§
2018-01-27 00:04:12 +00:00
paboyle
85771e97e9
Hide internal data
2018-01-26 23:04:46 +00:00
paboyle
87ee592176
Pragma changes and layout and warning elimination for nvcc
2018-01-24 13:14:09 +00:00
paboyle
4de58c4aab
Namespace
2018-01-14 23:06:47 +00:00
Vera Guelpers
2cfb50cbe5
bug fix in sequential insertion of conserved vector current
2017-12-08 11:13:39 +00:00
Lanny91
1bd311ba9c
Faster sequential conserved current implementation, now compatible with 5D vectorisation & G-parity.
2017-06-16 16:43:15 +01:00
Lanny91
41af8c12d7
Code cleaning for conserved current contractions. Will now be easier to implement mobius conserved current.
2017-06-16 16:38:59 +01:00
Lanny91
5633a2db20
Faster implementation of conserved current site contraction. Added 5D vectorised support, but not G-parity.
2017-06-12 10:41:02 +01:00
Lanny91
23135aa58a
Merge branch 'feature/hadrons' of https://github.com/paboyle/Grid into feature/rare_kaon
2017-05-26 16:00:50 +01:00
Lanny91
44260643f6
First conserved current implementation for Wilson fermions only. Not implemented for Gparity or 5D-vectorised Wilson fermions.
2017-04-25 18:00:24 +01:00
Guido Cossu
8c540333d5
Merge branch 'develop' into feature/hmc_generalise
2017-04-05 14:41:04 +01:00
paboyle
4b17e8eba8
Merge branch 'develop' into feature/bgq-asm
...
Conflicts:
lib/qcd/action/fermion/Fermion.h
lib/qcd/action/fermion/WilsonFermion.cc
lib/util/Init.cc
tests/Test_cayley_even_odd_vec.cc
2017-03-28 04:49:30 -04:00
paboyle
18bde08d1b
Merge branch 'feature/staggering' into develop
2017-03-28 15:25:55 +09:00
paboyle
e099dcdae7
Merge branch 'develop' into feature/bgq-asm
2017-02-23 00:25:29 +00:00
paboyle
4e7ab3166f
Refactoring header layout
2017-02-22 18:09:33 +00:00
paboyle
3ae92fa2e6
Global changes to parallel_for structure.
...
Move the comms flags to more sensible names
2017-02-21 05:24:27 -05:00
Guido Cossu
e0571c872b
Merge branch 'develop' into feature/hmc_generalise
2017-02-09 16:12:00 +00:00
paboyle
2c246551d0
Overlap comms and compute options in wilson kernels
2017-02-07 01:37:10 -05:00
a0cfbb6e88
Merge branch 'feature/gammas' into feature/hadrons
...
# Conflicts:
# .gitignore
# lib/qcd/spin/Dirac.cc
# scripts/filelist
2017-01-30 09:10:49 -08:00
fad743fbb1
Build system sanity check: corrected several headers not in the <Grid/*> format
2017-01-26 17:00:41 -08:00
Guido Cossu
17629b8d9e
Merge branch 'develop' into feature/hmc_generalise
2017-01-25 11:33:53 +00:00
a37e71f362
New automatic implementation of gamma matrices, Meson and SeqGamma are broken
2017-01-23 19:13:43 -08:00
Guido Cossu
27dfe816fa
Added TwoFlavorsEO
...
Had to remove a conformability check in the Derivative of SchurDiff,
see the comments in the file
2017-01-20 16:59:31 +00:00
Peter Boyle
fb8d4b2357
Lots of debug on performance Mobius
2016-12-08 17:28:28 +00:00
Guido Cossu
143c70e29f
Debugged the threaded version. Cleaning up
2016-12-07 04:40:25 +00:00
Azusa Yamaguchi
668ca57702
Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering
2016-11-22 13:49:11 +00:00
Azusa Yamaguchi
164d3691db
Staggered
2016-11-01 14:24:22 +00:00
ca21003f01
Merge branch 'feature/fft-opt' into feature/feynman-rules
...
# Conflicts:
# lib/FFT.h
# lib/qcd/action/fermion/WilsonFermion5D.h
# tests/core/Test_fft.cc
2016-10-26 18:44:47 +01:00
azusayamaguchi
c190221fd3
Internal SHM comms in non-simd directions working
...
Need to fix simd directions
2016-10-22 18:14:27 +01:00
997fd882ff
Merge branch 'develop' into feature/feynman-rules
...
# Conflicts:
# lib/Threads.h
# lib/qcd/action/fermion/WilsonFermion.cc
# lib/qcd/action/fermion/WilsonFermion.h
# lib/qcd/utils/SUn.h
# lib/simd/Grid_avx.h
# lib/simd/Intel512common.h
2016-10-19 18:35:18 +01:00
paboyle
616e7cd83e
Mass parameter
2016-10-10 23:45:48 +01:00
paboyle
b6713ecb60
Momentum space rules for Overlap, DWF untested to date
2016-09-26 09:39:09 +01:00
Guido Cossu
b6597b74e7
Added support for the Two index Symmetric and Antisymmetric representations
...
Tested for HMC convergence: OK
Added also a test file showing an example for mixed representations
2016-09-22 14:17:37 +01:00
paboyle
0c1d7e4daf
Mom space prop for Wilson action
2016-08-31 00:26:36 +01:00
Guido Cossu
b93e18ed50
Modified the Dirac Kernel class to compile with different number of colours
...
Added the general push_back functionality to accomodate for all defined representations
Compiles, not tested
2016-07-18 16:36:28 +01:00
Guido Cossu
ffedeb1c58
Minor modifications
2016-07-06 11:41:27 +01:00
paboyle
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendly.
...
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
paboyle
55f65b81b5
Improvements to the assembler interface that let us move chunks of the
...
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
paboyle
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
paboyle
9b6ab6db16
simd in 5th dimension support
2016-04-19 15:38:01 -07:00
paboyle
165bffc2e7
Avx512 changes for assembler kernels
2016-03-26 22:25:45 -06:00
paboyle
090e7aa930
Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
...
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00