Peter Boyle
b9dcad89e8
Test cases for coarsening with non-local stencil
2023-09-07 10:53:22 -04:00
Peter Boyle
2b43308208
First cut non-local coarsening
2023-08-25 17:38:07 -04:00
Christopher Kelly
f44dce390f
Implemented accelerator-optimized versions of localCopyRegion and insertSliceLocal to speed up padding
...
Fixed const correctness on PaddedCell methods
Fixed compile issues on Crusher
Added timing breakdowns for PaddedCell::Expand and the padded implementations of the staples, visible under --log Performance
Optimized kernel for StaplePadded
Test_iwasaki_action_newstaple now repeats the calculation 10 times and reports average timings
2023-06-27 14:58:10 -04:00
Christopher Kelly
6f6844ccf1
Added new StapleAll and RectStapleAll functions that return the staples for all mu as an array
...
Modified plaq+rectangle gauge actions to use the above
Added a test code to confirm the above changes
2023-06-26 15:48:47 -04:00
Christopher Kelly
4c6613d72c
Modified RectStapleDouble and RectStapleOptimised to use Gauge-BC respecting CshiftLink
...
Added test code tests/debug/Test_optimized_staple_gaugebc demonstrating equivalence of above to RectStapleUnoptimised for cconj gauge BCs
Removed the restriction that the optimized staple is used only for periodic gauge BCs; it is now always used
2023-06-26 10:20:23 -04:00
Christopher Kelly
4241c7d4a3
Imported coalescedReadGeneralPermute GPU implementation from Christoph
...
Fixed bug in padded staple code where extract was being called on the result before the GPU view was closed
Fixed compile issue with pointer cast in padded staple code
Added timing summaries of padded staple code and timing breakdown of staple implementation to Test_padded_cell_staple
2023-06-21 16:01:01 -04:00
Christopher Kelly
7b11075102
The user can now specify the implementation of Cshift used by the PaddedCell class through a virtual base class API. Implementations are provided for the default (regular Cshift) and for gauge links (which respects the gauge BCs)
...
Fixed const-correctness for PaddedCell and ConjugateGimpl::setDirections
Modified test code for padded-cell implementation of staple, rect-staple to use cconj BCs
2023-06-20 17:09:56 -04:00
Christopher Kelly
abc658dca5
Added coalescedReadGeneralPermute CPU implementation based on Christoph's GPT code
...
In a test code, implemented a padded-cell version of the staple and rectangular-staple calculation
2023-06-20 16:14:25 -04:00
david clarke
c7bdf2c0e4
3-link test at least gives an answer
2023-05-21 04:33:20 -06:00
Peter Boyle
9c8750f261
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2023-05-11 12:29:09 -04:00
Peter Boyle
ccd21f96ff
Plaquette agreeing and moving to final form (slowly); need to optimise
2023-02-01 22:57:44 -05:00
Peter Boyle
4b90cb8888
First cut passes combining padded cell with general stencil towards fast plaquette and staggered force
2023-02-01 22:14:10 -05:00
Peter Boyle
3dbfce5223
Tests clean build on HIP
2022-11-16 20:15:51 -05:00
Peter Boyle
8cd4263974
Tests compile
2021-04-25 22:20:37 -04:00
Michael Marshall
2983b6fdf6
Optional (superficial) changes to make comparison with Hadrons WardIdentity module easier: use Schur solver; example of Hadrons random gauge init; logging updates; only solve reverse propagator if provided
2021-01-23 12:41:48 +00:00
Peter Boyle
11a5fd09d6
Hot config
2021-01-21 21:39:41 -05:00
Michael Marshall
873519e960
Enable existing conserved current code for CUDA (compiles OK for CUDA 10.1). Add option to Test_cayley_mres to load a configuration
2020-12-14 16:06:10 +00:00
Peter Boyle
d201277652
Expose Nc as a compile time configure option.
...
Remove precision option
2020-10-07 13:07:00 -04:00
Peter Boyle
d982a5b6d5
Fix coarsened
2020-09-01 00:14:04 -04:00
Peter Boyle
1a4c8c3387
Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.
2020-06-05 18:52:35 -04:00
Peter Boyle
f999408e92
View location and access mode
2020-05-21 16:14:20 -04:00
Peter Boyle
29ae5615c0
Sequential fix
2020-04-29 03:05:15 -04:00
Peter Boyle
ed70cce542
Test for 5D DWF observables
2020-04-23 04:29:45 -04:00
Peter Boyle
462900b48d
Modified entire test directory to suit new GPU constructs for looping
2019-06-15 12:53:27 +01:00
Peter Boyle
bcbb5e9d26
Remove assembly tests
2019-06-15 07:57:05 +01:00
Peter Boyle
422764757d
Updates in tests to make all of Grid compile
2018-12-14 16:55:54 +00:00
Peter Boyle
b57a4d32aa
Merge branch 'develop' into feature/gpu-port
2018-12-13 05:11:34 +00:00
Peter Boyle
68c13045d6
Added a test for Felix and Michael to look at
2018-11-07 23:40:15 +00:00
Peter Boyle
24c07694bc
Mixed precision now supported in MADWF
2018-10-14 00:22:52 +01:00
Peter Boyle
f0229025e2
MADWF working across a range of actions
2018-10-13 19:55:03 +01:00
Peter Boyle
49f25e08e8
PauliVillars based 4D -> 5D reconstruction with Fourier Accelerated PV inverse
...
by Christoph. Differs from the one by Rudy in BFM since it vectorises the twisted
4D solves in pairs.
2018-10-11 12:35:32 +01:00
paboyle
285deab432
Coordinate handling GPU friendly. Avoid std::vector
2018-02-24 22:19:28 +00:00
paboyle
dd8f2a64fe
Interface to suit Hadrons on Lanczos
2018-02-13 02:08:49 +00:00
paboyle
98af36217a
Zero changes. (I mean literally)
2018-01-27 23:46:02 +00:00
paboyle
c4f82e072b
_grid becomes private; use Grid()
2018-01-27 00:04:12 +00:00
paboyle
3f9654e397
Hiding internals
2018-01-26 23:09:03 +00:00
paboyle
d74c21a386
Global edit for QCD namespace removal & NAMESPACE macros
2018-01-15 09:37:58 +00:00
paboyle
cb9ff20249
Approx tests and lanczos improvement
2017-10-13 11:30:50 +01:00
paboyle
9fe6ac71ea
Starting reorg of Blocked lanczos
2017-10-11 10:12:07 +01:00
David Murphy
459f70e8d4
Check-in of working Mobius EOFA class and tests
2017-08-22 22:38:30 -04:00
David Murphy
ec1e2f7a40
Add (mostly implemented) ExactOneFlavourRatio pseudofermion class and tests of Shamir heatbath and action
2017-08-16 12:38:59 -04:00
David Murphy
6d0786ff9d
Typo fixes and check-in of G-parity action test for DWF
2017-08-15 22:47:00 -04:00
David Murphy
202a7fe900
Re-import DWF and abstract base EOFA fermion classes and tests
2017-08-15 13:36:08 -04:00
paboyle
e8b95bd35b
Clean up finished. Could shrink Lanczos to around 400 lines at a push
2017-06-21 02:50:09 +01:00
Azusa Yamaguchi
0a8faac271
Fix make tests compile
2017-06-19 22:54:18 +01:00
paboyle
a8db024c92
Cleaning up the dense matrix and lanczos sector
2017-04-15 08:54:11 +01:00
paboyle
4b17e8eba8
Merge branch 'develop' into feature/bgq-asm
...
Conflicts:
lib/qcd/action/fermion/Fermion.h
lib/qcd/action/fermion/WilsonFermion.cc
lib/util/Init.cc
tests/Test_cayley_even_odd_vec.cc
2017-03-28 04:49:30 -04:00
paboyle
18bde08d1b
Merge branch 'feature/staggering' into develop
2017-03-28 15:25:55 +09:00
paboyle
447c5e6cd7
Z mobius hermiticity correction
2017-03-13 01:30:43 +00:00
a37e71f362
New automatic implementation of gamma matrices, Meson and SeqGamma are broken
2017-01-23 19:13:43 -08:00
paboyle
8a337f3070
Move cayley into mainstream tests
2016-12-18 02:35:31 +00:00
Azusa Yamaguchi
389e0a77bd
Staggered Fermion 5D
2016-11-29 13:13:56 +00:00
Guido Cossu
0fd179fb33
Merge branch 'develop' into feature/hirep
2016-09-01 12:59:53 +01:00
Guido Cossu
fd5614738d
Merge branch 'develop' into feature/hirep
2016-08-30 18:21:36 +01:00
paboyle
90e70790f3
Feature for z-Mobius prep
2016-08-15 22:31:29 +01:00
629283726b
build system: local Grid link flag moved to configure.ac
2016-08-03 15:07:42 +01:00
9e5b934d21
improved LAPACK configuration
2016-08-02 17:26:54 +01:00
e9f30cab2c
first working version for the new build system
2016-07-30 17:53:18 +01:00
paboyle
35d0d35238
Updated file list
2016-07-15 00:02:53 +01:00
paboyle
3493b51879
Modest updates
2016-07-14 23:52:13 +01:00
paboyle
adbc7c1188
Adding files for multiple implementations (cache opt) and Ls vectorisation
...
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.
The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.
This should give similar throughput at a higher flop count (non-compulsory flops),
but enables use of the KNL cache-friendly kernels throughout the code.
Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
paboyle
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00