|
d5b2323a57
|
included Cayley-Hamilton exponentiation for the compact Wilson exp clover, bug fix for inverse of exp clover
|
2022-03-07 14:44:24 +00:00 |
|
|
438caab25f
|
generate_instantiations.sh now correctly produces instantiations for CompactClover variant, redundant instantiations removed.
|
2022-02-27 18:27:18 +00:00 |
|
Christoph Lehner
|
9616811c3d
|
Merge branch 'feature/gpt' of https://github.com/lehner/Grid into feature/gpt
|
2022-02-24 22:03:05 +01:00 |
|
Christoph Lehner
|
8a3002c03b
|
separate left and right masses for CayleyFermion5D
|
2022-02-24 22:02:56 +01:00 |
|
Peter Boyle
|
0f1c5b08a1
|
Dirichlet filters running on AMD and now integrated in Fermion op
|
2022-02-23 19:29:28 -05:00 |
|
Mattia Bruno
|
71034f828e
|
attempt to fix broken WilsonExpClover; Compact version still broken, will be replaced by F. Joswig
|
2022-02-23 01:02:27 +01:00 |
|
Peter Boyle
|
aab3bcb46f
|
Dirichlet first cut - wrong answers on dagger multiply.
Struggling to get a compute node so changing systems
|
2022-02-22 19:58:33 +00:00 |
|
Mattia Bruno
|
11437930c5
|
cleaned up definitions of wilsonclover fermions
|
2022-02-22 10:45:16 +01:00 |
|
Mattia Bruno
|
3d44aa9cb9
|
cleaned up cloverhelpers; fixed test compact_clover which runs
|
2022-02-22 01:10:19 +01:00 |
|
Mattia Bruno
|
2851870d70
|
expClover support via helpers template class
|
2022-02-22 00:05:43 +01:00 |
|
Peter Boyle
|
e8c187b323
|
SyCL happier?
|
2022-02-15 11:24:38 -05:00 |
|
Peter Boyle
|
c322420580
|
Don't instantiate Nc=3 and non-GP hardwired code for other implementations
|
2022-02-14 16:04:08 +00:00 |
|
Daniel Richtmann
|
1b6b12589f
|
Get splitting up into implementation and instantiation files correct
|
2022-02-02 00:51:11 +01:00 |
|
Daniel Richtmann
|
3082ab8252
|
Check in compact version of wilson clover fermions
|
2022-02-02 00:50:05 +01:00 |
|
Daniel Richtmann
|
add86cd7f4
|
Abandon ET for clover application, use construct similar to multLink
|
2022-02-01 23:09:06 +01:00 |
|
Daniel Richtmann
|
0b6fd20c54
|
Enable memory coalescing in clover term generation
|
2022-02-01 23:09:06 +01:00 |
|
Daniel Richtmann
|
e83423fee6
|
Refactor clover to align with other files and prepare for upcoming changes
|
2022-02-01 23:09:06 +01:00 |
|
Peter Boyle
|
195ab2888d
|
Merge branch 'develop' into develop
|
2021-10-27 20:26:57 -04:00 |
|
Peter Boyle
|
ba7e371b90
|
Warning free compile on Tursa.
Hopefully got all required virtual dtors
|
2021-10-21 19:56:52 +01:00 |
|
Luchang Jin
|
4b24800132
|
AVX512 drop mixed precision as well
|
2021-09-15 16:29:47 -04:00 |
|
Christoph Lehner
|
3d0f88e702
|
A64FX drop mixed precision as well
|
2021-09-15 18:38:32 +02:00 |
|
Peter Boyle
|
86e33c8ab2
|
Significant GPU perf speed up finished
|
2021-09-14 16:14:23 +01:00 |
|
Peter Boyle
|
a7b943b33e
|
Remove half prec comms
|
2021-09-14 05:05:33 +01:00 |
|
Peter Boyle
|
7440cde92f
|
No half prec comms; coalesced access on GPU
|
2021-09-14 05:04:56 +01:00 |
|
Peter Boyle
|
4c88104a73
|
Fix compile warns
|
2021-09-11 23:08:05 +01:00 |
|
Peter Boyle
|
73b944c152
|
Drop half prec comms for now.
|
2021-09-11 23:07:18 +01:00 |
|
Peter Boyle
|
d1b0b7f5c6
|
Half prec comms dropping
|
2021-09-11 23:05:40 +01:00 |
|
Peter Boyle
|
381d8797d0
|
Drop half prec comms for now
|
2021-09-11 23:05:02 +01:00 |
|
Andrew Yong
|
770680669d
|
Whitespace removal.
|
2021-08-04 09:21:59 +01:00 |
|
Andrew Yong
|
0cdfc5cf22
|
Merge remote-tracking branch 'upstream/develop' into develop
|
2021-07-30 14:40:55 +01:00 |
|
u61464
|
8cfc7342cd
|
staggered hand unroll read coalesce
|
2021-05-05 14:17:18 -07:00 |
|
|
54c6b1376d
|
Quick fix of conserved current implementation in CayleyFermion5D. Now function treats current insertion with appropriate periodic boundary conditions in the mu=3 direction.
|
2021-04-21 16:56:46 +01:00 |
|
|
f3f11b586f
|
Tadpole sign now in front of forward hopping term to be consistent with previous implementation and analytic form.
|
2021-04-17 12:44:27 +01:00 |
|
|
8083e3f7e8
|
Sign factor for tadpole implementation corrected.
|
2021-04-15 11:14:31 +01:00 |
|
|
895244ecc3
|
Merge with upstream; implemented conserved tadpole for Shamir action.
|
2021-04-06 13:46:33 +01:00 |
|
|
addeb621a7
|
Implemented tadpole operator for Shamir action.
|
2021-04-06 13:45:37 +01:00 |
|
Peter Boyle
|
bb89a82a07
|
Staggered coalesced read
|
2021-03-29 20:01:15 +02:00 |
|
Peter Boyle
|
51f506553c
|
Read out the local ID once, and store
|
2021-03-12 15:33:04 +01:00 |
|
u61464
|
0e21adb3f6
|
Gives 200GF/s on SyCL/DG1 8^4, doesn't uglify develop for other platforms too badly.
Easy to revert to cleaner, more stylistic C++ code. There's a SYCL_HACK macro I will clean up later once dpcpp
evolves a central nervous system.
|
2021-03-10 05:40:51 -08:00 |
|
Peter Boyle
|
a9604367c1
|
Merge pull request #336 from lehner/feature/gpt
Make ShmDims configurable; adjust GRID_MAX_SIMD to allow for 128 byte width on GPUs
|
2021-03-05 13:17:19 -05:00 |
|
u61464
|
679d1d22f7
|
Sycl happier
|
2021-03-03 11:21:43 -08:00 |
|
Peter Boyle
|
442336bd96
|
Hand unrolled to use optimised code paths on GPU for coalesced reads in Wilson case.
Other cases to do. This now includes comms code path.
|
2021-03-02 14:50:51 +01:00 |
|
Christoph Lehner
|
9c9566b9c9
|
Merge pull request #23 from paboyle/develop
Sync
|
2021-03-01 12:33:51 +01:00 |
|
Daniel Richtmann
|
e3d019bc2f
|
Enable performance counting in WilsonFermion like in others
|
2021-02-22 15:25:40 +01:00 |
|
Peter Boyle
|
eda9ab487b
|
MADWF 5d source option for hadrons - look at Grid of source
Abort on GPU error
|
2021-02-08 10:47:22 -05:00 |
|
Peter Boyle
|
ff1fa98808
|
Fix for GPU conserved current
|
2021-01-21 21:38:23 -05:00 |
|
Christoph Lehner
|
299d0de066
|
Merge pull request #21 from paboyle/develop
Sync
|
2020-12-22 20:59:15 +01:00 |
|
Nils Meyer
|
45d49d8648
|
clean up
|
2020-12-19 03:35:18 +01:00 |
|
Nils Meyer
|
3f9ae6e7e7
|
Merge branch 'develop' into feature/a64fx-3
|
2020-12-19 02:37:11 +01:00 |
|
Nils Meyer
|
4dd9e39e0d
|
up to +36% performance gain for dslash/dwf on QPACE 4 using GCC 10.1.1
|
2020-12-19 00:54:31 +01:00 |
|