mirror of https://github.com/paboyle/Grid.git synced 2025-06-21 17:22:03 +01:00
Commit Graph

374 Commits

Author SHA1 Message Date
addeb621a7 Implemented tadpole operator for Shamir action. 2021-04-06 13:45:37 +01:00
bb89a82a07 Staggered coalesced read 2021-03-29 20:01:15 +02:00
51f506553c Read out the local ID once, and store 2021-03-12 15:33:04 +01:00
0e21adb3f6 Gives 200GF/s on SyCL/DG1 8^4, doesn't uglify develop for other platforms too badly.
Easy to revert to clean more C++ stylistic code. There's a SYCL_HACK macro I will clean up later once dpcpp
evolves a central nervous system.
2021-03-10 05:40:51 -08:00
a9604367c1 Merge pull request #336 from lehner/feature/gpt
Make ShmDims configurable; adjust GRID_MAX_SIMD to allow for 128 byte width on GPUs
2021-03-05 13:17:19 -05:00
679d1d22f7 Sycl happier 2021-03-03 11:21:43 -08:00
442336bd96 Hand unrolled to use optimised code paths on GPU for coalesced reads in Wilson case.
Other cases to do. This now includes comms code path.
2021-03-02 14:50:51 +01:00
9c9566b9c9 Merge pull request #23 from paboyle/develop
Sync
2021-03-01 12:33:51 +01:00
e3d019bc2f Enable performance counting in WilsonFermion like in others 2021-02-22 15:25:40 +01:00
eda9ab487b MADWF 5d source option for hadrons - look at Grid of source
Abort on GPU error
2021-02-08 10:47:22 -05:00
ff1fa98808 Fix for GPU conserved current 2021-01-21 21:38:23 -05:00
299d0de066 Merge pull request #21 from paboyle/develop
Sync
2020-12-22 20:59:15 +01:00
45d49d8648 clean up 2020-12-19 03:35:18 +01:00
3f9ae6e7e7 Merge branch 'develop' into feature/a64fx-3 2020-12-19 02:37:11 +01:00
4dd9e39e0d up to +36% performance gain for dslash/dwf on QPACE 4 using GCC 10.1.1 2020-12-19 00:54:31 +01:00
873519e960 Enable existing conserved current code for CUDA (compiles OK for CUDA 10.1). Add option to Test_cayley_mres to load a configuration 2020-12-14 16:06:10 +00:00
c438118fd7 Change access specifier of clover fields in order to allow deriving classes to access these 2020-12-08 14:42:11 +01:00
2ef1fa66a8 Improved performance of G-parity kernel for GPUs by simplifying multLink implementation 2020-12-07 11:53:35 -05:00
b3881d2636 Thread inversion of clover term 2020-10-30 16:18:58 +01:00
bf3c9857e0 Closure changes 2020-10-14 21:37:14 -04:00
ace9cd64bb dpcpp happy 2020-09-29 08:03:46 -07:00
ecd3f890f5 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2020-09-16 02:30:14 +01:00
2859955a03 HIP requires "inline" 2020-09-16 00:36:13 +01:00
cc220abd1d inline for HIP 2020-09-16 00:35:38 +01:00
d1c0c0197e HipCC requires inline on definition 2020-09-16 00:35:06 +01:00
fd9424ef27 inlines required to make HIP happy 2020-09-16 00:34:32 +01:00
a5c35c4024 Make HIP / Vega happy 2020-09-16 00:33:53 +01:00
b4255140d6 Stale data member eliminated 2020-09-03 15:47:46 -04:00
0e88bf4bff remove Nils's default pragma 2020-07-29 10:24:35 -04:00
bbd145382b enable --enable-simd=A64FX in configure 2020-07-08 12:43:51 +02:00
8726e94ea7 merge upstream develop 2020-07-07 20:26:47 +02:00
b949cf6b12 PeekLocal needs a view to keep thread safe.
ALLOCATION_CACHEE reenable
2020-06-19 17:13:27 -04:00
1aa988b2af Comms overlap fix UVM case 2020-06-19 01:21:14 -04:00
fd97f64612 Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl 2020-06-10 12:58:13 -04:00
8720aecb80 Offload more loops 2020-06-10 12:57:55 -04:00
cdf0a04fc5 Merge branch 'develop' into sycl 2020-06-09 04:00:12 -04:00
e97f3688db Fix the HMC issue - kernel was launching asynchronously 2020-06-08 17:01:15 -04:00
433766ac62 revert Add/SubTimesI and prefetching in stencil
This reverts commit 9b2699226c.
2020-06-08 12:02:53 +02:00
1a4c8c3387 Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes. 2020-06-05 18:52:35 -04:00
5ee3ea2144 round-up after testing of prefetches in stencil close 2020-06-03 11:58:20 +02:00
91c81cab30 some corrections; compiles on my laptop; untested 2020-05-29 18:19:22 +02:00
38164f8480 include counters in WilsonFermionImplementation.h 2020-05-29 17:59:26 +02:00
f013979791 add counter support in WilsonFermion.h 2020-05-29 17:13:59 +02:00
1d252d0922 Accelerator inline 2020-05-28 11:45:25 -04:00
006cc8a8f1 Staggered move to accelerator 2020-05-28 08:33:06 -04:00
7860a50f70 Make view specify where and drive data motion - first cut.
This is a compile time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
9e085bd04e guard prevents multiple A64FX build messages 2020-05-20 19:16:30 +02:00
82f71643a4 Remove the norm in MdagM 2020-05-12 17:55:53 -04:00
20d1941a45 enabled asm kernels for fixed-size A64FXFIXEDSIZE 2020-05-12 19:01:12 +02:00
bbbee5660d First compile on HIP 2020-05-10 05:28:09 -04:00