Peter Boyle
|
237ce92540
|
Offload loops
|
2020-06-10 19:59:11 -04:00 |
|
Peter Boyle
|
a7ffc61e82
|
acceleratorSIMTlane()
|
2020-06-10 19:58:33 -04:00 |
|
Peter Boyle
|
fd97f64612
|
Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl
|
2020-06-10 12:58:13 -04:00 |
|
Peter Boyle
|
8720aecb80
|
Offload more loops
|
2020-06-10 12:57:55 -04:00 |
|
Peter Boyle
|
cdf0a04fc5
|
Merge branch 'develop' into sycl
|
2020-06-09 04:00:12 -04:00 |
|
Peter Boyle
|
616d3dd737
|
Compile updates
|
2020-06-08 18:57:41 -04:00 |
|
Peter Boyle
|
8b066baca8
|
Implement transient mechanism
|
2020-06-08 18:28:53 -04:00 |
|
Peter Boyle
|
e97f3688db
|
Fix the HMC issue - kernel was launching asynchronously
|
2020-06-08 17:01:15 -04:00 |
|
nmeyer-ur
|
433766ac62
|
revert Add/SubTimesI and prefetching in stencil
This reverts commit 9b2699226c.
|
2020-06-08 12:02:53 +02:00 |
|
nmeyer-ur
|
93a37c8f68
|
test prefetch to L2 in stencil
|
2020-06-08 09:39:50 +02:00 |
|
Peter Boyle
|
89a1e78390
|
Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl
|
2020-06-05 23:20:37 -04:00 |
|
Peter Boyle
|
ffbb3fc02c
|
Merge pull request #287 from felixerben/baryon-cleaner
slightly cleaner baryon 2pt code
|
2020-06-05 22:54:52 -04:00 |
|
Peter Boyle
|
5a73ef3647
|
Minor tweak to compile
|
2020-06-05 21:50:15 -04:00 |
|
Peter Boyle
|
87e5d2f4b7
|
Merge branch 'sycl' of https://www.github.com/paboyle/Grid into sycl
|
2020-06-05 17:32:21 -07:00 |
|
Peter Boyle
|
d720f10758
|
Link error fix
|
2020-06-05 17:29:20 -07:00 |
|
Peter Boyle
|
14fcd0912a
|
Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl
|
2020-06-05 19:14:17 -04:00 |
|
Peter Boyle
|
3111c0bd4f
|
Single precision hardwire
|
2020-06-05 19:13:27 -04:00 |
|
Peter Boyle
|
e03064490e
|
Merge branch 'sycl' of https://github.com/paboyle/Grid into sycl
|
2020-06-05 18:53:39 -04:00 |
|
Peter Boyle
|
1a4c8c3387
|
Global edit with change to View usage. autoView() creates a wrapper object that closes the view when scope closes.
|
2020-06-05 18:52:35 -04:00 |
|
Peter Boyle
|
2b1e259441
|
Decode of SYCL devices fix
|
2020-06-04 17:16:55 -07:00 |
|
Peter Boyle
|
f39c2a240b
|
Printing and device memory size detection
|
2020-06-04 14:58:03 -04:00 |
|
Peter Boyle
|
0d95805cde
|
Print improvement
|
2020-06-03 22:50:32 -04:00 |
|
Peter Boyle
|
f67830587f
|
Accelerator loop use
|
2020-06-03 22:50:09 -04:00 |
|
Peter Boyle
|
6bf7f839ff
|
Better printing and logging
|
2020-06-03 09:28:57 -04:00 |
|
Peter Boyle
|
e3147881a9
|
Cache scheme
|
2020-06-03 09:23:48 -04:00 |
|
nmeyer-ur
|
9872c76825
|
introduce AddTimesI and SubTimesI; slight benefit in operators, but < 1%; breaks all other impls
|
2020-06-03 15:20:13 +02:00 |
|
Peter Boyle
|
fb559614ad
|
Initialise memory manager
|
2020-06-03 09:12:47 -04:00 |
|
Peter Boyle
|
e93e12b6a4
|
More verbose SYCL setup
|
2020-06-03 09:12:11 -04:00 |
|
Peter Boyle
|
0c3112cd94
|
Use view mechanism
|
2020-06-03 09:11:51 -04:00 |
|
Peter Boyle
|
8cfd5d2639
|
Need lattice view
|
2020-06-03 09:11:28 -04:00 |
|
Peter Boyle
|
1c9f20b15e
|
Views must be closed
|
2020-06-03 09:10:29 -04:00 |
|
Peter Boyle
|
32237895bd
|
Reorg memory manager for O(1) hash table
|
2020-06-03 09:09:52 -04:00 |
|
nmeyer-ur
|
5ee3ea2144
|
round-up after testing of prefetches in stencil close
|
2020-06-03 11:58:20 +02:00 |
|
Peter Boyle
|
c5c2dbc0ce
|
Optional CUDA info
|
2020-06-02 14:21:49 -04:00 |
|
Christoph Lehner
|
9fcb47ee63
|
Explicit error message instead of infinite loop in GlobalSharedMemory::GetShmDims
|
2020-06-02 07:44:38 -04:00 |
|
nmeyer-ur
|
5050833b42
|
revert changes due to performance penalty in Wilson using MPI
|
2020-06-02 13:08:57 +02:00 |
|
nmeyer-ur
|
7bee4ebb54
|
correct predication for svcadd
|
2020-06-02 10:51:39 +02:00 |
|
nmeyer-ur
|
71cf9851e7
|
correct type for vecd in TimesI and TimesMinusI
|
2020-06-02 10:44:15 +02:00 |
|
nmeyer-ur
|
b4735c9904
|
correct zero in svcadd
|
2020-06-02 10:38:05 +02:00 |
|
nmeyer-ur
|
9b2699226c
|
use fcadd in TimesI and TimesMinusI instead of tbl and neg
|
2020-06-02 10:32:44 +02:00 |
|
nmeyer-ur
|
5f52804907
|
update calculation of data
|
2020-05-30 10:55:17 +02:00 |
|
nmeyer-ur
|
936071773e
|
correct throughput in wilson and dwf
|
2020-05-29 22:15:59 +02:00 |
|
nmeyer-ur
|
1732f9319e
|
more mods; counters seem to work correctly
|
2020-05-29 18:44:00 +02:00 |
|
nmeyer-ur
|
91c81cab30
|
some corrections; compiles on my laptop; untested
|
2020-05-29 18:19:22 +02:00 |
|
nmeyer-ur
|
38164f8480
|
include counters in WilsonFermionImplementation.h
|
2020-05-29 17:59:26 +02:00 |
|
nmeyer-ur
|
f013979791
|
add counter support in WilsonFermion.h
|
2020-05-29 17:13:59 +02:00 |
|
nmeyer-ur
|
e947b563ea
|
add space in stencil output
|
2020-05-29 17:11:17 +02:00 |
|
nmeyer-ur
|
5cb3530c34
|
enable counters in Benchmark_wilson
|
2020-05-29 15:44:52 +02:00 |
|
nmeyer-ur
|
250008372f
|
update SVE readme
|
2020-05-29 15:44:25 +02:00 |
|
Peter Boyle
|
1d252d0922
|
Accelerator inline
|
2020-05-28 11:45:25 -04:00 |
|