nmeyer-ur
|
5ee3ea2144
|
round-up after testing of prefetches in stencil close
|
2020-06-03 11:58:20 +02:00 |
|
Peter Boyle
|
c5c2dbc0ce
|
Optional CUDA info
|
2020-06-02 14:21:49 -04:00 |
|
Christoph Lehner
|
9fcb47ee63
|
Explicit error message instead of infinite loop in GlobalSharedMemory::GetShmDims
|
2020-06-02 07:44:38 -04:00 |
|
nmeyer-ur
|
5050833b42
|
revert changes due to performance penalty in Wilson using MPI
|
2020-06-02 13:08:57 +02:00 |
|
nmeyer-ur
|
7bee4ebb54
|
correct predication for svcadd
|
2020-06-02 10:51:39 +02:00 |
|
nmeyer-ur
|
71cf9851e7
|
correct type for vecd in TimesI and TimesMinusI
|
2020-06-02 10:44:15 +02:00 |
|
nmeyer-ur
|
b4735c9904
|
correct zero in svcadd
|
2020-06-02 10:38:05 +02:00 |
|
nmeyer-ur
|
9b2699226c
|
use fcadd in TimesI and TimesMinusI instead of tbl and neg
|
2020-06-02 10:32:44 +02:00 |
|
nmeyer-ur
|
5f52804907
|
update calculation of data
|
2020-05-30 10:55:17 +02:00 |
|
nmeyer-ur
|
936071773e
|
correct throughput in wilson and dwf
|
2020-05-29 22:15:59 +02:00 |
|
nmeyer-ur
|
1732f9319e
|
more mods; counters seem to work correctly
|
2020-05-29 18:44:00 +02:00 |
|
nmeyer-ur
|
91c81cab30
|
some corrections; compiles on my laptop; untested
|
2020-05-29 18:19:22 +02:00 |
|
nmeyer-ur
|
38164f8480
|
include counters in WilsonFermionImplementation.h
|
2020-05-29 17:59:26 +02:00 |
|
nmeyer-ur
|
f013979791
|
add counter support in WilsonFermion.h
|
2020-05-29 17:13:59 +02:00 |
|
nmeyer-ur
|
e947b563ea
|
add space in stencil output
|
2020-05-29 17:11:17 +02:00 |
|
nmeyer-ur
|
5cb3530c34
|
enable counters in Benchmark_wilson
|
2020-05-29 15:44:52 +02:00 |
|
nmeyer-ur
|
250008372f
|
update SVE readme
|
2020-05-29 15:44:25 +02:00 |
|
Peter Boyle
|
1d252d0922
|
Accelerator inline
|
2020-05-28 11:45:25 -04:00 |
|
Peter Boyle
|
006cc8a8f1
|
Staggered move to accelerator
|
2020-05-28 08:33:06 -04:00 |
|
nmeyer-ur
|
4fedd8d29f
|
switch to MPI_THREAD_SERIALIZED instead of SINGLE
|
2020-05-27 14:08:34 +02:00 |
|
Peter Boyle
|
cf2938688a
|
Sycl unhappy fix
|
2020-05-25 08:36:53 -07:00 |
|
Peter Boyle
|
ee63721bad
|
int unhappiness sycl fix
|
2020-05-25 08:36:24 -07:00 |
|
Peter Boyle
|
22c5168d70
|
Sycl happier
|
2020-05-25 08:35:56 -07:00 |
|
Peter Boyle
|
949ac3cd24
|
Must avoid non-trivial copy constructors
|
2020-05-25 08:35:28 -07:00 |
|
Peter Boyle
|
7bc0166c1c
|
Making SYCL happy - must avoid non-trivial copy constructors
|
2020-05-25 08:34:19 -07:00 |
|
Peter Boyle
|
cb0d1b3399
|
hopefully fix build fail
|
2020-05-24 21:27:00 -04:00 |
|
Peter Boyle
|
d1f1ccc705
|
HIP changes
|
2020-05-24 21:18:49 -04:00 |
|
Peter Boyle
|
c7519a237a
|
Assertions fail on HIP for unknown reasons - debugging
|
2020-05-24 14:02:47 -04:00 |
|
Peter Boyle
|
32be2b13d3
|
Updates for HIP
|
2020-05-24 14:00:55 -04:00 |
|
Peter Boyle
|
92b342a477
|
Hip reduction too
|
2020-05-24 13:50:28 -04:00 |
|
Peter Boyle
|
556da86ac3
|
HIP fp16
|
2020-05-24 13:41:58 -04:00 |
|
Peter Boyle
|
8285e41574
|
View location / access mode
|
2020-05-21 16:14:41 -04:00 |
|
Peter Boyle
|
f999408e92
|
View location and access mode
|
2020-05-21 16:14:20 -04:00 |
|
Peter Boyle
|
a7abda89e2
|
View location & access mode
|
2020-05-21 16:13:59 -04:00 |
|
Peter Boyle
|
7860a50f70
|
Make view specify where and drive data motion - first cut.
This is a compile-time option --enable-unified=yes/no
|
2020-05-21 16:13:16 -04:00 |
|
nmeyer-ur
|
6ddcef1bca
|
fix build error enabling fcmla/mac in vector types for VLA
|
2020-05-21 21:21:03 +02:00 |
|
nmeyer-ur
|
8c5a5fdfce
|
disable fcmla in vector type building for VLA
|
2020-05-21 19:41:42 +02:00 |
|
nmeyer-ur
|
046b1cbbc0
|
enable fcmla in tensor arithmetics; fixed-size works, VLA does not compile
|
2020-05-21 19:39:07 +02:00 |
|
nmeyer-ur
|
a65ce237c1
|
clean up; Exch1 VLA sp+dp integrate, tested, working
|
2020-05-21 09:48:06 +02:00 |
|
nmeyer-ur
|
cd27f1005d
|
clean up; Exch1 sp integrate, tested, working
|
2020-05-21 08:45:43 +02:00 |
|
nmeyer-ur
|
f8c0a59221
|
clean up; Exch1 dp integrate, tested, working
|
2020-05-21 02:48:14 +02:00 |
|
nmeyer-ur
|
832485699f
|
save some cycles in HtoD and DtoH by direct instead of multi-pass conversion
|
2020-05-20 23:04:35 +02:00 |
|
nmeyer-ur
|
81484a4760
|
symmetrize Mult and MultAddComplex
|
2020-05-20 22:36:45 +02:00 |
|
nmeyer-ur
|
9a86059761
|
symmetrize VLA and fixed size build messages
|
2020-05-20 20:05:42 +02:00 |
|
nmeyer-ur
|
b780b7b7a0
|
guard prevents multiple TOFU messages
|
2020-05-20 19:20:59 +02:00 |
|
nmeyer-ur
|
9e085bd04e
|
guard prevents multiple A64FX build messages
|
2020-05-20 19:16:30 +02:00 |
|
ferben
|
6c6812a5ca
|
GB/s output
|
2020-05-20 12:26:57 +01:00 |
|
Christoph Lehner
|
8358ee38c4
|
pull develop
|
2020-05-19 08:56:18 -04:00 |
|
ferben
|
1f154fe652
|
some cleanup in BaryonUtils
|
2020-05-19 13:48:56 +01:00 |
|
ferben
|
d708c0258d
|
some cleanup in BaryonUtils
|
2020-05-19 13:48:00 +01:00 |
|