d7bef70b5c
Helper functions to allow probe of cache state of lattice objects.
2021-11-09 12:57:09 +00:00
Peter Boyle
e9c4f06cbf
Merge pull request #370 from fjosw/bugfix/gpu_sum_shm
...
Error Handling sum_Dgpu large objects
2021-10-14 09:12:47 -04:00
1f9688417a
Error message added when attempting to sum object which is too large for the shared memory
2021-10-13 20:45:46 +01:00
7e130076d6
Fixed line left behind
2021-09-24 17:26:31 +01:00
a822c48565
Added accelerated pick-set checkerboard functions
2021-09-24 17:13:25 +01:00
Christoph Lehner
dd091d0960
consistent pointer offloading instead of views
2021-09-15 16:58:05 +02:00
Christoph Lehner
e2abbf9520
Merge pull request #25 from paboyle/develop
...
Sync
2021-09-15 10:02:43 +02:00
Christoph Lehner
2bb374daea
hip-friendly
2021-03-19 11:33:23 +01:00
Peter Boyle
db3ac67506
Update thread issue
2021-03-12 14:55:07 +01:00
Peter Boyle
ce1fc1f48a
Possible fallback plan for Fionn's compiler bug in nvcc
2021-03-11 22:20:53 +01:00
Thomas Wurm
9e5fb52eb9
Put GlobalSum outside the slice loop
2021-03-08 13:53:34 +01:00
Michael Marshall
1059a81a3c
Merge branch 'develop' into bugfix/LatTransfer
...
* develop:
Better SIMD usage/coalescence
2021-02-27 00:21:36 +00:00
Peter Boyle
f9b1f240f6
Better SIMD usage/coalescence
2021-02-26 17:51:41 +01:00
Michael Marshall
69f41469dd
Merge branch 'develop' into bugfix/LatTransfer
...
* develop: (26 commits)
Added the ability to apply a custom "filter" to the conjugate momentum in the Integrator classes, applied both after refresh and after applying the forces. Added a conjugate momentum "filter" that applies a phase to each site. With sites set to 1.0 or 0.0 this acts as a mask and enables, for example, the freezing of inactive gauge links in DDHMC. Added tests/forces/Test_momentum_filter demonstrating the use of the filter to freeze boundary links.
Correct misleading ac help string
Enable performance counting in WilsonFermion like in others
changed back A2AUtils warning
changed if and accelerator_for - no runtime errors any more
Mac OS (Darwin) sed -i flag for in-place editing differs from posix / gnu
Seems the intention with AutoConf produced Grid/Config.h was to use sed to translate standard PACKAGE_ #defines into GRID_ however due to missing '' after -i this hasn't been working. Perhaps it is too late to fix this, since we don't know who/what is relying on this downstream? ... but if they are, and AutoConf is being used, then likely these #defines have just been redefined anyway. Seems reasonable to redefine PACKAGE and VERSION as well, as none of these macros are used throughout Grid or Hadrons.
Fixed compile issues with maxLocalNorm2 for non-scalar lattices; maxLocalNorm2 test now reuses the random field
MADWF 5d source option for hadrons - look at Grid of source; Abort on GPU error
maxLocalNorm2()
change back benchmark_ITT
prettify
Flop count matches DiRAC-ITT-2020
revert changes
merge develop
fixes
weird bug in 2pt function...
revert changes
final version, tested on CPU and GPU
bugfix
...
2021-02-25 09:19:17 +00:00
Christopher Kelly
55de69a569
Fixed compile issues with maxLocalNorm2 for non-scalar lattices
...
maxLocalNorm2 test now reuses the random field
2021-02-08 12:03:16 -05:00
Peter Boyle
cd99edcc5f
maxLocalNorm2()
2021-02-04 18:25:49 -05:00
Michael Marshall
3215d88a91
Simplify syntax with Grid::EnableIf post code review. Updated EnableIf so that ReturnType defaults to void in same way as std::enable_if see https://en.cppreference.com/w/cpp/types/enable_if
2021-02-03 15:17:03 +00:00
Michael Marshall
77063418da
Fix issue for GPU by ensuring accelerator_inline version of convertType is available for Grid::complex<T>. This removes many warnings in Hadrons
...
Simplify the SFINAE syntax and correct convertType for iScalar
2021-01-25 15:09:36 +00:00
Peter Boyle
cf76741ec6
Intel DPCPP Gold happy now (compiles all, runs Benchmark_dwf_fp32)
2020-12-03 03:47:11 -08:00
Christoph Lehner
4ea8d128c2
Merge pull request #18 from paboyle/develop
...
Sync
2020-11-20 15:36:50 +01:00
Peter Boyle
a0ccbb3bd6
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2020-11-01 01:16:35 +00:00
Peter Boyle
5eeabaa2bb
HIP fix
2020-11-01 01:16:01 +00:00
Peter Boyle
cc9c993f74
Project on group fix on GPU tracked to reciprocal sqrt collision between CUDA and Grid rsqrt
2020-10-31 18:12:47 -04:00
Peter Boyle
3362f8dfa0
happy compile
2020-10-14 22:59:41 -04:00
Peter Boyle
a88b3ceca5
Closure cases
2020-10-14 21:33:51 -04:00
Peter Boyle
aa135412f5
toComplex, toReal
2020-10-13 22:25:01 -04:00
Peter Boyle
9945399e60
Reality issues fix by drop from ET
2020-10-13 22:24:32 -04:00
Peter Boyle
5eeffa49e8
Reality forced included
2020-10-13 22:23:57 -04:00
Christoph Lehner
80fd6ab407
Merge pull request #17 from paboyle/develop
...
sync upstream
2020-10-06 09:01:39 +02:00
Peter Boyle
81441e98f4
HIP runs sensible
2020-09-16 03:35:03 +01:00
Peter Boyle
ecd3f890f5
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2020-09-16 02:30:14 +01:00
Peter Boyle
288c615782
Hip improvements
2020-09-16 00:31:50 +01:00
Christoph Lehner
51d1beb1f3
Merge pull request #15 from paboyle/develop
...
Sync with upstream
2020-09-07 14:20:33 +02:00
Peter Boyle
c273fb051c
Peek poke lattice
2020-09-01 15:27:59 -04:00
Peter Boyle
e21fef17df
real and imag part not in ET
2020-08-31 23:56:26 -04:00
Peter Boyle
7d14a3c086
Where working
2020-08-31 23:53:46 -04:00
Peter Boyle
9522dcd611
Remove dead commented out code
2020-08-31 23:40:29 -04:00
Peter Boyle
ed469898dc
coalesced ET expressions
2020-08-31 23:38:40 -04:00
Peter Boyle
1eee94a809
Sorting real/im in read coalesced GPU ET
2020-08-31 23:36:49 -04:00
Peter Boyle
3448b7387c
Almost there to coalesced ET
2020-08-26 17:04:49 -04:00
Christoph Lehner
f0dc0f3621
fix compile issue on Qpace3
2020-08-22 13:57:33 +02:00
Christoph Lehner
dbaa24ebf6
further GPU memory access fixes (with this GPT passes all single-rank tests on non-summit GPUs)
2020-08-13 16:14:15 +02:00
Christoph Lehner
27b4fbf3f0
assert for forbidden code path and fix check for faster CPU codepath in basisRotate
2020-08-03 07:57:33 -04:00
Christoph Lehner
197612bc7a
fast cpu basisRotate and other small cleanups
2020-07-30 07:08:54 -04:00
Peter Boyle
936c5ecf69
Reduction GPU no compile fix
2020-06-24 17:28:31 -04:00
Peter Boyle
22cfbdbbb3
Boost precision in inner products in single
2020-06-24 12:52:31 -04:00
Peter Boyle
b949cf6b12
PeekLocal needs a view to keep thread safe.
...
ALLOCATION_CACHE reenable
2020-06-19 17:13:27 -04:00
Christoph Lehner
b5e87e8d97
summit compile fixes
2020-06-12 18:16:12 -04:00
Christoph Lehner
5f5807d60a
cleanup
2020-06-12 14:48:23 -04:00
Christoph Lehner
7974acff54
merged sycl to feature-gpt
2020-06-12 06:49:38 -04:00