Christoph Lehner
|
d61ee817f4
|
Merge pull request #13 from DanielRichtmann/feature/gpt-coarsenedmatrix
Changes needed for GPT MG
|
2020-08-27 12:11:06 +02:00 |
|
Peter Boyle
|
3448b7387c
|
Almost there to coalesced ET
|
2020-08-26 17:04:49 -04:00 |
|
Peter Boyle
|
47b89d2739
|
Pragma protection improvement
|
2020-08-26 17:04:27 -04:00 |
|
Christoph Lehner
|
2a75516330
|
state MPI/SLURM message only on world_rank zero
|
2020-08-26 12:34:17 -04:00 |
|
Daniel Richtmann
|
b2087f14c4
|
Fix CoarsenedMatrix regarding illegal memory accesses
Need a reference to geom since the lambda copies the this pointer which points to host memory, see
- https://docs.nvidia.com/cuda/cuda-c-programming-guide/#star-this-capture
- https://devblogs.nvidia.com/new-compiler-features-cuda-8/
|
2020-08-24 17:46:47 +02:00 |
|
Daniel Richtmann
|
dd1ba266b2
|
Fix mapping between dir + disp and point in CMat
|
2020-08-24 17:46:46 +02:00 |
|
Daniel Richtmann
|
1292d59563
|
Add a typedef + broaden interface of CMat
|
2020-08-24 17:46:45 +02:00 |
|
Christoph Lehner
|
9877ed9bf8
|
Merge pull request #12 from paboyle/develop
Sync
|
2020-08-22 16:35:35 +02:00 |
|
Christoph Lehner
|
f0dc0f3621
|
fix compile issue on Qpace3
|
2020-08-22 13:57:33 +02:00 |
|
Peter Boyle
|
1efe30d6cc
|
SLURM stop nodes using same GPU
|
2020-08-21 02:02:53 +02:00 |
|
Peter Boyle
|
0b787e9fe0
|
Avoid namespace collision to make gcc happy
|
2020-08-20 22:23:29 +02:00 |
|
Peter Boyle
|
37ec4b241c
|
Default thread count sensible
|
2020-08-20 22:12:31 +02:00 |
|
Christoph Lehner
|
06007db3d9
|
true shm_none implementation with GPUs that disables the use of device shared memory for the stencils
|
2020-08-14 18:37:00 +02:00 |
|
Christoph Lehner
|
12e6059a70
|
Merge branch 'feature/gpt' of https://github.com/lehner/Grid into feature/gpt
|
2020-08-13 16:16:52 +02:00 |
|
Christoph Lehner
|
dbaa24ebf6
|
further GPU memory access fixes (with this GPT passes all single-rank tests on non-summit GPUs)
|
2020-08-13 16:14:15 +02:00 |
|
Christoph Lehner
|
3b30b9f0c0
|
Merge branch 'feature/gpt' of https://github.com/lehner/Grid into feature/gpt
|
2020-08-06 16:59:17 +02:00 |
|
Christoph Lehner
|
69db4816f7
|
fix variable capture in Scatter_plane_merge on accelerators
|
2020-08-06 16:57:16 +02:00 |
|
Christoph Lehner
|
3abe09025a
|
when using SHM_NONE allow multiple ranks per node but without using shared memory
|
2020-08-06 14:42:38 +02:00 |
|
Christoph Lehner
|
e33878e0de
|
Trigger re-run of CI
|
2020-08-06 11:50:24 +02:00 |
|
Christoph Lehner
|
27b4fbf3f0
|
assert for forbidden code path and fix check for faster CPU codepath in basisRotate
|
2020-08-03 07:57:33 -04:00 |
|
Christoph Lehner
|
968a90633a
|
Zero -> zeroit in Tensor_index
|
2020-07-31 02:07:17 -04:00 |
|
Christoph Lehner
|
6365a89ba3
|
create separate InitMessage for MemoryManager that can be called after communicator setup
|
2020-07-30 07:25:05 -04:00 |
|
Christoph Lehner
|
197612bc7a
|
fast cpu basisRotate and other small cleanups
|
2020-07-30 07:08:54 -04:00 |
|
Christoph Lehner
|
0e88bf4bff
|
remove Nils's default pragma
|
2020-07-29 10:24:35 -04:00 |
|
Christoph Lehner
|
3e64d78469
|
include versions.h again and add back asserts in Test_simd
|
2020-07-29 10:18:05 -04:00 |
|
nmeyer-ur
|
ea7f8fda5e
|
fix typo
|
2020-07-22 09:34:05 +02:00 |
|
nmeyer-ur
|
906b78811b
|
exit in Init when using --comms-overlap
|
2020-07-22 08:57:01 +02:00 |
|
nmeyer-ur
|
d9474c6cb6
|
compiler-independent build using --enable-simd=A64FX
|
2020-07-09 10:07:02 +02:00 |
|
nmeyer-ur
|
bbd145382b
|
enable --enable-simd=A64FX in configure
|
2020-07-08 12:43:51 +02:00 |
|
nmeyer-ur
|
1b08cb7300
|
Merge branch 'develop' into feature/a64fx-2
|
2020-07-08 08:18:18 +02:00 |
|
nmeyer-ur
|
8726e94ea7
|
merge upstream develop
|
2020-07-07 20:26:47 +02:00 |
|
|
f1f655d92b
|
Merge pull request #304 from Heinrich-BR/develop
ScalarImpl.h updates
|
2020-07-06 10:16:03 +01:00 |
|
|
43334e88c3
|
Tiny change in a comment for clarity
|
2020-07-04 16:11:16 +01:00 |
|
|
4f1e66b044
|
Fixed HMC SU(N) integrator which was causing fields to leave Lie Algebra manifold for N>2
|
2020-07-04 03:53:06 +01:00 |
|
nmeyer-ur
|
1635c263ee
|
disable TOFU by default
|
2020-06-30 19:27:08 +02:00 |
|
|
eb470aa6dc
|
Update to baryon and added comments/fix whitespace
|
2020-06-29 09:43:01 +01:00 |
|
|
77af9a3ddc
|
Baryon revert sign
|
2020-06-26 10:08:42 +01:00 |
|
|
102089798c
|
BaryonUtils: update to autoView
|
2020-06-25 16:41:58 +01:00 |
|
|
39cea8b5a7
|
Merge branch 'develop' into feature/baryon
|
2020-06-25 16:24:07 +01:00 |
|
|
a65f66d2db
|
Merge branch 'feature/baryon3pt' into feature/baryon
|
2020-06-25 16:20:59 +01:00 |
|
Peter Boyle
|
936c5ecf69
|
Reduction GPU no compile fix
|
2020-06-24 17:28:31 -04:00 |
|
Peter Boyle
|
22cfbdbbb3
|
Boost precision in inner products in single
|
2020-06-24 12:52:31 -04:00 |
|
Peter Boyle
|
093d1ee21b
|
Force initial values
|
2020-06-24 08:54:49 -04:00 |
|
Peter Boyle
|
d6ba2581ce
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2020-06-24 08:25:08 -04:00 |
|
Peter Boyle
|
577c064184
|
Memory manager initialise earlier
|
2020-06-24 08:24:38 -04:00 |
|
Peter Boyle
|
2ff1fa6fad
|
UVM used shared for CPU allocations and don't migrate
|
2020-06-23 22:14:56 -04:00 |
|
|
4ef50ba31f
|
Baryon speedup
|
2020-06-23 11:44:20 +01:00 |
|
|
3e97a26f90
|
BaryonGamm3pt threads -> accelerator
|
2020-06-23 11:35:32 +01:00 |
|
|
599f28f6ef
|
Baryon bug fixes
|
2020-06-23 11:10:26 +01:00 |
|
Peter Boyle
|
c48da35921
|
Memory Vector UVM and Lattice alignedAllocator separate
|
2020-06-22 20:21:53 -04:00 |
|