Peter Boyle
9dad7a0094
Reproducible reduction and axpy_norm offload from Gianluca.
Hopefully get CG running entirely on GPU
2019-07-30 00:14:12 +01:00
Peter Boyle
8c6016f717
Merge pull request #219 from mmphys/feature/include
Housekeeping. #include <Grid.h> ---> #include <Grid/Grid.h>
2019-07-29 23:08:01 +01:00
Peter Boyle
1282e1067f
Do the force term on the accelerator too. Needed particularly because comms buffers are device memory.
2019-07-29 22:58:35 +01:00
Peter Boyle
275c1c920f
More info dump on error from CUDA
2019-07-26 12:18:53 +01:00
Peter Boyle
fe700a183a
Getting HMC to run
2019-07-26 12:18:29 +01:00
Peter Boyle
34108296cd
Merge branch 'develop' into feature/gpu-port
Conflicts:
Grid/simd/Grid_avx512.h
2019-07-20 17:05:35 +01:00
Peter Boyle
76c704b84b
Intrinsics for CLANG are now fixed in v6
2019-07-20 16:52:24 +01:00
Peter Boyle
ce255ec359
Relocate to fix build failure for comms none
2019-07-20 16:37:03 +01:00
Peter Boyle
1c096626cb
Hypercube defaults to on if HPE detected, but override to off possible
2019-07-20 16:06:16 +01:00
Peter Boyle
25ba4c5f80
Merge branch 'develop' into feature/gpu-port
Conflicts:
HMC/Mobius2p1fEOFA.cc
tests/forces/Test_rect_force.cc
2019-07-19 11:01:55 +01:00
Peter Boyle
9e926e3fc5
Build fix in develop
2019-07-19 10:01:52 +01:00
Peter Boyle
775eaee199
Fix for suspected Intel 2018.1 compiler bug under O3
2019-07-19 07:57:34 +01:00
Felix Erben
56cefadf9b
gamma matrices as input
2019-07-18 17:46:43 +01:00
Peter Boyle
331f5a53dc
New header
2019-07-18 14:51:09 +01:00
Peter Boyle
a23dc295ac
Remove compiler errors and warnings
2019-07-18 14:47:02 +01:00
ferben
11a8668d19
bugfix in Baryonutils
2019-07-18 14:44:55 +01:00
ferben
cded7670d0
new utils for baryons
2019-07-18 14:29:04 +01:00
ferben
feb029fb66
new utils for baryons
2019-07-18 14:24:16 +01:00
Peter Boyle
08904f830e
Merge develop
2019-07-16 11:59:56 +01:00
Peter Boyle
fa9cd50c5b
Merge branch 'develop' into feature/gpu-port
2019-07-16 11:55:17 +01:00
Peter Boyle
42c1dbb1d1
General local stencil first cut for Patrick force term
2019-07-14 14:04:28 +01:00
Peter Boyle
6179acfda0
Put back a call that was required
2019-07-14 13:59:54 +01:00
Peter Boyle
07601ac1f5
Replace instantiation of Gparity
2019-07-12 17:18:12 +01:00
Peter Boyle
705a8098b2
Merge branch 'feature/gpu-port' of https://github.com/paboyle/Grid into feature/gpu-port
Conflicts:
Grid/stencil/Stencil.h
2019-07-12 17:14:11 +01:00
Peter Boyle
a29b43d755
Stencil comms cleaner
2019-07-12 17:12:25 +01:00
Peter Boyle
368c8369ce
Merge branch 'feature/gpu-port' of https://github.com/paboyle/Grid into feature/gpu-port
2019-07-12 17:11:29 +01:00
Peter Boyle
78ebd93281
Cuda 9.1 happy
2019-07-12 17:11:00 +01:00
Peter Boyle
3d58daf70f
Safety check
2019-07-12 17:10:35 +01:00
Peter Boyle
bd155ca5c0
Overlap comms with compute now supported
2019-07-12 09:09:40 +01:00
Peter Boyle
91e2cf9b40
All axes can be used for comms now
2019-07-12 09:08:26 +01:00
Peter Boyle
3cc9947731
Better welcome printing
2019-07-12 06:47:51 +01:00
Peter Boyle
f15eeb0283
localise scope of variables declared in macro
2019-07-12 06:47:01 +01:00
Peter Boyle
0996ba9396
Pretty messaging
2019-07-12 06:45:31 +01:00
Peter Boyle
44170cc15f
Initialise CUDA device prior to entering MPI.
This may or may not interact with Summit, which configures the MPI-to-CUDA mapping with jsrun; TBD.
The OpenMPI and MVAPICH cases are covered; otherwise the default is cudaSetDevice(0).
2019-07-11 03:14:23 +01:00
Felix Erben
b7d0cf6751
bugfix in diquark sum / baryons
2019-07-04 22:06:37 +01:00
Peter Boyle
6e3c3214a3
Offload loops
2019-07-02 17:25:40 +01:00
Peter Boyle
d6ffadb33b
Coalesced write
2019-07-02 17:25:13 +01:00
Peter Boyle
b8f7bfbb26
Don't stream, as perf is poor in some cases
2019-07-01 07:30:25 +01:00
Peter Boyle
7b7c470917
Accelerator loop
2019-07-01 07:29:51 +01:00
Peter Boyle
532e226b22
cuda 9.1 fixes
2019-07-01 07:29:22 +01:00
Peter Boyle
6a13731818
Move GPU cuda call earlier
2019-07-01 07:28:41 +01:00
fionnoh
67690df3bd
Changes needed to have a current insertion on every second time slice - avoids unnecessary contractions
2019-06-28 15:18:28 +08:00
fionnoh
421a0a8a36
Changes to A2Autils, A2AMatrix and DiskVector code that are needed for the Hadrons 4 quark contraction module
2019-06-27 13:45:20 +08:00
Peter Boyle
1cd4ee0706
Thrust used on GPU builds
2019-06-18 12:50:35 +01:00
Peter Boyle
703dc20377
Compile tests fix
2019-06-16 13:59:29 +01:00
Peter Boyle
d976e5c514
Pow is being awkward in thrust for reasons I don't understand. Possible thrust bug.
2019-06-16 12:05:11 +01:00
Peter Boyle
d7b3efe893
Compile fix
2019-06-15 17:03:15 +01:00
Peter Boyle
0184719216
Change to predicate type
2019-06-15 12:52:26 +01:00
Peter Boyle
24202dbc51
Thread loop construct change
2019-06-15 12:52:07 +01:00
Peter Boyle
d763c303c5
Clean accelerator barrier
2019-06-15 12:51:45 +01:00