Peter Boyle
b473405652
Tensor ambiguous fix
2019-08-29 09:36:41 -05:00
Peter Boyle
9b7a6d197f
Fix for GCC preprocessor/pragma handling bug
2019-08-23 14:37:46 +01:00
Peter Boyle
28d6be2a4e
Fix GCC complaint
2019-08-22 18:56:37 +01:00
Peter Boyle
be37dfb6f8
Remove debug code
2019-08-15 01:31:40 +01:00
Peter Boyle
e279b2be29
Merge develop
2019-08-14 23:01:59 +01:00
Peter Boyle
48e6efc7c9
Merge branch 'develop' into feature/gpu-port
Conflicts:
Grid/qcd/action/fermion/WilsonKernelsAsm.cc
Grid/qcd/action/fermion/implementation/ImprovedStaggeredFermionImplementation.h
Grid/qcd/action/fermion/implementation/StaggeredKernelsAsm.h
benchmarks/Benchmark_comms.cc
2019-08-14 18:56:54 +01:00
Peter Boyle
3e49dc8a67
Reduction finished; hopefully fixes the CI regression failure on single precision and force
2019-08-14 15:18:34 +01:00
Peter Boyle
96ac56cace
Double precision variants for summation accuracy
2019-08-14 13:08:01 +01:00
Peter Boyle
ce97638bac
Think the reduction is now sorted and cleaned up
2019-08-11 11:09:01 +01:00
Peter Boyle
53e3ab4131
Fix force term
2019-08-11 11:06:13 +01:00
Peter Boyle
9cd33a7b9c
Printing improvement
2019-07-31 08:01:24 +01:00
Peter Boyle
639dc1ab21
GPU reduction fix and also exit backtrace option
2019-07-31 01:23:23 +01:00
Peter Boyle
9117f61109
GPU friendly
2019-07-31 01:22:54 +01:00
Peter Boyle
9dad7a0094
Reproducible reduction and axpy_norm offload from Gianluca.
Hopefully get CG running entirely on GPU
2019-07-30 00:14:12 +01:00
Peter Boyle
8c6016f717
Merge pull request #219 from mmphys/feature/include
Housekeeping. #include <Grid.h> ---> #include <Grid/Grid.h>
2019-07-29 23:08:01 +01:00
Peter Boyle
1282e1067f
Do the force term on the accelerator too. Needed particularly because comms buffers are device memory.
2019-07-29 22:58:35 +01:00
Peter Boyle
275c1c920f
More info dump on error from CUDA
2019-07-26 12:18:53 +01:00
Peter Boyle
fe700a183a
Getting HMC to run
2019-07-26 12:18:29 +01:00
Peter Boyle
34108296cd
Merge branch 'develop' into feature/gpu-port
Conflicts:
Grid/simd/Grid_avx512.h
2019-07-20 17:05:35 +01:00
Peter Boyle
76c704b84b
Intrinsics for CLANG are now fixed in v6
2019-07-20 16:52:24 +01:00
Peter Boyle
ce255ec359
Relocate to fix build failure for comms none
2019-07-20 16:37:03 +01:00
Peter Boyle
1c096626cb
Hypercube defaults to on if HPE detected, but override to off possible
2019-07-20 16:06:16 +01:00
Peter Boyle
25ba4c5f80
Merge branch 'develop' into feature/gpu-port
Conflicts:
HMC/Mobius2p1fEOFA.cc
tests/forces/Test_rect_force.cc
2019-07-19 11:01:55 +01:00
Peter Boyle
9e926e3fc5
Build fix in develop
2019-07-19 10:01:52 +01:00
Peter Boyle
775eaee199
Fix for suspected Intel 2018.1 compiler bug under O3
2019-07-19 07:57:34 +01:00
Peter Boyle
331f5a53dc
New header
2019-07-18 14:51:09 +01:00
Peter Boyle
a23dc295ac
Remove compiler errors and warnings
2019-07-18 14:47:02 +01:00
Peter Boyle
08904f830e
Merge develop
2019-07-16 11:59:56 +01:00
Peter Boyle
fa9cd50c5b
Merge branch 'develop' into feature/gpu-port
2019-07-16 11:55:17 +01:00
Peter Boyle
42c1dbb1d1
General local stencil first cut for Patrick force term
2019-07-14 14:04:28 +01:00
Peter Boyle
6179acfda0
Put back a call that was required
2019-07-14 13:59:54 +01:00
Peter Boyle
07601ac1f5
Replace instantiation of Gparity
2019-07-12 17:18:12 +01:00
Peter Boyle
705a8098b2
Merge branch 'feature/gpu-port' of https://github.com/paboyle/Grid into feature/gpu-port
Conflicts:
Grid/stencil/Stencil.h
2019-07-12 17:14:11 +01:00
Peter Boyle
a29b43d755
Stencil comms cleaner
2019-07-12 17:12:25 +01:00
Peter Boyle
368c8369ce
Merge branch 'feature/gpu-port' of https://github.com/paboyle/Grid into feature/gpu-port
2019-07-12 17:11:29 +01:00
Peter Boyle
78ebd93281
CUDA 9.1 happy
2019-07-12 17:11:00 +01:00
Peter Boyle
3d58daf70f
Safety check
2019-07-12 17:10:35 +01:00
Peter Boyle
bd155ca5c0
Overlap comms with compute now supported
2019-07-12 09:09:40 +01:00
Peter Boyle
91e2cf9b40
All axes can be used for comms now
2019-07-12 09:08:26 +01:00
Peter Boyle
3cc9947731
Better welcome printing
2019-07-12 06:47:51 +01:00
Peter Boyle
f15eeb0283
localise scope of variables declared in macro
2019-07-12 06:47:01 +01:00
Peter Boyle
0996ba9396
Pretty messaging
2019-07-12 06:45:31 +01:00
Peter Boyle
44170cc15f
Initialise CUDA device prior to entering MPI.
This may or may not interact with Summit, which configures the MPI-to-CUDA mapping with jsrun.
TBD
The OpenMPI and MVAPICH cases are covered; otherwise it defaults to cudaSetDevice(0)
2019-07-11 03:14:23 +01:00
Peter Boyle
6e3c3214a3
Offload loops
2019-07-02 17:25:40 +01:00
Peter Boyle
d6ffadb33b
Coalesced write
2019-07-02 17:25:13 +01:00
Peter Boyle
b8f7bfbb26
Don't stream, as performance is poor in some cases
2019-07-01 07:30:25 +01:00
Peter Boyle
7b7c470917
Accelerator loop
2019-07-01 07:29:51 +01:00
Peter Boyle
532e226b22
CUDA 9.1 fixes
2019-07-01 07:29:22 +01:00
Peter Boyle
6a13731818
Move GPU cuda call earlier
2019-07-01 07:28:41 +01:00
Peter Boyle
1cd4ee0706
Thrust used on GPU builds
2019-06-18 12:50:35 +01:00