paboyle
408b868475
Generic for GPU needs accelerator markup of functions
2018-01-24 13:49:12 +00:00
paboyle
1c797deb04
Accelerator tweaks
2018-01-24 13:43:43 +00:00
paboyle
b9d5a42b57
Should be able to eliminate the COMMA_SAFE with VA_ARGS trick; revisit this file
2018-01-24 13:42:06 +00:00
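Note on the VA_ARGS trick mentioned in b9d5a42b57: the following is a minimal sketch only, not the macro used in this repository. The macro name `simple_loop` and the function `sum_pairs` are hypothetical. It assumes the problem being worked around is the usual one, where a loop body passed as a macro argument is split at any comma that is not inside parentheses. A variadic macro collects all trailing arguments, commas included, so the body arrives intact.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical loop macro (illustration only, not the project's definition).
// "..." gathers every token after the second argument, including commas,
// and __VA_ARGS__ pastes them back unchanged as the loop body.
#define simple_loop(i, n, ...) \
  for (int i = 0; i < (n); ++i) { __VA_ARGS__ }

// Hypothetical helper that uses a body containing an unparenthesised comma.
int sum_pairs(int n) {
  assert(n >= 0);
  std::vector<int> out(n);
  simple_loop(i, n, {
    std::pair<int, int> p(i, 2 * i);  // comma in the template argument list
    out[i] = p.first + p.second;      // stores 3 * i
  });
  int total = 0;
  for (int v : out) total += v;
  return total;
}
```

With a fixed three-argument macro, the comma in `std::pair<int, int>` would split the body into two arguments and fail to compile unless it were wrapped in a comma-protecting helper.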
paboyle
e737591918
Accelerator loops
2018-01-24 13:41:12 +00:00
paboyle
ba5ea5830b
Accelerator loops
2018-01-24 13:40:56 +00:00
paboyle
43f244badf
Thread loops for now; figure out what can be GPU accelerated later here
2018-01-24 13:40:30 +00:00
paboyle
e9c8ba5ef7
Accelerator loops
2018-01-24 13:39:54 +00:00
paboyle
d70709a8e8
Thread construct changes
2018-01-24 13:39:06 +00:00
paboyle
733f8ff0b2
Still using parallel_for -- don't know how to implement reduction on GPU yet. Looking at some sample code is best.
2018-01-24 13:38:13 +00:00
paboyle
0bfa5bb213
Accelerator loops
2018-01-24 13:37:26 +00:00
paboyle
1f26a234f9
CPU loops explicit for peek poke
2018-01-24 13:36:31 +00:00
paboyle
13f0116425
Accelerator loops
2018-01-24 13:35:55 +00:00
paboyle
25f589b064
Accelerator loops
2018-01-24 13:35:36 +00:00
paboyle
210c50a278
Accelerator prep work
2018-01-24 13:35:13 +00:00
paboyle
549a143e78
Accelerator related
2018-01-24 13:34:46 +00:00
paboyle
277301486d
Simple warning elimination
2018-01-24 13:34:15 +00:00
paboyle
c851b39a49
Nicer way of including aggregate
2018-01-24 13:33:34 +00:00
paboyle
15cc12eb6c
Delete the old non ET file
2018-01-24 13:33:07 +00:00
paboyle
ae4f1f8c12
New file, split out two from Lattice_reduction
2018-01-24 13:32:43 +00:00
paboyle
5609624b44
Threading constructs replaced
2018-01-24 13:32:24 +00:00
paboyle
b5a947dd79
Change to make NVCC happy
2018-01-24 13:32:02 +00:00
paboyle
ee16f62322
Stray semicolon elimination. NVCC is picky, but eventually picked up these diags with a pragma to suppress
2018-01-24 13:31:17 +00:00
paboyle
3318de27d6
Thread macro changes
2018-01-24 13:30:23 +00:00
paboyle
ac56965306
GPU changes and threading macros replaced
2018-01-24 13:28:30 +00:00
paboyle
8e99264f40
Accelerator markup of entire tensor space for offload
2018-01-24 13:27:30 +00:00
paboyle
69327db9a9
Improvements for NVCC. Eigen is not compatible with CUDA 9, so a hack is needed to disable device compilation
2018-01-24 13:25:07 +00:00
paboyle
7331ee2d80
Warnings control to overpower the NVCC compiler
2018-01-24 13:24:36 +00:00
paboyle
918c105c57
NVCC warning elimination
2018-01-24 13:23:59 +00:00
paboyle
be1511d469
Remove old macros for threading
2018-01-24 13:23:24 +00:00
paboyle
f1c31df9d2
Updated Eigen version. Still didn't fix the CUDA 9 compile failure.
Worked around by switching off __NVCC__ during the include of Eigen and switching it
back on after. Note that no Eigen code can be offloaded as a result of this. No harm done.
2018-01-24 13:19:29 +00:00
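Note on the workaround described in f1c31df9d2: the commit message does not say how the macro is switched off and back on. One common way to do this is with the `push_macro` / `pop_macro` pragmas, which GCC, Clang and MSVC support. The sketch below is an assumption about the technique, not the code from this commit. It uses a hypothetical stand-in macro `DEMO_DEVICE_FLAG` in place of `__NVCC__`, and a preprocessor check in place of the third-party header, so that it compiles without CUDA or Eigen.

```cpp
#include <cassert>

// Hypothetical stand-in for a compiler-provided macro such as __NVCC__.
#define DEMO_DEVICE_FLAG 1

// Save the current definition, then hide it.
#pragma push_macro("DEMO_DEVICE_FLAG")
#undef DEMO_DEVICE_FLAG

// A third-party header would be included here; it sees no DEMO_DEVICE_FLAG.
#ifdef DEMO_DEVICE_FLAG
constexpr bool seen_by_header = true;
#else
constexpr bool seen_by_header = false;
#endif

// Restore the saved definition for the rest of the translation unit.
#pragma pop_macro("DEMO_DEVICE_FLAG")

#ifdef DEMO_DEVICE_FLAG
constexpr bool seen_after = true;
#else
constexpr bool seen_after = false;
#endif
```

Code included while the macro is hidden is compiled as plain host code, which matches the commit's remark that no Eigen code can be offloaded.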
paboyle
ff7b587fad
Ugly... nvcc needs -x cu to compile .cc as CUDA.
Since CXXFLAGS is also passed to the linker, and -x cu breaks the link phase, CXX and CXXLD
must be replaced with nvcc -x cu and nvcc -link respectively.
2018-01-24 13:18:19 +00:00
paboyle
4e1135b214
Updated pugixml to v1.8; still didn't fix the compile failure under nvcc.
Turns out nvcc was right; must do an explicit template instantiation that was missing
but left gcc, icpc and clang happy for some reason.
Fix this.
2018-01-24 13:17:10 +00:00
paboyle
acd4955a18
remove rdtsc on __NVCC__ as may be device called
2018-01-24 13:16:18 +00:00
paboyle
bd08dc4f45
Pragma use for nvcc, warning elimination.
2018-01-24 13:15:43 +00:00
paboyle
22d137d4e5
Namespace, nvcc warning elimination.
2018-01-24 13:14:43 +00:00
paboyle
87ee592176
Pragma changes and layout and warning elimination for nvcc
2018-01-24 13:14:09 +00:00
paboyle
063603b1ea
Warning elimination
2018-01-24 13:12:14 +00:00
paboyle
f292106db6
Split out pragmas from threads.h;
More work needed; rename threads directory to "parallelism" or something like that
2018-01-24 13:11:04 +00:00
paboyle
9d08aebea9
Compile through nvcc; warning elimination fixes
2018-01-24 13:09:53 +00:00
paboyle
4e30739093
First compile OK through nvcc on host
2018-01-24 13:08:47 +00:00
a1151fc734
Hadrons: MPI-safe serial IO
2018-01-23 17:26:50 +00:00
James Harrison
ab3baeb38f
Implement contractions and data output in functions; calculate diagrams S, X and 4C separately; output 2E and 2T instead of sunset_shifted, sunset_unshifted, tadpole_shifted, tadpole_unshifted; add comments.
2018-01-23 17:07:45 +00:00
Vera Guelpers
389731d373
changed SeqConservedSummed.hpp to work with new hadrons interface
2018-01-23 10:11:33 +00:00
6e3ce7423e
Hadrons: don't display module list at startup (too long)
2018-01-22 20:04:05 +00:00
15f15a7cfd
Merge branch 'develop' into feature/hadrons
# Conflicts:
# extras/Hadrons/Modules.hpp
# extras/Hadrons/modules.inc
2018-01-22 20:03:36 +00:00
0e5f626226
Hadrons: module for scalar operator divergence
2018-01-22 19:38:19 +00:00
Daniel Richtmann
04f92ccddf
WilsonMG: Provide a fix for the previous commit; compiles and runs successfully now
I don't like the solution with the temporary very much though ...
2018-01-22 14:56:48 +01:00
Daniel Richtmann
3b2d805398
WilsonMG: Some first steps towards coarse spin dofs; not compiling yet
A failing conversion from the innermost type (Grid::Simd<...>) to a coarse
scalar (triple iScalar) in blockPromote prohibits this commit from working.
2018-01-22 12:45:51 +01:00
Azusa Yamaguchi
97b9c6f03d
No option for interior/exterior split of asm kernels since different directions get interleaved
2018-01-22 11:04:19 +00:00
Azusa Yamaguchi
63982819c6
No option to overlap comms and compute for asm implementation since different directions are interleaved
in the kernels; introducing an if/else structure would be too painful
2018-01-22 11:03:39 +00:00