paboyle
|
6c97a6a071
|
Coalescing version of the kernel
|
2018-06-13 20:52:29 +01:00 |
|
paboyle
|
73bb2d5128
|
Ugly hack to speed up compile on GPU; we don't use the hand kernels on GPU anyway so why compile
|
2018-06-13 20:35:28 +01:00 |
|
paboyle
|
b710fec6ea
|
GPU code: first version of specialised kernel
|
2018-06-13 20:34:39 +01:00 |
|
paboyle
|
b2a8cd60f5
|
Doubled gauge field is useful
|
2018-06-13 20:27:47 +01:00 |
|
paboyle
|
867ee364ab
|
Explicit instantiation hooks
|
2018-06-13 20:27:12 +01:00 |
|
Peter Boyle
|
eb7d34a4cc
|
GPU version
|
2018-05-14 19:41:47 -04:00 |
|
Peter Boyle
|
aab27a655a
|
Start of GPU kernels
|
2018-05-14 19:41:17 -04:00 |
|
Peter Boyle
|
13f50406e3
|
Suppress print statement
|
2018-05-12 18:00:00 -04:00 |
|
Peter Boyle
|
b15db11c60
|
Kernels -> pure static object to enable device execution
|
2018-03-24 19:35:20 -04:00 |
|
Peter Boyle
|
f6077f9d48
|
Kernels -> not instantiated, otherwise object ref on GPU
|
2018-03-24 19:33:44 -04:00 |
|
Peter Boyle
|
572954ef12
|
Kernels not an instantiated object, just static
|
2018-03-24 19:33:13 -04:00 |
|
Peter Boyle
|
cedeaae7db
|
Lebesgue -> StencilView if necessary
|
2018-03-24 19:32:41 -04:00 |
|
Peter Boyle
|
e6cf0b1e17
|
View typedefs go to OperatorImpl
|
2018-03-24 19:32:11 -04:00 |
|
Peter Boyle
|
1f70cedbab
|
Have to make all kernel called routines static since object reference will be a host pointer on GPU
|
2018-03-24 19:29:26 -04:00 |
|
Peter Boyle
|
b50f37cfb4
|
Remove overlap comms flag
|
2018-03-24 19:28:53 -04:00 |
|
Peter Boyle
|
4e1272fabf
|
Kernels need to be static to work on GPU. No reference to host resident data
|
2018-03-22 18:44:53 -04:00 |
|
Peter Boyle
|
607dc2d3c6
|
Remove lebesgue order
|
2018-03-22 18:23:09 -04:00 |
|
Peter Boyle
|
23c880b009
|
Remove lebesgue order; stick in stencil if needed
|
2018-03-22 18:13:41 -04:00 |
|
Peter Boyle
|
334bb6792f
|
Lebesgue order removed. Stick in the stencil view
|
2018-03-22 18:12:12 -04:00 |
|
Peter Boyle
|
8a1d303ab9
|
GPU friendly stencil improvements
|
2018-03-19 07:11:03 -04:00 |
|
Peter Boyle
|
bf0a4de919
|
GPU friendly params object
|
2018-03-19 07:10:12 -04:00 |
|
paboyle
|
4d60b92b7f
|
Update oSites
|
2018-03-08 21:00:25 +00:00 |
|
paboyle
|
c159c70c84
|
View introduced
|
2018-03-08 14:58:04 +00:00 |
|
paboyle
|
28b5572755
|
Merge branch 'feature/gpu-port' of https://github.com/paboyle/Grid into feature/gpu-port
|
2018-03-08 13:01:42 +00:00 |
|
Peter Boyle
|
4548523ecc
|
This modification eliminates what looks like a compiler bug on Intel 2017.
|
2018-03-08 04:41:16 -08:00 |
|
paboyle
|
4e3458516a
|
Reverting after fixing issue with extract merge
|
2018-03-07 16:50:13 +00:00 |
|
paboyle
|
40699221e2
|
Don't alias lhs and rhs in a where statement
|
2018-03-06 04:14:13 -08:00 |
|
paboyle
|
e199ba7e88
|
Fix the charge conjugate BCs
|
2018-03-05 13:59:02 +00:00 |
|
paboyle
|
44188a5c6f
|
AVX512 fix
|
2018-03-05 00:32:24 +00:00 |
|
paboyle
|
3277bda130
|
View introduction to prepare for accelerator offload.
Probably same problem exists for stencil object
|
2018-03-04 16:38:08 +00:00 |
|
paboyle
|
442b0b406c
|
View related changes
|
2018-03-04 16:34:14 +00:00 |
|
paboyle
|
8824a54269
|
View related changes
|
2018-03-04 16:33:33 +00:00 |
|
paboyle
|
078901278c
|
Coordinate handling GPU friendly
|
2018-02-24 22:22:02 +00:00 |
|
paboyle
|
aa6de818e2
|
Copy data needed by Kernels out of the grid object to avoid host reference
|
2018-02-02 11:36:11 +00:00 |
|
paboyle
|
dcf6517a93
|
Accelerator offload and copy Opt into the kernel for GPU host var safety
|
2018-02-02 11:35:35 +00:00 |
|
paboyle
|
a308dff410
|
accelerator loop, copy Opt into the GPU
|
2018-02-02 11:34:37 +00:00 |
|
paboyle
|
14ba20898a
|
Accelerator loop the key kernel call
|
2018-02-02 11:30:07 +00:00 |
|
paboyle
|
a53d3ee19a
|
Add Opt to the lambda capture to get it into the GPU
|
2018-02-02 11:28:39 +00:00 |
|
paboyle
|
e4df025d01
|
Accelerator related
|
2018-02-01 23:20:05 +00:00 |
|
paboyle
|
cfeda9d536
|
constexpr on const ints
|
2018-02-01 22:59:12 +00:00 |
|
paboyle
|
8ae77d3706
|
Small simplification of FermionOperatorImpl towards GPU but not there yet
|
2018-02-01 22:41:54 +00:00 |
|
paboyle
|
70e276e1ab
|
parallel_for elimination -> thread_loop
|
2018-01-28 01:01:14 +00:00 |
|
paboyle
|
2d0bcc2606
|
Zero changes, accelerator on kernels and some thread loop changes
|
2018-01-27 23:47:38 +00:00 |
|
paboyle
|
c4f82e072b
|
_grid becomes private; use Grid()
|
2018-01-27 00:04:12 +00:00 |
|
paboyle
|
2b4067bb71
|
Hide internal data
|
2018-01-26 23:05:32 +00:00 |
|
paboyle
|
85771e97e9
|
Hide internal data
|
2018-01-26 23:04:46 +00:00 |
|
paboyle
|
87ee592176
|
Pragma changes and layout and warning elimination for nvcc
|
2018-01-24 13:14:09 +00:00 |
|
paboyle
|
d74c21a386
|
Global edit for QCD namespace removal & NAMESPACE macros
|
2018-01-15 09:37:58 +00:00 |
|
paboyle
|
eda4fd9912
|
Namespace
|
2018-01-14 23:49:11 +00:00 |
|
paboyle
|
041d9137c0
|
Namespace
|
2018-01-14 23:48:27 +00:00 |
|