mirror of https://github.com/paboyle/Grid.git synced 2025-10-25 10:09:34 +01:00
Commit Graph

3804 Commits

Author SHA1 Message Date
paboyle
db988301d0 Introduce view objects for indexing lattices. Used to pass the view to accelerators 2018-03-04 15:55:16 +00:00
paboyle
9b1f29c4c2 Support a view for passing to accelerator 2018-03-04 15:54:35 +00:00
paboyle
e5ea04ee0c Need to support precision change, and real replication in multiple simd lanes 2018-03-04 15:53:04 +00:00
paboyle
c92a3c6068 Need to support any vector type template and run on accelerator 2018-03-04 15:52:14 +00:00
paboyle
03f8da8fbc enable-debug option for debug flags in compile 2018-03-04 15:51:47 +00:00
paboyle
78a9e31ff0 options more obvious 2018-02-24 22:26:32 +00:00
paboyle
c1fc947bb8 Coordinate handling GPU friendly + some GPU merge/extract improvements 2018-02-24 22:26:10 +00:00
paboyle
ff7b19a71b Coordinate handling GPU ready avoid malloc 2018-02-24 22:25:39 +00:00
paboyle
1c16ffa1c1 Coordinate GPU ready. No malloc 2018-02-24 22:25:09 +00:00
paboyle
4962f59477 Eliminate both GPU issue and threading bottleneck by avoiding malloc in coordinate handling 2018-02-24 22:24:37 +00:00
paboyle
e158b60bce GPU friendly coords 2018-02-24 22:23:47 +00:00
paboyle
34820bec27 Coordinate handling GPU ready. No malloc 2018-02-24 22:23:18 +00:00
paboyle
eed9aa9f0c Extract merge gpu ready 2018-02-24 22:23:01 +00:00
paboyle
8792ff6439 Coordinate handling gpu ready 2018-02-24 22:22:43 +00:00
paboyle
078901278c Coordinate handling gpu friendly 2018-02-24 22:22:02 +00:00
paboyle
bf5fb89aff Coordinate handling GPU friendly 2018-02-24 22:21:36 +00:00
paboyle
7574c18cef Massive clean up extract merge. Simpler and GPU friendly 2018-02-24 22:21:08 +00:00
paboyle
36ea5f6b77 gpu friendly coordinates ; no std::vector on GPU 2018-02-24 22:20:14 +00:00
paboyle
285deab432 Coordinate handling GPU friendly. Avoid std::vector 2018-02-24 22:19:28 +00:00
paboyle
bb7d87d0a0 Coordinate handling gpu friendly 2018-02-24 22:18:33 +00:00
paboyle
b9b5bdfc3a Proper offload (accelerator access) will require a mutable copy lambda. 2018-02-02 11:38:19 +00:00
paboyle
51eb2c5dfc Make referencing the stencil and all info required to evaluate the kernel accelerator marked up 2018-02-02 11:37:13 +00:00
paboyle
ede0dff794 Mark up as an accelerator function 2018-02-02 11:36:44 +00:00
paboyle
aa6de818e2 Copy data needed by Kernels out of the grid object to avoid host reference 2018-02-02 11:36:11 +00:00
paboyle
dcf6517a93 Accelerator offload and copy Opt into the kernel for GPU host var safety 2018-02-02 11:35:35 +00:00
paboyle
a308dff410 accelerator loop, copy Opt into the GPU 2018-02-02 11:34:37 +00:00
paboyle
14ba20898a Accelerator loop the key kernel call 2018-02-02 11:30:07 +00:00
paboyle
a53d3ee19a Add Opt to the lambda capture to get it into the GPU 2018-02-02 11:28:39 +00:00
paboyle
5df435319d Use constexpr 2018-02-02 11:27:56 +00:00
paboyle
0da2d3e222 accelerator off load some more stuff 2018-02-02 11:27:35 +00:00
paboyle
9c9dfbfa78 Force accelerator 2018-02-02 11:25:09 +00:00
paboyle
e4df025d01 Accelerator related 2018-02-01 23:20:05 +00:00
paboyle
cfeda9d536 constexpr on const ints 2018-02-01 22:59:12 +00:00
paboyle
4450b1993a Offload 2018-02-01 22:45:47 +00:00
paboyle
d03ce5c2a4 Provide a way to get around std::vector for a known type on device. Use template specialisation to access a private member in the Clang++ STL implementation 2018-02-01 22:44:25 +00:00
paboyle
7d6522c1ef Accelerator inline 2018-02-01 22:43:56 +00:00
paboyle
b96832a922 Accelerator inline 2018-02-01 22:43:26 +00:00
paboyle
5d7af47b05 accelerator_inline 2018-02-01 22:42:54 +00:00
paboyle
053ef25c90 constexpr makes GPU happy 2018-02-01 22:42:29 +00:00
paboyle
8ae77d3706 Small simplification of FermionOperatorImpl towards GPU but not there yet 2018-02-01 22:41:54 +00:00
paboyle
79b50feacf fixme updates 2018-01-29 16:00:40 +00:00
paboyle
c67c1544cd abs no compile on travis fix attempt 2018-01-28 10:26:04 +00:00
paboyle
e657f9a344 OMP collapse changes to make NVCC happy 2018-01-28 01:21:53 +00:00
paboyle
b6ebf35af5 Intel compiler doesn't like Nvidia error disable pragmas 2018-01-28 01:03:10 +00:00
paboyle
604c05f4b8 parallel_for elimination -> thread_loop 2018-01-28 01:01:36 +00:00
paboyle
70e276e1ab parallel_for elimination -> thread_loop 2018-01-28 01:01:14 +00:00
paboyle
9472b02771 Parallel_for elimination -> thread_loop. 2018-01-28 01:00:55 +00:00
paboyle
9597ab94eb Zero changes, swap on lattice type. 2018-01-27 23:51:40 +00:00
paboyle
ce4da83bc2 Zero changes, literally 2018-01-27 23:51:10 +00:00
paboyle
d557f3ef77 Zero changes (literally) and also a warning elimination 2018-01-27 23:50:43 +00:00