Peter Boyle
|
5720ced0fd
|
Simplifying
|
2019-06-04 21:30:08 +01:00 |
|
Peter Boyle
|
2c87b56b53
|
Making GPU happier
|
2019-06-04 21:29:44 +01:00 |
|
Peter Boyle
|
dbad48d802
|
Remove Ls vectorised DWF
|
2019-06-04 21:27:40 +01:00 |
|
Peter Boyle
|
4557a1365a
|
Remove Ls vectorised DWF
|
2019-06-04 20:59:59 +01:00 |
|
Peter Boyle
|
16e9b87d98
|
Remove Ls vectorised DWF as unused and hard to maintain
|
2019-06-04 20:59:01 +01:00 |
|
Peter Boyle
|
685eea3d0f
|
Small cosmetic
|
2019-06-04 20:58:14 +01:00 |
|
Peter Boyle
|
65b48831fb
|
Simplify code
|
2019-06-04 20:56:30 +01:00 |
|
Peter Boyle
|
57396fc595
|
Simplify code
|
2019-06-04 20:56:23 +01:00 |
|
Peter Boyle
|
a2e199df50
|
Simplifying Cayley cases.
|
2019-06-04 20:54:52 +01:00 |
|
Peter Boyle
|
020346c848
|
Work list. Will have to clean up Fermion sector.
|
2019-06-04 20:54:00 +01:00 |
|
Peter Boyle
|
c2625a127e
|
Non blocking loop. Want to change the naming here.
|
2019-06-04 20:52:59 +01:00 |
|
Peter Boyle
|
8794d35c78
|
GPU
|
2019-06-04 20:52:27 +01:00 |
|
Peter Boyle
|
24bff6dbe6
|
Minor improvements
|
2019-06-04 20:51:48 +01:00 |
|
Peter Boyle
|
45b15d10d3
|
GPU happy changes
|
2019-06-04 20:49:16 +01:00 |
|
Peter Boyle
|
33d6bbe32b
|
GPU must use accelerator vectors
|
2019-06-04 20:48:52 +01:00 |
|
Peter Boyle
|
7a1569bd46
|
Annoying, cannot rely on equivalence of Grid ComplexD and Eigen Complex type on GPU.
Solve with ComplexD typecasts, but there must be a better way
|
2019-06-04 20:47:49 +01:00 |
|
Peter Boyle
|
6e2e904a0e
|
NVCC compiles happy. Start to develop strategy for writing generic
code for GPU kernels and CPU kernels.
|
2019-06-04 20:46:35 +01:00 |
|
Peter Boyle
|
d92a17f359
|
Suppress NVCC warnings in pugixml with pragma
|
2019-06-04 20:45:53 +01:00 |
|
Peter Boyle
|
47c063f984
|
Remove Ls Vec cases from benchmarks
|
2019-06-04 20:45:35 +01:00 |
|
Peter Boyle
|
7e27a5213a
|
Tests build clean.
|
2019-06-04 20:45:20 +01:00 |
|
Peter Boyle
|
ade4a126da
|
Getting closer on the GPU port, but will start deleting 5th dim vectorised variants
for code maintainability
|
2019-06-04 11:53:44 +01:00 |
|
Peter Boyle
|
7b59ab5bd7
|
Compiling after reorganisation
|
2019-06-03 15:46:26 +01:00 |
|
Peter Boyle
|
fcd8cfe257
|
Gparity in
|
2019-06-03 15:45:09 +01:00 |
|
Peter Boyle
|
b4b53812cb
|
Move implementation to specific implementation headers
|
2019-06-03 15:43:01 +01:00 |
|
Peter Boyle
|
085cac583f
|
Implementation in header
|
2019-06-03 15:42:36 +01:00 |
|
Peter Boyle
|
25e3b8640c
|
Move to header
|
2019-06-03 15:42:05 +01:00 |
|
Peter Boyle
|
44bbec50b0
|
Making GPU compile happy
|
2019-06-03 14:57:04 +01:00 |
|
Peter Boyle
|
ec68b67d5d
|
Attempt at unified GPU and CPU kernel
|
2019-06-03 14:55:51 +01:00 |
|
Peter Boyle
|
778450e0c8
|
Move to implementation subdir
|
2019-06-03 14:53:56 +01:00 |
|
Peter Boyle
|
567aa5f366
|
Move to implementation subdir
|
2019-06-03 14:53:33 +01:00 |
|
Peter Boyle
|
2ab7e2b175
|
Force instantiation in .cc files.
Eventually move into multiple files
|
2019-06-03 14:52:59 +01:00 |
|
Peter Boyle
|
6f61be044d
|
Don't instantiate in header
|
2019-06-03 14:52:01 +01:00 |
|
Peter Boyle
|
269e00509e
|
Don't instantiate in header
|
2019-06-03 14:51:24 +01:00 |
|
Peter Boyle
|
a5e90b0ddc
|
Making the kernels more GPU happy
|
2019-06-03 14:50:54 +01:00 |
|
Peter Boyle
|
5622faf226
|
pragma once ifdef guard
|
2019-06-03 14:50:26 +01:00 |
|
Peter Boyle
|
82ecd520c7
|
Macos happy fix under nvcc
|
2019-06-03 14:48:50 +01:00 |
|
Peter Boyle
|
ffde81f22a
|
Nsimd() and coalesced support
|
2019-05-25 12:44:07 +01:00 |
|
Peter Boyle
|
d8098f1ecd
|
coalesced support
|
2019-05-25 12:43:31 +01:00 |
|
Peter Boyle
|
aca788cf4f
|
Move coalesced read into tensors
|
2019-05-25 12:43:00 +01:00 |
|
Peter Boyle
|
a0e9f3b0a0
|
Plan for GPU port
|
2019-05-20 09:46:19 +01:00 |
|
Peter Boyle
|
a9342c6ae5
|
Update TODO after gianluc merge
|
2019-05-18 22:58:25 +01:00 |
|
Peter Boyle
|
ee6f96d85c
|
Merge pull request #210 from grid-test-organisation/feature/gpu-port-develop
Cayley fermion functions for GPUs
|
2019-05-18 19:06:20 +01:00 |
|
Peter Boyle
|
4e9df9e93c
|
GPU patches
|
2019-05-18 17:43:11 +01:00 |
|
Peter Boyle
|
9fe68857a9
|
Runs multiGPU with coalesced access on tesseract
|
2019-05-18 17:42:41 +01:00 |
|
Peter Boyle
|
37336c9e0c
|
Allow compress to be either vector or scalar types
|
2019-05-18 17:41:13 +01:00 |
|
Peter Boyle
|
6c4da3bbc7
|
Stencil now runs with coalesced accesses
|
2019-05-18 17:40:35 +01:00 |
|
Peter Boyle
|
a584b16c4a
|
Adding a non-blocking kernel launch
|
2019-05-18 17:39:54 +01:00 |
|
gfilaci
|
1a82533d22
|
fix inner product with thrust reduction
|
2019-05-14 15:35:54 +01:00 |
|
gfilaci
|
e3c56fd9b3
|
CayleyZeroCounters before benchmark loop
|
2019-05-13 15:52:00 +01:00 |
|
gfilaci
|
955cc7790f
|
MooeeInvDag offloaded to GPU
|
2019-05-13 14:25:29 +01:00 |
|