Peter Boyle
9aafd20468
Simple block project promote runs faster on GPU
2019-12-17 05:01:39 -05:00
Peter Boyle
9e15474999
Accelerator loop attempt at speed up
2019-12-14 05:28:16 -05:00
Peter Boyle
152b525a4d
Typo fix
2019-12-13 22:44:42 -05:00
Peter Boyle
d18994eddc
offload more of mgrid to GPU
2019-12-13 22:08:11 -05:00
Chris K
845a045493
Merge pull request #233 from giltirn/lanczos_fix
...
A few run / compile / memory leak fixes
2019-10-30 10:21:59 -04:00
Fionn O hOgain
5de9547db5
Removing old debug code
2019-10-08 15:51:28 +01:00
Christopher Kelly
114ebb7914
Fixed Lanczos calling aligned alloc in threaded region hitting up against pointer-cache no-threading restrictions
...
Fixed Lattice::reset not compiling with new Grid explicit memory region handling
Fixed memory leak in Lattice::resize that occurs when data region has been previously allocated
2019-08-26 16:47:44 -04:00
Peter Boyle
be37dfb6f8
Remove debug code
2019-08-15 01:31:40 +01:00
Peter Boyle
3e49dc8a67
Reduction finished and hopefully fixes CI regression fail on single precision and force
2019-08-14 15:18:34 +01:00
Peter Boyle
ce97638bac
Think the reduction is now sorted and cleaned up
2019-08-11 11:09:01 +01:00
Peter Boyle
9117f61109
GPU friendly
2019-07-31 01:22:54 +01:00
Peter Boyle
9dad7a0094
Reproducible reduction and axpy_norm offload from Gianluca.
...
Hopefully get CG running entirely on GPU
2019-07-30 00:14:12 +01:00
Peter Boyle
775eaee199
Fix for suspected Intel 2018.1 compiler bug under O3
2019-07-19 07:57:34 +01:00
Peter Boyle
d976e5c514
Pow is being awkward in thrust for reasons I don't understand. Possible thrust bug.
2019-06-16 12:05:11 +01:00
Peter Boyle
20359ca15f
Coalesced loops.
2019-06-15 08:03:57 +01:00
Peter Boyle
736358b0cb
Coalesced loops
2019-06-15 08:03:13 +01:00
Peter Boyle
6b692aa726
Thread loops
2019-06-15 08:02:26 +01:00
Peter Boyle
7f99e1cd3b
Coalesced loops
2019-06-15 08:01:39 +01:00
Peter Boyle
f3c89df948
Thread loop changes
2019-06-15 08:00:37 +01:00
Peter Boyle
b7e6d111d7
Thread loop changes. Need to offload this file
2019-06-15 07:59:10 +01:00
Peter Boyle
f39cf69c33
Accelerator loop change
2019-06-15 07:58:23 +01:00
Peter Boyle
8e27338df2
Rationalise number of loop macros
2019-06-15 07:57:40 +01:00
Peter Boyle
0ea7f5279d
Accelerator loop changes
2019-06-15 07:56:14 +01:00
Peter Boyle
18e5de426d
There is a stray use of predicatedWhere introduced by Andrew Lawson in the conserved currents.
...
The conserved currents need to be rewritten using data parallel operations.
2019-06-15 07:53:58 +01:00
Peter Boyle
e896d81235
Accelerator loop redefine. Coalesce most accesses, but ET engine still to go clean.
2019-06-15 07:52:44 +01:00
Peter Boyle
7b8ccff4f4
Accelerated coalesced loops in most cases
2019-06-15 07:48:00 +01:00
Peter Boyle
dc5024e88c
The GPU reduction was not working for me and causing errors. Need to revisit.
...
Gianluca is working on deterministic reduction.
2019-06-08 13:39:11 +01:00
gfilaci
1a82533d22
fix inner product with thrust reduction
2019-05-14 15:35:54 +01:00
Peter Boyle
204a090497
Inner product is not working on GPU. Why?
2019-04-28 07:31:56 +01:00
Peter Boyle
c5e081d69c
Re-Merge branch 'develop' into feature/gpu-port
...
Pull in Regensburg MultiGrid pull request
2019-01-03 01:50:16 +00:00
Peter Boyle
715babeac8
GPU reductions first cut; use thrust, non-reproducible. Inclusive scan can fix this if desired.
...
Local reduction to LatticeComplex and then further reduction.
2019-01-01 13:53:37 +00:00
Peter Boyle
422764757d
Updates in tests to make all of Grid compile
2018-12-14 16:55:54 +00:00
Peter Boyle
b57a4d32aa
Merge branch 'develop' into feature/gpu-port
2018-12-13 05:11:34 +00:00
Peter Boyle
33a0bbb17b
Const correctness
2018-11-19 11:27:57 +00:00
Peter Boyle
e9b6f58fdc
Allow shrinking machine in orthog direction for extract slice local
2018-11-07 23:39:18 +00:00
936eaac8e1
function to get the sha256 string
2018-10-08 19:00:50 +01:00
fb7d021b9d
Hadrons: moving Hadrons to root directory, build system improvements
2018-08-28 15:00:40 +01:00