paboyle
8e99264f40
Accelerator markup of entire tensor space for offload
2018-01-24 13:27:30 +00:00
paboyle
c037244874
Tensor reformatted with NAMESPACE too
2018-01-13 00:31:02 +00:00
paboyle
3cbe974eb4
Layout
2016-10-20 16:55:21 +01:00
Christopher Kelly
85ed8175cb
Implemented mixed precision CG. Fixed filelist to exclude lib/Old directory and include Config.h.
2016-07-06 15:57:04 -04:00
paboyle
8fd8bc25e9
simd 5th dim with rotation
2016-04-19 15:39:00 -07:00
Peter Boyle
c9fadf97a5
Simplify the compressor interface again.
2016-02-17 18:16:45 -06:00
Peter Boyle
c650bb3f3d
Very small merge speed up.
2016-02-16 18:41:53 -06:00
paboyle
fc6ad65751
Pushed the overlap comms tweaks
2016-01-11 06:34:22 -08:00
paboyle
aae8bf31a7
Global edit adding copyright and license info to every source file.
2016-01-02 14:51:32 +00:00
paboyle
145a295231
Bug fix for stencil with large shifts (3+); would be important to the Naik term, for example, but did not impact Wilson-based nearest-neighbour stencils.
2015-12-30 19:29:48 +00:00
Peter Boyle
955b482aaf
Partial optimisation of the extraction/merger of simd vecs.
2015-11-06 05:26:20 -06:00
Peter Boyle
d1afebf71e
Sizable improvement in multigrid for unsquared.
...
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01
Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
neo
6e5db0b1da
Corrected bug in integer multiplications for SSE4 and AVX2
...
Merge remote-tracking branch 'upstream/master'
Conflicts:
tests/Make.inc
2015-06-16 23:34:45 +09:00
Azusa Yamaguchi
ef97692622
Handle case of simd_layout not filling whole vector.
...
Useful if real and complex live on the same grid
2015-06-14 00:55:21 +01:00
Peter Boyle
1d0df449e8
Reorganisation of file naming
2015-06-03 12:47:05 +01:00