Michael Marshall
76af169f05
Add global namespace to Writer<T> and Reader<T> inside GRID_SERIALIZABLE_CLASS_MEMBERS (so that "using Grid" is not necessary).
...
Fix issue with output of Grid::iMatrix so that M<3>{{148,149,150,} {151,152,153,} {154155156}} becomes M<3>{{148,149,150} {151,152,153} {154,155,156}}
2021-05-31 08:43:02 +01:00
u61464
15ae317858
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2021-05-04 08:40:38 -07:00
u61464
834f536b5f
Fastest option on SYCL is now std::complex
2021-05-04 08:40:18 -07:00
Peter Boyle
e947992957
Improved force terms
2021-03-29 20:04:06 +02:00
Peter Boyle
a76cb005e0
Update Tensor_exp.h
2021-03-08 13:37:57 -05:00
u61464
679d1d22f7
Sycl happier
2021-03-03 11:21:43 -08:00
Peter Boyle
f9b1f240f6
Better SIMD usage/coalescence
2021-02-26 17:51:41 +01:00
Peter Boyle
99445673f6
Gparity fix, and plaquette IO
2021-01-14 21:00:36 -05:00
Peter Boyle
5adae5d6ff
Remove unused variable
2020-11-19 19:22:12 +01:00
Peter Boyle
cc9c993f74
Project on group fix on GPU tracked to reciprocal sqrt collision between CUDA and Grid rsqrt
2020-10-31 18:12:47 -04:00
Peter Boyle
6c31b99f1f
I knew coupling Eigen Tensor to Grid serialisation was a bad idea.
...
Now that the complex type is different on GPU, it creates problems
2020-08-31 23:49:19 -04:00
Christoph Lehner
968a90633a
Zero -> zeroit in Tensor_index
2020-07-31 02:07:17 -04:00
Peter Boyle
936c5ecf69
Reduction GPU no compile fix
2020-06-24 17:28:31 -04:00
Peter Boyle
cdf0a04fc5
Merge branch 'develop' into sycl
2020-06-09 04:00:12 -04:00
Peter Boyle
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
Peter Boyle
28a1fcaaff
First compile against SYCL
2020-05-05 11:13:27 -07:00
Christoph Lehner
ddb192bac7
re-work double precision promotion for summit
2020-04-30 16:09:57 -04:00
Christoph Lehner
f1fe444d4f
blocked precision promotion infrastructure upgrade
2020-04-24 06:27:20 -04:00
Christoph Lehner
091d5c605e
towards more precise blocking
2020-04-17 04:25:28 -04:00
Peter Boyle
b473405652
Tensor ambiguous fix
2019-08-29 09:36:41 -05:00
Peter Boyle
28d6be2a4e
Fix GCC complaint
2019-08-22 18:56:37 +01:00
Peter Boyle
96ac56cace
Double precision variants for summation accuracy
2019-08-14 13:08:01 +01:00
Peter Boyle
a23dc295ac
Remove compiler errors and warnings
2019-07-18 14:47:02 +01:00
Peter Boyle
08904f830e
Merge develop
2019-07-16 11:59:56 +01:00
Peter Boyle
fa9cd50c5b
Merge branch 'develop' into feature/gpu-port
2019-07-16 11:55:17 +01:00
Peter Boyle
d6ffadb33b
Coalesced write
2019-07-02 17:25:13 +01:00
Peter Boyle
b8f7bfbb26
Don't stream, as performance is poor in some cases
2019-07-01 07:30:25 +01:00
Peter Boyle
d976e5c514
Pow is being awkward in thrust for reasons I don't understand. Possible thrust bug.
2019-06-16 12:05:11 +01:00
Peter Boyle
b285138be4
Better checking on types
2019-06-15 08:27:48 +01:00
Peter Boyle
29a244e423
Test of using a lane variable instead of repeated reference to threadIdx.y
2019-06-08 13:46:26 +01:00
Peter Boyle
0ee6e77cbc
Compiles GPU and CPU, still gives good performance on CPU
2019-06-05 13:28:16 +01:00
Peter Boyle
8794d35c78
GPU
2019-06-04 20:52:27 +01:00
Peter Boyle
6e2e904a0e
NVCC compiles happy. Start to develop strategy for writing generic
...
code for GPU kernels and CPU kernels.
2019-06-04 20:46:35 +01:00
Peter Boyle
ffde81f22a
Nsimd() and coalesced support
2019-05-25 12:44:07 +01:00
Peter Boyle
d8098f1ecd
coalesced support
2019-05-25 12:43:31 +01:00
Michael Marshall
12d8bf1ced
Eigen::Tensor serialisation. Tested on single and double precision builds
2019-03-20 22:27:41 +00:00
91cffef883
Updates after review with Peter.
2019-03-07 14:30:35 +00:00
b7db99967a
Recommendations for Traits classes
2019-02-28 20:06:59 +00:00
Peter Boyle
e73b909a48
Make tests run past nvcc. Different NVCC versions proving tricky to keep happy. This is 9.2
2019-01-02 12:05:30 +00:00
Peter Boyle
8c91e82ee8
GPU clean up, remove parallel_for. Split into accelerator_loop, thread_loop
...
cases, and collides with parallel_for in thrust
2019-01-01 15:06:46 +00:00
Peter Boyle
422764757d
Updates in tests to make all of Grid compile
2018-12-14 16:55:54 +00:00
Peter Boyle
b57a4d32aa
Merge branch 'develop' into feature/gpu-port
2018-12-13 05:11:34 +00:00
fb7d021b9d
Hadrons: moving Hadrons to root directory, build system improvements
2018-08-28 15:00:40 +01:00