8cfd5d2639
Need lattice view
2020-06-03 09:11:28 -04:00
1c9f20b15e
Views must be closed
2020-06-03 09:10:29 -04:00
32237895bd
Reorg memory manager for O(1) hash table
2020-06-03 09:09:52 -04:00
1d252d0922
Accelerator inline
2020-05-28 11:45:25 -04:00
006cc8a8f1
Staggered move to accelerator
2020-05-28 08:33:06 -04:00
ee63721bad
int unhappiness sycl fix
2020-05-25 08:36:24 -07:00
22c5168d70
Sycl happier
2020-05-25 08:35:56 -07:00
949ac3cd24
Must avoid non-trivial copy constructors
2020-05-25 08:35:28 -07:00
7bc0166c1c
Making SYCL happy - must avoid non-trivial copy constructors
2020-05-25 08:34:19 -07:00
cb0d1b3399
Hopefully fix build fail
2020-05-24 21:27:00 -04:00
d1f1ccc705
HIP changes
2020-05-24 21:18:49 -04:00
c7519a237a
Assertions fail on HIP for unknown reasons - debugging
2020-05-24 14:02:47 -04:00
32be2b13d3
Updates for HIP
2020-05-24 14:00:55 -04:00
92b342a477
Hip reduction too
2020-05-24 13:50:28 -04:00
556da86ac3
HIP fp16
2020-05-24 13:41:58 -04:00
8285e41574
View location / access mode
2020-05-21 16:14:41 -04:00
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile-time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
ebb60330c9
Automatic data motion options beginning
2020-05-17 16:34:25 -04:00
a9847aa866
Dependence fix
2020-05-12 20:03:37 -04:00
d24d8e8398
Use X-direction as more bits are meaningful on CUDA.
...
2^31-1 should always be enough for SIMD and thread reduced local volume
e.g. 32*2^31 = 2^36 = (2^9)^4 or 512^4 is big enough.
Where 32 is gpu_threads * Nsimd = 8*4
2020-05-12 10:35:49 -04:00
07c0c02f8c
Speed up Cshift
2020-05-11 17:02:01 -04:00
8c31c065b5
Keep the Vector fixed to protect it from realloc
2020-05-11 17:00:30 -04:00
bbbee5660d
First compile on HIP
2020-05-10 05:28:09 -04:00
52081acfa5
NVCC compile fixes
2020-05-08 13:14:12 -04:00
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
28a1fcaaff
First compile against SYCL
2020-05-05 11:13:27 -07:00
04927d2e40
SYCL prep - no sycl just make it compile through DPC++
2020-05-04 10:28:29 -07:00
7caed4edd9
dpc++ didn't like rdtsc()
2020-05-04 10:27:05 -07:00
9b2d2d0fc3
Basis rotate stack passing to GPU reduction
2020-04-30 12:31:07 -04:00
5011753f4f
Clean up warning
2020-04-30 10:23:48 -04:00
dd3ebc2ce4
Slow compile on NVCC; switch off conserved current
2020-04-29 08:43:12 -04:00
c2c3cad20d
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2020-04-23 04:35:42 -04:00
edec9ee2e2
Conserved current rewrite done. Zmobius working
2020-04-23 04:34:01 -04:00
0896f2cead
Added missing include guards in bigfloat_double.h
2020-04-20 10:30:38 -04:00
181709bba4
Merge branch 'develop' into feature/zmobius_paramcompute
2020-04-20 09:12:34 -04:00
90229cfb0f
Merge pull request #270 from milc-qcd/feature/CGinfo
...
feature/CGinfo
2020-04-16 11:46:08 -04:00
0475c46ecb
Merge pull request #256 from djm2131/feature/BiCGSTAB
...
Import BiCGSTAB solvers and tests
2020-04-16 11:45:15 -04:00
f3a8d039a2
Merge branch 'feature/hdcr' into develop
2020-04-10 22:01:52 -04:00
3b0e07882f
Adding another form of polynomial
2020-04-10 11:28:33 -04:00
8e81a811d0
Merge branch 'feature/hdcr' into develop
2020-04-10 11:14:49 -04:00
aa13118127
Missing conjugate already fixed in develop
2020-04-10 11:11:24 -04:00
6cdb09c884
Faster copy region
2020-04-10 11:10:52 -04:00
a65bc64f10
Accelerator peek poke
2020-04-10 11:09:59 -04:00
11dec4883c
Don't throw assert
2020-04-10 11:09:11 -04:00
afa458c812
Extra solvers
2020-04-10 11:08:19 -04:00
dc50190b8f
Faster GPU basis rotation
...
May need to later include Regensburg optimised CPU variant
2020-04-10 11:06:04 -04:00
165c68e28e
Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift
2020-02-29 17:51:51 -06:00
9479bc8486
Make IterationsToComplete and TrueResidual externally accessible
2020-02-19 17:43:57 -06:00
8a5c13d5fb
Still fast moving in changes
2020-02-06 17:57:26 -05:00
bdccb0c91f
Working 2 types of decomposition
2020-02-06 17:26:55 -05:00