mirror of https://github.com/paboyle/Grid.git synced 2026-05-02 08:24:12 +01:00
Commit Graph

611 Commits

Author SHA1 Message Date
Peter Boyle 8cfd5d2639 Need lattice view 2020-06-03 09:11:28 -04:00
Peter Boyle 1c9f20b15e Views must be closed 2020-06-03 09:10:29 -04:00
Peter Boyle 32237895bd Reorg memory manager for O(1) hash table 2020-06-03 09:09:52 -04:00
Peter Boyle 1d252d0922 Accelerator inline 2020-05-28 11:45:25 -04:00
Peter Boyle 006cc8a8f1 Staggered move to accelerator 2020-05-28 08:33:06 -04:00
Peter Boyle ee63721bad int unhappiness sycl fix 2020-05-25 08:36:24 -07:00
Peter Boyle 22c5168d70 Sycl happier 2020-05-25 08:35:56 -07:00
Peter Boyle 949ac3cd24 Must avoid non-trivial copy constructors 2020-05-25 08:35:28 -07:00
Peter Boyle 7bc0166c1c SYCL making happy - must avoid non-trivial copy constructors 2020-05-25 08:34:19 -07:00
Peter Boyle cb0d1b3399 hopefully fix build fail 2020-05-24 21:27:00 -04:00
Peter Boyle d1f1ccc705 HIP changes 2020-05-24 21:18:49 -04:00
Peter Boyle c7519a237a Assertions fail on HIP for unknown reasons - debugging 2020-05-24 14:02:47 -04:00
Peter Boyle 32be2b13d3 Updates for HiP 2020-05-24 14:00:55 -04:00
Peter Boyle 92b342a477 Hip reduction too 2020-05-24 13:50:28 -04:00
Peter Boyle 556da86ac3 HIP fp16 2020-05-24 13:41:58 -04:00
Peter Boyle 8285e41574 View location / access mode 2020-05-21 16:14:41 -04:00
Peter Boyle 7860a50f70 Make view specify where and drive data motion - first cut.
This is a compile time option --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
Peter Boyle ebb60330c9 Automatic data motion options beginning 2020-05-17 16:34:25 -04:00
Peter Boyle a9847aa866 Dependence fix 2020-05-12 20:03:37 -04:00
Peter Boyle d24d8e8398 Use X-direction as more bits meaningful on CUDA.
2^31-1 should always be enough for SIMD and thread reduced local volume

e.g. 32*2^31 = 2^36 = (2^9)^4 or 512^4 is big enough.

Where 32 is gpu_threads * Nsimd = 8*4
2020-05-12 10:35:49 -04:00
Peter Boyle 07c0c02f8c Speed up Cshift 2020-05-11 17:02:01 -04:00
Peter Boyle 8c31c065b5 Keep the Vector fixed to protect it from realloc 2020-05-11 17:00:30 -04:00
Peter Boyle bbbee5660d First compile on HiP 2020-05-10 05:28:09 -04:00
Peter Boyle 52081acfa5 NVCC compile fixes 2020-05-08 13:14:12 -04:00
Peter Boyle f8b8e00090 Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
Aim to reduce the amount of cuda and other code variations floating around all over the place.

Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
Peter Boyle 28a1fcaaff First compile against SYCL 2020-05-05 11:13:27 -07:00
u37294 04927d2e40 SYCL prep - no sycl just make it compile through DPC++ 2020-05-04 10:28:29 -07:00
u37294 7caed4edd9 dpc++ didn't like rdtsc() 2020-05-04 10:27:05 -07:00
Peter Boyle 9b2d2d0fc3 Basis rotate stack passing to GPU reduction 2020-04-30 12:31:07 -04:00
Peter Boyle 5011753f4f Clean up warning 2020-04-30 10:23:48 -04:00
Peter Boyle dd3ebc2ce4 Slow compile on NVCC switch off conserved current 2020-04-29 08:43:12 -04:00
Peter Boyle c2c3cad20d Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2020-04-23 04:35:42 -04:00
Peter Boyle edec9ee2e2 Conserved current rewrite done. Zmobius working 2020-04-23 04:34:01 -04:00
Christopher Kelly 0896f2cead Added missing include guards in bigfloat_double.h 2020-04-20 10:30:38 -04:00
Christopher Kelly 181709bba4 Merge branch 'develop' into feature/zmobius_paramcompute 2020-04-20 09:12:34 -04:00
Peter Boyle 90229cfb0f Merge pull request #270 from milc-qcd/feature/CGinfo
feature/CGinfo
2020-04-16 11:46:08 -04:00
Peter Boyle 0475c46ecb Merge pull request #256 from djm2131/feature/BiCGSTAB
Import BiCGSTAB solvers and tests
2020-04-16 11:45:15 -04:00
Peter Boyle f3a8d039a2 Merge branch 'feature/hdcr' into develop 2020-04-10 22:01:52 -04:00
Peter Boyle 3b0e07882f Adding another form of polynomial 2020-04-10 11:28:33 -04:00
Peter Boyle 8e81a811d0 Merge branch 'feature/hdcr' into develop 2020-04-10 11:14:49 -04:00
Peter Boyle aa13118127 Missing conjugate already fixed in develop 2020-04-10 11:11:24 -04:00
Peter Boyle 6cdb09c884 Faster copy region 2020-04-10 11:10:52 -04:00
Peter Boyle a65bc64f10 Accelerator peek poke 2020-04-10 11:09:59 -04:00
Peter Boyle 11dec4883c Don't throw assert 2020-04-10 11:09:11 -04:00
Peter Boyle afa458c812 Extra solvers 2020-04-10 11:08:19 -04:00
Peter Boyle dc50190b8f Faster GPU basis rotation
May need to later include Regensburg optimised CPU variant
2020-04-10 11:06:04 -04:00
Carleton DeTar 165c68e28e Change TrueResiduals to TrueResidualShift and IterationsToComplete to IterationsToCompleteShift 2020-02-29 17:51:51 -06:00
Carleton DeTar 9479bc8486 Make IterationsToComplete and TrueResidual externally accessible 2020-02-19 17:43:57 -06:00
Peter Boyle 8a5c13d5fb Still fast moving in changes 2020-02-06 17:57:26 -05:00
Peter Boyle bdccb0c91f Working 2 types of decomposition 2020-02-06 17:26:55 -05:00