mirror of https://github.com/paboyle/Grid.git synced 2025-06-25 11:12:02 +01:00
Commit Graph

168 Commits

Author SHA1 Message Date
bd310932f7 changes 2020-04-09 16:32:31 +02:00
77fa586f6c introduced A64FX Wilson kernels 2020-04-09 13:30:06 +02:00
a2188ea875 remove debugging printf from WilsonKernelsImplementation 2020-03-26 09:12:36 -04:00
c9b737a4e7 make trace,adj,transpose unary operators 2020-03-16 17:58:30 -04:00
7c061e20c9 All directions of dirac operator for fast coarsening 2020-01-27 12:40:13 -05:00
e5d1c09665 Faster DhopDirAll for little dirac operator coarsening 2020-01-27 12:38:54 -05:00
8016a465ae Remove extraneous variable 2020-01-27 12:35:37 -05:00
e583035614 Change to interface to minimise comms in evaluating coarse space operator 2020-01-06 11:43:59 -05:00
3c3d6a94f3 Optimising the force term a bit 2020-01-04 03:16:23 -05:00
f7373e97a4 Missing conjugate in MooeeInvDag 2019-12-16 10:05:50 +01:00
848079e8ba Merge pull request #235 from grid-test-organisation/feature/5d-improvement
MooeeInv and M5D optimisations + enable threading with nvcc
2019-12-10 21:45:03 -05:00
9b6b0caa55 Junk commit fix 2019-12-09 03:01:58 -05:00
3d2fe80780 Temporary size depends on checkerboard/uncheckerboard. The Mdir cares 2019-12-09 02:58:24 -05:00
a7fa86dc29 MooeeInv improvement for DW EOFA + comments 2019-09-05 12:05:21 +01:00
fdd9b14e82 speed up MooeeInvDag for DWF EOFA 2019-09-02 14:49:51 +01:00
e66669d300 fast MooeeInv for EOFA 2019-09-02 14:26:13 +01:00
0efaf3c4fa access M5D coeffs through pointers 2019-09-02 11:33:00 +01:00
3ef519aaa4 fast MooeeInv 2019-09-02 11:18:14 +01:00
48e6efc7c9 Merge branch 'develop' into feature/gpu-port
Conflicts:
	Grid/qcd/action/fermion/WilsonKernelsAsm.cc
	Grid/qcd/action/fermion/implementation/ImprovedStaggeredFermionImplementation.h
	Grid/qcd/action/fermion/implementation/StaggeredKernelsAsm.h
	benchmarks/Benchmark_comms.cc
2019-08-14 18:56:54 +01:00
53e3ab4131 Fix force term 2019-08-11 11:06:13 +01:00
1282e1067f Do the force term on the accelerator too. Needed particularly because comms buffers
are device memory.
2019-07-29 22:58:35 +01:00
fe700a183a Getting HMC to run 2019-07-26 12:18:29 +01:00
fa9cd50c5b Merge branch 'develop' into feature/gpu-port 2019-07-16 11:55:17 +01:00
bd155ca5c0 Overlap comms with compute now supported 2019-07-12 09:09:40 +01:00
d7b3efe893 Compile fix 2019-06-15 17:03:15 +01:00
decc99ca76 Accelerator version 2019-06-15 12:43:00 +01:00
464cd65931 Still to test this fully 2019-06-15 12:35:14 +01:00
a1ec2f4723 Still to test this routine fully 2019-06-15 12:33:55 +01:00
ea9662ec85 Thread loop changes 2019-06-15 09:09:57 +01:00
52c74f1cac Thread loop changes 2019-06-15 09:08:16 +01:00
9a13d2992c lean up 2019-06-15 09:05:16 +01:00
b0449ae270 Thread loop changes 2019-06-15 09:04:19 +01:00
1299225105 Accelerator loop changes 2019-06-15 09:03:46 +01:00
5925e7f405 Thread for changes 2019-06-15 09:01:30 +01:00
36f06555a2 Simplify Impl 2019-06-09 22:26:27 +01:00
d6c0e0756d Remove GPU version 2019-06-09 11:23:42 +01:00
3e41b1055c Remove Gpu only kernels. 2019-06-09 11:20:01 +01:00
e78a5e7838 ASM instantiation without link errors 2019-06-09 01:25:21 +01:00
c933ac2248 Temporarily introduce a SIMT_loop to test out approaches prior to making a global change to
accelerator_loop
2019-06-08 13:44:27 +01:00
ad2c433574 Instantiations move. Tried using Gianluca's suggestion about avoiding threadIdx but doesn't
seem to make a difference. Will revisit this and probably remove the lane parameter from the coalescedRead
2019-06-08 13:43:12 +01:00
86e7fb6e86 Instantiation relocation 2019-06-08 13:42:46 +01:00
fb91dda7be Hand instantiation moved location 2019-06-08 13:42:26 +01:00
82cf7bc5ab Move instantiation into fermion/instantiation 2019-06-08 13:41:46 +01:00
e452cc0a22 Move static variables into instantiation .cc file 2019-06-08 13:41:20 +01:00
4d2b938166 Remove explicit instantiation from here 2019-06-08 13:41:01 +01:00
10d16ab76c Remove explicit instantiation from here 2019-06-08 13:40:32 +01:00
0ee6e77cbc Compiles GPU and CPU, still gives good performance on CPU 2019-06-05 13:28:16 +01:00
7323099966 Instantiation fix 2019-06-05 00:14:38 +01:00
6379651cdd Generic or GPU ready for benchmark test on GPU 2019-06-05 00:13:52 +01:00
ba4fd756b9 Fix signature, but deprecating this loops style 2019-06-05 00:12:36 +01:00