Christopher Kelly
|
ce5df177ee
|
Removed superfluous implementation of G-parity twist for hand-unrolled kernel from GparityWilsonImpl
|
2017-08-23 15:05:22 -04:00 |
|
Christopher Kelly
|
a0bb8e5b46
|
Added hand-unrolled kernel implementations of all the other dslash precision / comms precision combinations with G-parity
|
2017-08-23 14:44:40 -04:00 |
|
Christopher Kelly
|
46f88e6d72
|
G-parity hand-unrolled intrinsics twist now uses one less permute and one less temporary
|
2017-08-23 13:21:10 -04:00 |
|
Christopher Kelly
|
b61835c1a5
|
Added inplace version of intrinsic G-parity twist to hand-unrolled kernel
|
2017-08-23 12:33:48 -04:00 |
|
Christopher Kelly
|
061e48fd73
|
Replaced slow unpack-repack in G-parity BC twist with intrinsics version
|
2017-08-22 18:12:12 -04:00 |
|
Christopher Kelly
|
ab50145001
|
Implemented first, unoptimized version of hand-unrolled G-parity kernels
Improved Test_gparity
|
2017-08-22 17:12:25 -04:00 |
|
|
7587df831a
|
Merge branch 'develop' into feature/hadrons
# Conflicts:
# lib/qcd/action/scalar/ScalarImpl.h
|
2017-06-20 15:50:39 +01:00 |
|
paboyle
|
46879e1658
|
Complex defined in Impl even for gauge.
|
2017-06-18 00:11:45 +01:00 |
|
|
0503c028be
|
Merge branch 'feature/qed-fvol' into feature/hadrons (non-trivial conflicts on scalar Impl)
# Conflicts:
# configure.ac
# lib/qcd/action/scalar/Scalar.h
|
2017-06-05 16:37:47 -05:00 |
|
Guido Cossu
|
9c12c37aaf
|
Confirming the fix on the complex boundary conditions
|
2017-05-09 08:41:29 +01:00 |
|
paboyle
|
529e78d43f
|
Restart the v0.7.0 release
|
2017-05-08 18:20:04 +01:00 |
|
paboyle
|
2439999ec8
|
Warning elimination; drop to -O2 on bad G++ versions
|
2017-05-06 14:44:49 +01:00 |
|
paboyle
|
1d96f662e3
|
Fixed 4d fermion gparity force. Put strong tests on make check force tests
|
2017-05-06 00:46:31 +01:00 |
|
Guido Cossu
|
20999c1370
|
Merge branch 'develop' into feature/hmc_generalise
|
2017-05-05 12:47:17 +01:00 |
|
paboyle
|
78ef10e60f
|
Mobius force improvement
|
2017-05-04 19:53:21 +01:00 |
|
paboyle
|
90f6bc16bb
|
No compile clang fix
|
2017-05-04 12:15:06 +01:00 |
|
Peter Boyle
|
422cdf4979
|
Some checks
|
2017-05-03 18:37:38 -04:00 |
|
Peter Boyle
|
38db174f3b
|
Print statement
|
2017-05-03 18:25:26 -04:00 |
|
Guido Cossu
|
4063238943
|
Adding HMC test file example for Mobius + smearing
|
2017-05-01 13:44:00 +01:00 |
|
Guido Cossu
|
3344788fa1
|
Merge branch 'develop' into feature/hmc_generalise
|
2017-05-01 12:13:56 +01:00 |
|
Peter Boyle
|
99220f6531
|
Fixes and better timing
|
2017-04-26 17:24:11 -04:00 |
|
Peter Boyle
|
f8797e1e3e
|
bug fix. works now and great face performance
|
2017-04-26 03:14:02 -04:00 |
|
Peter Boyle
|
fd1eb7de13
|
Clean implementation of the exterior faces listing only those points on the boundary
|
2017-04-26 02:34:52 -04:00 |
|
Peter Boyle
|
2ce898efa3
|
Pretty code
|
2017-04-26 02:34:25 -04:00 |
|
paboyle
|
ab66bac4e6
|
Think I'm getting on top of the reduced cost exterior precomputed list of links
|
2017-04-25 08:50:26 +01:00 |
|
paboyle
|
56277a11c8
|
Build a list of what's on the surface
|
2017-04-24 17:06:15 +01:00 |
|
Peter Boyle
|
5b55867a7a
|
Slightly cheaper Ext assembly
|
2017-04-24 05:36:11 -04:00 |
|
Peter Boyle
|
3accb1ef89
|
Debugged assembly split phase with interior suppression
|
2017-04-23 19:30:19 -04:00 |
|
Peter Boyle
|
e3d0e31525
|
Debugged assembly split phase with interior suppression
|
2017-04-23 19:29:27 -04:00 |
|
Peter Boyle
|
5812eb8a8c
|
Partially fixed. But the comms-overlap does not work yet.
|
2017-04-22 18:50:25 -04:00 |
|
paboyle
|
ac58565d0a
|
Dangerous rewrite of the assembly. If I make a mistake the debug will be painful.
|
2017-04-22 19:31:04 +01:00 |
|
paboyle
|
b722889234
|
Try a better load balancing loop
|
2017-04-22 19:27:41 +01:00 |
|
paboyle
|
abba44a837
|
Hand unrolled for overlapped comms
|
2017-04-22 17:45:17 +01:00 |
|
paboyle
|
f301be94ce
|
Fixed
|
2017-04-22 17:42:31 +01:00 |
|
Peter Boyle
|
1d1b225497
|
Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node).
|
2017-04-22 09:05:28 -04:00 |
|
Peter Boyle
|
53a785a3dd
|
Fixing the KNL compile
|
2017-04-22 08:11:51 -04:00 |
|
paboyle
|
736bf3c866
|
Major rework of stencil. Half precision and MPI3 now working.
|
2017-04-22 11:33:50 +01:00 |
|
paboyle
|
fc4ab9ccd5
|
Working half precision comms
|
2017-04-20 11:20:26 +01:00 |
|
paboyle
|
4a340aa5ca
|
Massive compressor rework to support reduced precision comms
|
2017-04-20 09:28:27 +01:00 |
|
|
a6a0da873f
|
Merge branch 'feature/hadrons' into feature/qed-fvol
|
2017-04-13 15:31:06 +01:00 |
|
paboyle
|
42fb49d3fd
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2017-04-13 14:12:47 +01:00 |
|
|
8ef4300412
|
spurious .dirstamp files removed
|
2017-04-10 17:00:22 +01:00 |
|
paboyle
|
db5f6d3ae3
|
Verbose fix
|
2017-04-09 23:41:30 +09:00 |
|
paboyle
|
86aaa35294
|
Christoph needs SchurDiagTwoKappa which is mobius specific.
|
2017-04-07 11:07:40 +09:00 |
|
Guido Cossu
|
8c540333d5
|
Merge branch 'develop' into feature/hmc_generalise
|
2017-04-05 14:41:04 +01:00 |
|
paboyle
|
1c4bc7ed38
|
Debugged staggered conventions
|
2017-03-31 14:41:48 +09:00 |
|
paboyle
|
9fd23faadf
|
Pretty layout
|
2017-03-30 13:44:45 +09:00 |
|
paboyle
|
10e4fa0dc8
|
Template instantiation improvements
|
2017-03-30 13:44:25 +09:00 |
|
paboyle
|
c4aca1dde4
|
Conjugate coefficients on adjoint
|
2017-03-30 13:44:05 +09:00 |
|
paboyle
|
b9e8ea3aaa
|
conjugate coefficient on the dagger
|
2017-03-30 13:43:13 +09:00 |
|