1
0
mirror of https://github.com/paboyle/Grid.git synced 2026-04-20 18:51:02 +01:00
Commit Graph

822 Commits

Author SHA1 Message Date
Peter Boyle bfb68e6f02 Merge pull request #130 from giltirn/gparity-handunroll
Gparity handunroll
2017-09-21 10:11:00 +01:00
Christopher Kelly 59bd1fe21b Fix for 'perm' and 'local' not being set for hand-unrolled external-site Dslash, which caused incorrect behavior of G-parity kernel 2017-08-29 13:07:37 -07:00
Christopher Kelly 74af885d4e Removed some no-longer-needed associated with G-parity hand unrolled kernel 2017-08-29 09:50:37 -04:00
paboyle 80c5bce5bb Merge branch 'develop' into feature/multi-communicator 2017-08-25 20:21:26 +01:00
paboyle f68b5de9c8 No compile fix on Clang 2017-08-25 19:35:21 +01:00
Christopher Kelly f365a83fae In G-parity unrolled kernel, replaced calls to permute and exchange with run-time-evaluated permute type with explicit calls to appropriate underlying functions 2017-08-25 14:24:11 -04:00
Peter Boyle c289699d9a updated from cambridge mpi3 shakeout 2017-08-25 11:41:01 +01:00
Peter Boyle c3b1263e75 Benchmark prep 2017-08-25 09:25:54 +01:00
Christopher Kelly 34a9aeb331 Reduced number of if-statement evaluations in G-parity unrolled kernel 2017-08-24 13:53:50 -07:00
paboyle 5fa386ddc9 FFT test compile fixed 2017-08-24 10:17:52 +01:00
Christopher Kelly ce5df177ee Removed superfluous implementation of G-parity twist for hand-unrolled kernel from GparityWilsonImpl 2017-08-23 15:05:22 -04:00
Christopher Kelly a0bb8e5b46 Added hand-unrolled kernel implementations of all the other dslash precision / comms precision combinations with G-parity 2017-08-23 14:44:40 -04:00
Christopher Kelly 46f88e6d72 G-parity hand-unrolled intrinsics twist now uses one less permute and one less temporary 2017-08-23 13:21:10 -04:00
Christopher Kelly b61835c1a5 Added inplace version of intrinsic G-parity twist to hand-unrolled kernel 2017-08-23 12:33:48 -04:00
Azusa Yamaguchi d9cd4f0273 Staggered multinode block cg debugged. Missing global sum.
Code stalls and resumes on KNL at cambridge. Curious.

CG iterations 23ms each, then 3200 ms pauses. Mean bandwidth reports
as 200MB/s. Comms dominant in the report. However, the time behaviour suggests it
is *bursty*.... Could be swap to disk?
2017-08-23 15:07:18 +01:00
Christopher Kelly 061e48fd73 Replaced slow unpack-repack in G-parity BC twist with intrinsics version 2017-08-22 18:12:12 -04:00
Christopher Kelly ab50145001 Implemented first, unoptimized version of hand-unrolled G-parity kernels
Improved Test_gparity
2017-08-22 17:12:25 -04:00
paboyle a446d95c33 Trying to pass TeamCity and Travis 2017-08-20 01:10:50 +01:00
paboyle be66e7dd95 Merge branch 'develop' into feature/multi-communicator 2017-08-19 23:12:38 +01:00
Peter Boyle 7d88198387 Merge branch 'develop' into feature/multi-communicator 2017-08-19 13:03:35 -04:00
Guido Cossu 4fe182e5a7 Added high level HMC support for overriding default SIMD lane decomposition 2017-08-06 10:46:19 +01:00
Peter Boyle 14d53e1c9e Threaded MPI calls patches 2017-07-29 13:08:10 -04:00
Guido Cossu c0485d799d Explicit parameter declaration in the WilsonGauge test 2017-07-26 16:26:04 +01:00
Guido Cossu 7abc5613bd Added smearing to the topological charge observable 2017-07-26 16:21:17 +01:00
Christopher Kelly 0f214ad427 Moved FourierAcceleratedGaugeFixer into Grid::QCD namespace and removed 'using namespace' directives from header 2017-07-21 11:13:51 -04:00
Guido Cossu 8859a151cc Small corrections to the NEON port 2017-06-29 11:30:29 +01:00
portelli 07de925127 minor scalar action fixes 2017-06-28 12:45:44 +01:00
portelli 7d7220cbd7 scalar: lambda/4! convention 2017-06-27 14:38:45 +01:00
paboyle 54e94360ad Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit 2017-06-24 23:10:24 +01:00
portelli 0af740dc15 minor scalar HMC code improvement 2017-06-24 23:04:05 +01:00
portelli d2e8372df3 SU(N) algebra fix (was not working) 2017-06-24 23:03:39 +01:00
portelli b22eab8c8b Merge commit 'a7d56523abee6c9030fdd9303c79954897b1086f' into feature/hadrons 2017-06-21 18:32:48 +01:00
paboyle 0486ff8e79 Improved the lancos 2017-06-20 18:46:01 +01:00
portelli 1e8a2e1621 various compatibility fixes after merge 2017-06-20 17:24:55 +01:00
portelli 7587df831a Merge branch 'develop' into feature/hadrons
# Conflicts:
#	lib/qcd/action/scalar/ScalarImpl.h
2017-06-20 15:50:39 +01:00
Azusa Yamaguchi 0a8faac271 Fix make tests compile 2017-06-19 22:54:18 +01:00
Azusa Yamaguchi 3fa5e3109f Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2017-06-19 14:01:44 +01:00
paboyle c85024683e Merge branch 'feature/parallelio' into develop 2017-06-19 01:39:48 +01:00
paboyle 1300b0b04b Update to enable multiple records per file more consistent with SciDAC.
open, close, write records...
2017-06-19 01:01:48 +01:00
paboyle ae39ec85a3 ComplexField defined 2017-06-18 00:12:48 +01:00
paboyle 46879e1658 Complex defined in Impl even for gauge. 2017-06-18 00:11:45 +01:00
portelli 81b18f843a Merge branch 'feature/scalar_adjointFT' into feature/hadrons
# Conflicts:
#	lib/qcd/action/scalar/ScalarImpl.h
2017-06-16 17:59:55 +01:00
paboyle 3bfd1f13e6 I/O improvements 2017-06-11 23:14:10 +01:00
Azusa Yamaguchi 70ab598c96 Move gfix into utils 2017-06-08 22:22:23 +01:00
portelli f6aa82b7f2 Merge branch 'develop' into feature/hadrons 2017-06-06 11:46:33 -05:00
portelli 0503c028be Merge branch 'feature/qed-fvol' into feature/hadrons (non-trivial conflicts on scalar Impl)
# Conflicts:
#	configure.ac
#	lib/qcd/action/scalar/Scalar.h
2017-06-05 16:37:47 -05:00
Guido Cossu 7da4856e8e Wilson flow with adaptive steps 2017-06-02 16:55:53 +01:00
Guido Cossu aaf1e33a77 Adding adaptive integration in the WilsonFlow 2017-06-02 16:32:35 +01:00
paboyle 094c3d091a Improved and RNG's now survive checkpoint 2017-06-02 00:38:58 +01:00
Guido Cossu 7c6cc85df6 Updating WilsonFlow test 2017-05-27 18:03:49 +01:00