263dcbabab
Simplify the comms benchmark
2019-07-30 22:51:04 +01:00
0ba3d469c7
Benchmark IO in single and double precision
2018-10-17 20:27:34 +01:00
291bc2a1f0
IO benchmark on a list of directories
2018-10-15 17:25:08 +01:00
a15a2dfd29
Merge branch 'develop' into feature/hadrons
2018-08-10 16:08:22 +01:00
27cdb79063
Sha used to seed from a unique string
2018-08-10 15:11:01 +01:00
00b92a91b5
Optimising
2018-07-28 23:46:22 +01:00
65533741f7
7 moms
2018-07-28 16:17:47 +01:00
131a6785d4
Merge branch 'feature/hadrons-a2a' into feature/hadrons-a2a
2018-07-27 23:03:42 +01:00
44f4f5c8e2
Momentum loop
2018-07-27 23:00:16 +01:00
2679df034f
Changes to meson field benchmark. Now includes the gammas in the final part of the naive method, both methods compute lhs^dag*Gamma*rhs (previously Gamma*lhs^dag*rhs), and checks results.
2018-07-27 18:31:10 +01:00
71e1006ba8
Updated meson field benchmark for dirac structures
2018-07-26 09:09:29 +01:00
24128ff109
Changes needed for MF benchmark to work with comms correctly
2018-07-23 15:51:37 +01:00
ec9939c1ba
Test for faster implementation of meson field inner loop
It should be possible to cache-block this at outer levels. The global sum across nodes is not performed here;
it is deferred to the caller, who can block them all into one big all-reduce.
Nc=3 and Fermion are hard coded in an ugly way. We might think about benchmarking whether
a product without the conjugate should be made available by Grid.
It is not clear whether the explicit unroll or performing the conjugate on the left only once
was the real source of the speed-up.
Gives 70-80 GF/s on my laptop in single precision, half that in double, and 70 GB/s to cache.
This is competitive with dslash and a reasonable stopping point for the optimisation. If necessary we can revisit.
2018-07-10 12:38:51 +01:00
bfbf2f1fa0
no threaded stencil benchmark if OpenMP is not supported
2018-05-03 16:20:01 +01:00
1dddd17e3c
Benchmark improvements from tesseract
2018-04-27 11:44:46 +01:00
fa0d8feff4
Performance of CovariantCshift now non-embarrassing.
2018-04-26 17:56:27 +01:00
05b44aef6b
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
Conflicts:
benchmarks/Benchmark_su3.cc
2018-04-26 15:38:49 +01:00
91a0a3f820
Improvement
2018-04-26 14:48:35 +01:00
8f44c799a6
Saving the benchmarking tests for Cshift
2018-04-26 14:48:03 +01:00
43f5a0df50
More timers in the integrator
2018-04-26 12:01:56 +09:00
2baf193031
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2018-04-25 00:14:03 +01:00
362ba0443a
Cshift updates
2018-04-25 00:12:11 +01:00
c5b9147b53
Correction of a minor bug in the su3 benchmark
2018-04-24 08:03:57 -07:00
a1be533329
Corrected Flop count in Benchmark su3 and expanded the Wilson flow output
2018-04-24 01:19:53 -07:00
b5510427f9
physical fermion interface, cshift benchmark in SU3.
2018-04-18 01:43:29 +01:00
276f113f28
IO uses master boss node for metadata.
2018-03-30 16:17:05 +01:00
ab6afd18ac
Still compile if no LIME
2018-03-30 13:39:20 +01:00
c5a885dcd6
I/O benchmark
2018-03-29 19:57:41 +01:00
fb24e3a7d2
Adding utilities for perf profiling
2018-01-29 11:11:45 +01:00
cff3bae155
Adding support for general Nc in the benchmark outputs
2018-01-25 13:46:31 +01:00
9b32d51cd1
Simplify comms layer proliferation
2018-01-08 11:27:14 +00:00
4f8b6f26b4
Merge branch 'develop' into feature/dwf-multirhs
2017-10-02 11:41:49 +01:00
bfb68e6f02
Merge pull request #130 from giltirn/gparity-handunroll
Gparity handunroll
2017-09-21 10:11:00 +01:00
17c5b0f152
Patching comparison point
2017-09-16 18:18:07 +01:00
b331be9101
Better reporting
2017-08-31 11:32:57 +01:00
49c20a9fa8
Patch to reporting
2017-08-31 11:32:21 +01:00
7359df3501
Full reporting for benchmark; save robustness factor
2017-08-31 10:42:35 +01:00
d36d2fb40d
Added ability to override default Ls in Benchmark_dwf
2017-08-28 06:53:56 -07:00
5b9267e88d
Cleaner comms benchmark treatment for one node runs
2017-08-27 18:24:48 -04:00
15fd4003ef
Improving presentation of results
2017-08-27 13:46:02 +01:00
ad89abb018
Fix
2017-08-25 20:43:37 +01:00
80c5bce5bb
Merge branch 'develop' into feature/multi-communicator
2017-08-25 20:21:26 +01:00
d0f3d525d5
Optimal block size for KNL
2017-08-25 19:33:54 +01:00
3a58217405
Updated
2017-08-25 14:29:53 +01:00
c289699d9a
updated from cambridge mpi3 shakeout
2017-08-25 11:41:01 +01:00
c3b1263e75
Benchmark prep
2017-08-25 09:25:54 +01:00
edabb3577f
Imported Benchmark_gparity
2017-08-23 16:54:06 -04:00
ae56e556c6
finalise issue on new OPA revert
2017-08-20 02:53:12 +01:00
383ca7d392
Switch off comms for now until feature/multi-communicator is merged
2017-08-20 01:27:48 +01:00
a446d95c33
Trying to pass TeamCity and Travis
2017-08-20 01:10:50 +01:00