Dr Peter Boyle
|
1dddd17e3c
|
Benchmark improvements from tesseract
|
2018-04-27 11:44:46 +01:00 |
|
Peter Boyle
|
fa0d8feff4
|
Performance of CovariantCshift now non-embarrassing.
|
2018-04-26 17:56:27 +01:00 |
|
Peter Boyle
|
05b44aef6b
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
Conflicts:
benchmarks/Benchmark_su3.cc
|
2018-04-26 15:38:49 +01:00 |
|
Peter Boyle
|
91a0a3f820
|
Improvement
|
2018-04-26 14:48:35 +01:00 |
|
Peter Boyle
|
8f44c799a6
|
Saving the benchmarking tests for Cshift
|
2018-04-26 14:48:03 +01:00 |
|
Guido Cossu
|
43f5a0df50
|
More timers in the integrator
|
2018-04-26 12:01:56 +09:00 |
|
paboyle
|
2baf193031
|
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
|
2018-04-25 00:14:03 +01:00 |
|
paboyle
|
362ba0443a
|
Cshift updates
|
2018-04-25 00:12:11 +01:00 |
|
Guido Cossu
|
c5b9147b53
|
Correction of a minor bug in the su3 benchmark
|
2018-04-24 08:03:57 -07:00 |
|
Guido Cossu
|
a1be533329
|
Corrected Flop count in Benchmark su3 and expanded the Wilson flow output
|
2018-04-24 01:19:53 -07:00 |
|
paboyle
|
b5510427f9
|
physical fermion interface, cshift benchmark in SU3.
|
2018-04-18 01:43:29 +01:00 |
|
paboyle
|
276f113f28
|
IO uses master boss node for metadata.
|
2018-03-30 16:17:05 +01:00 |
|
paboyle
|
ab6afd18ac
|
Still compile if no LIME
|
2018-03-30 13:39:20 +01:00 |
|
|
c5a885dcd6
|
I/O benchmark
|
2018-03-29 19:57:41 +01:00 |
|
Peter Boyle
|
6fe9b28a82
|
Cosmetic
|
2018-03-24 19:27:14 -04:00 |
|
Peter Boyle
|
b002587d7c
|
Simplify
|
2018-03-24 19:26:44 -04:00 |
|
Peter Boyle
|
6c08385782
|
Simplify
|
2018-03-24 19:26:19 -04:00 |
|
Peter Boyle
|
a3690071b4
|
Warm up GPU
|
2018-03-22 18:05:20 -04:00 |
|
Peter Boyle
|
5ac96dbdc6
|
Warm behaviour in SU3 benchmark
|
2018-03-20 07:18:31 -04:00 |
|
paboyle
|
aead94e9a7
|
View introduced
|
2018-03-04 16:39:29 +00:00 |
|
paboyle
|
36ea5f6b77
|
gpu friendly coordinates ; no std::vector on GPU
|
2018-02-24 22:20:14 +00:00 |
|
Guido Cossu
|
fb24e3a7d2
|
Adding utilities for perf profiling
|
2018-01-29 11:11:45 +01:00 |
|
paboyle
|
604c05f4b8
|
parallel_for elimination -> thread_loop
|
2018-01-28 01:01:36 +00:00 |
|
paboyle
|
ce4da83bc2
|
Zero changes, literally
|
2018-01-27 23:51:10 +00:00 |
|
paboyle
|
c4f82e072b
|
_grid becomes private ; use Grid()
|
2018-01-27 00:04:12 +00:00 |
|
paboyle
|
2a4a0e43c1
|
Hide internals
|
2018-01-26 23:08:27 +00:00 |
|
paboyle
|
f4010023ca
|
Warning fixes
|
2018-01-25 23:46:47 +00:00 |
|
paboyle
|
e7cba358c2
|
Temporary update to reflect the new dropping of std::vector in Lattice
Will update again to hide the internals in an interface
|
2018-01-25 23:31:41 +00:00 |
|
Guido Cossu
|
cff3bae155
|
Adding support for general Nc in the benchmark outputs
|
2018-01-25 13:46:31 +01:00 |
|
paboyle
|
918c105c57
|
NVCC warning elimination
|
2018-01-24 13:23:59 +00:00 |
|
paboyle
|
d74c21a386
|
Global edit for QCD namespace removal & NAMESPACE macros
|
2018-01-15 09:37:58 +00:00 |
|
paboyle
|
9b32d51cd1
|
Simplify comms layer proliferation
|
2018-01-08 11:27:14 +00:00 |
|
paboyle
|
4f8b6f26b4
|
Merge branch 'develop' into feature/dwf-multirhs
|
2017-10-02 11:41:49 +01:00 |
|
Peter Boyle
|
bfb68e6f02
|
Merge pull request #130 from giltirn/gparity-handunroll
Gparity handunroll
|
2017-09-21 10:11:00 +01:00 |
|
paboyle
|
17c5b0f152
|
Patching comparison point
|
2017-09-16 18:18:07 +01:00 |
|
Peter Boyle
|
b331be9101
|
Better reporting
|
2017-08-31 11:32:57 +01:00 |
|
Peter Boyle
|
49c20a9fa8
|
Patch to reporting
|
2017-08-31 11:32:21 +01:00 |
|
paboyle
|
7359df3501
|
Full reporting for benchmark; save robustness factor
|
2017-08-31 10:42:35 +01:00 |
|
Christopher Kelly
|
d36d2fb40d
|
Added ability to override default Ls in Benchmark_dwf
|
2017-08-28 06:53:56 -07:00 |
|
Peter Boyle
|
5b9267e88d
|
Cleaner comms benchmark treatment for one node runs
|
2017-08-27 18:24:48 -04:00 |
|
paboyle
|
15fd4003ef
|
Improving presentation of results
|
2017-08-27 13:46:02 +01:00 |
|
paboyle
|
ad89abb018
|
Fix
|
2017-08-25 20:43:37 +01:00 |
|
paboyle
|
80c5bce5bb
|
Merge branch 'develop' into feature/multi-communicator
|
2017-08-25 20:21:26 +01:00 |
|
Peter Boyle
|
d0f3d525d5
|
Optimal block size for KNL
|
2017-08-25 19:33:54 +01:00 |
|
Peter Boyle
|
3a58217405
|
Updated
|
2017-08-25 14:29:53 +01:00 |
|
Peter Boyle
|
c289699d9a
|
Updated from Cambridge MPI3 shakeout
|
2017-08-25 11:41:01 +01:00 |
|
Peter Boyle
|
c3b1263e75
|
Benchmark prep
|
2017-08-25 09:25:54 +01:00 |
|
Christopher Kelly
|
edabb3577f
|
Imported Benchmark_gparity
|
2017-08-23 16:54:06 -04:00 |
|
paboyle
|
ae56e556c6
|
finalise issue on new OPA revert
|
2017-08-20 02:53:12 +01:00 |
|
paboyle
|
383ca7d392
|
Switch off comms for now until feature/multi-communicator is merged
|
2017-08-20 01:27:48 +01:00 |
|
paboyle
|
a446d95c33
|
Trying to pass TeamCity and Travis
|
2017-08-20 01:10:50 +01:00 |
|
paboyle
|
be66e7dd95
|
Merge branch 'develop' into feature/multi-communicator
|
2017-08-19 23:12:38 +01:00 |
|
paboyle
|
bfef525ed2
|
New benchmark prep
|
2017-08-19 23:10:12 +01:00 |
|
Peter Boyle
|
7d88198387
|
Merge branch 'develop' into feature/multi-communicator
|
2017-08-19 13:03:35 -04:00 |
|
Peter Boyle
|
9e658de238
|
Use Vector
|
2017-08-19 12:52:44 -04:00 |
|
Peter Boyle
|
14d53e1c9e
|
Threaded MPI calls patches
|
2017-07-29 13:08:10 -04:00 |
|
Peter Boyle
|
40e119c61c
|
NUMA improvements worth preserving from AMD EPYC tests
|
2017-07-08 22:27:11 -04:00 |
|
Peter Boyle
|
b73bd151bb
|
Switch off counters by default
|
2017-06-30 10:16:35 +01:00 |
|
Peter Boyle
|
694b305cab
|
Update to reporting
|
2017-06-30 10:16:13 +01:00 |
|
paboyle
|
6f5a5cd9b3
|
Improved threaded comms benchmark
|
2017-06-28 23:27:02 +01:00 |
|
Peter Boyle
|
08e04b9676
|
Better benchmarks
|
2017-06-28 15:30:06 +01:00 |
|
paboyle
|
54e94360ad
|
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
|
2017-06-24 23:10:24 +01:00 |
|
paboyle
|
6ebf9f15b7
|
Splitting communicators first cut
|
2017-06-22 08:14:34 +01:00 |
|
paboyle
|
3bfd1f13e6
|
I/O improvements
|
2017-06-11 23:14:10 +01:00 |
|
Peter Boyle
|
725c513d94
|
Better MPI3 benchmarking
|
2017-05-29 16:47:32 -04:00 |
|
Guido Cossu
|
0ffc235741
|
Adding more statistics to the Benchmark_comms. Min and max
|
2017-05-19 10:55:04 +01:00 |
|
Guido Cossu
|
8e19c99c7d
|
Adding more statistical info in the Benchmark_comms
|
2017-05-18 19:07:35 +01:00 |
|
Guido Cossu
|
a0bc0ad06f
|
Reverting change in Benchmark_comms. Keeping 300 iterations
|
2017-05-18 17:48:11 +01:00 |
|
Guido Cossu
|
bc862ce3ab
|
Fixing an allocation issue in Benchmark_comms
|
2017-05-18 14:44:56 +01:00 |
|
paboyle
|
751f2b9703
|
Better check and benchmark driving
|
2017-05-05 19:54:38 +01:00 |
|
Guido Cossu
|
20999c1370
|
Merge branch 'develop' into feature/hmc_generalise
|
2017-05-05 12:47:17 +01:00 |
|
Peter Boyle
|
945767c6d8
|
More info
|
2017-05-03 20:26:35 -04:00 |
|
Peter Boyle
|
92e364a35f
|
Better reporting in benchmark for MPI3
|
2017-05-03 15:43:36 -04:00 |
|
Guido Cossu
|
4063238943
|
Adding HMC test file example for Mobius + smearing
|
2017-05-01 13:44:00 +01:00 |
|
Guido Cossu
|
3344788fa1
|
Merge branch 'develop' into feature/hmc_generalise
|
2017-05-01 12:13:56 +01:00 |
|
paboyle
|
738c1a11c2
|
longer nloop
|
2017-04-26 08:43:20 +01:00 |
|
paboyle
|
ab66bac4e6
|
Think I'm getting on top of the reduced cost exterior precomputed list of links
|
2017-04-25 08:50:26 +01:00 |
|
paboyle
|
c429ace748
|
Cleaner OpenMP use
|
2017-04-22 20:28:42 +01:00 |
|
Peter Boyle
|
1d1b225497
|
Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node).
|
2017-04-22 09:05:28 -04:00 |
|
paboyle
|
fc4ab9ccd5
|
Working half precision comms
|
2017-04-20 11:20:26 +01:00 |
|
Guido Cossu
|
8c540333d5
|
Merge branch 'develop' into feature/hmc_generalise
|
2017-04-05 14:41:04 +01:00 |
|
paboyle
|
f18f5ed926
|
Drop random device
|
2017-04-02 00:26:26 +09:00 |
|
paboyle
|
4b17e8eba8
|
Merge branch 'develop' into feature/bgq-asm
Conflicts:
lib/qcd/action/fermion/Fermion.h
lib/qcd/action/fermion/WilsonFermion.cc
lib/util/Init.cc
tests/Test_cayley_even_odd_vec.cc
|
2017-03-28 04:49:30 -04:00 |
|
paboyle
|
18bde08d1b
|
Merge branch 'feature/staggering' into develop
|
2017-03-28 15:25:55 +09:00 |
|
paboyle
|
e099dcdae7
|
Merge branch 'develop' into feature/bgq-asm
|
2017-02-23 00:25:29 +00:00 |
|
azusayamaguchi
|
1c30e9a961
|
Verified
|
2017-02-21 23:01:25 +00:00 |
|
paboyle
|
3ae92fa2e6
|
Global changes to parallel_for structure.
Move the comms flags to more sensible names
|
2017-02-21 05:24:27 -05:00 |
|
paboyle
|
1a30455a10
|
1000 iters on bmark for more accurate timing
|
2017-02-20 17:47:01 -05:00 |
|
paboyle
|
aca7a3ef0a
|
Optimisation control improvements
|
2017-02-10 18:22:31 -05:00 |
|
Guido Cossu
|
8b6a6c8236
|
Resolving small merge conflict
|
2017-02-09 16:20:24 +00:00 |
|
Guido Cossu
|
e0571c872b
|
Merge branch 'develop' into feature/hmc_generalise
|
2017-02-09 16:12:00 +00:00 |
|
paboyle
|
2bf4688e83
|
Running on BNL KNL
|
2017-02-07 01:32:10 -05:00 |
|
paboyle
|
060da786e9
|
Comms benchmark improvements
|
2017-02-07 01:07:39 -05:00 |
|
Guido Cossu
|
17629b8d9e
|
Merge branch 'develop' into feature/hmc_generalise
|
2017-01-25 11:33:53 +00:00 |
|
|
a37e71f362
|
New automatic implementation of gamma matrices, Meson and SeqGamma are broken
|
2017-01-23 19:13:43 -08:00 |
|
azusayamaguchi
|
05c1924819
|
Timing loop change
|
2017-01-23 10:43:45 +00:00 |
|
Peter Boyle
|
55cb22ad67
|
Z mobius bmark
|
2016-12-18 00:55:37 +00:00 |
|
Guido Cossu
|
0bd296dda4
|
Adding check of the Dag part in the benchmark
|
2016-12-14 03:15:09 +00:00 |
|
Peter Boyle
|
ff71a8e847
|
Ready for sim
|
2016-12-08 17:00:32 +00:00 |
|
Peter Boyle
|
e27c6b217c
|
Updating
|
2016-12-01 12:42:53 +00:00 |
|