1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-15 02:05:37 +00:00
Commit Graph

224 Commits

Author SHA1 Message Date
Peter Boyle
7d88198387 Merge branch 'develop' into feature/multi-communicator 2017-08-19 13:03:35 -04:00
Peter Boyle
9e658de238 Use Vector 2017-08-19 12:52:44 -04:00
Peter Boyle
14d53e1c9e Threaded MPI calls patches 2017-07-29 13:08:10 -04:00
Peter Boyle
40e119c61c NUMA improvements worth preserving from AMD EPYC tests 2017-07-08 22:27:11 -04:00
Peter Boyle
b73bd151bb Switch off counters by default 2017-06-30 10:16:35 +01:00
Peter Boyle
694b305cab Update to reporting 2017-06-30 10:16:13 +01:00
paboyle
6f5a5cd9b3 Improved threaded comms benchmark 2017-06-28 23:27:02 +01:00
Peter Boyle
08e04b9676 Better benchmarks 2017-06-28 15:30:06 +01:00
paboyle
54e94360ad Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit 2017-06-24 23:10:24 +01:00
paboyle
6ebf9f15b7 Splitting communicators first cut 2017-06-22 08:14:34 +01:00
paboyle
3bfd1f13e6 I/O improvements 2017-06-11 23:14:10 +01:00
Peter Boyle
725c513d94 Better MPI3 benchmarking 2017-05-29 16:47:32 -04:00
Guido Cossu
0ffc235741 Adding more statistics to the Benchmark_comms. Min and max 2017-05-19 10:55:04 +01:00
Guido Cossu
8e19c99c7d Adding more statistical info in the Benchmark_comms 2017-05-18 19:07:35 +01:00
Guido Cossu
a0bc0ad06f Reverting change in Bechmark_comms. Keeping 300 iterations 2017-05-18 17:48:11 +01:00
Guido Cossu
bc862ce3ab Fixing an allocation issue in Benchmark_comms 2017-05-18 14:44:56 +01:00
paboyle
751f2b9703 Better check and benchmark driving 2017-05-05 19:54:38 +01:00
Guido Cossu
20999c1370 Merge branch 'develop' into feature/hmc_generalise 2017-05-05 12:47:17 +01:00
Peter Boyle
945767c6d8 More info 2017-05-03 20:26:35 -04:00
Peter Boyle
92e364a35f Better reporting in benchmark for MPI3 2017-05-03 15:43:36 -04:00
Guido Cossu
4063238943 Adding HMC test file example for Mobius + smearing 2017-05-01 13:44:00 +01:00
Guido Cossu
3344788fa1 Merge branch 'develop' into feature/hmc_generalise 2017-05-01 12:13:56 +01:00
paboyle
738c1a11c2 longer nloop 2017-04-26 08:43:20 +01:00
paboyle
ab66bac4e6 Think I'm getting on top of the reduced cost exterior precomputed list of links 2017-04-25 08:50:26 +01:00
paboyle
c429ace748 Cleaner OpenMP use 2017-04-22 20:28:42 +01:00
Peter Boyle
1d1b225497 Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node). 2017-04-22 09:05:28 -04:00
paboyle
fc4ab9ccd5 Working half precision comms 2017-04-20 11:20:26 +01:00
Guido Cossu
8c540333d5 Merge branch 'develop' into feature/hmc_generalise 2017-04-05 14:41:04 +01:00
paboyle
f18f5ed926 Drop random device 2017-04-02 00:26:26 +09:00
paboyle
4b17e8eba8 Merge branch 'develop' into feature/bgq-asm
Conflicts:
	lib/qcd/action/fermion/Fermion.h
	lib/qcd/action/fermion/WilsonFermion.cc
	lib/util/Init.cc
	tests/Test_cayley_even_odd_vec.cc
2017-03-28 04:49:30 -04:00
paboyle
18bde08d1b Merge branch 'feature/staggering' into develop 2017-03-28 15:25:55 +09:00
paboyle
e099dcdae7 Merge branch 'develop' into feature/bgq-asm 2017-02-23 00:25:29 +00:00
azusayamaguchi
1c30e9a961 Verified 2017-02-21 23:01:25 +00:00
paboyle
3ae92fa2e6 Global changes to parallel_for structure.
Move the comms flags to more sensible names
2017-02-21 05:24:27 -05:00
paboyle
1a30455a10 1000 iters on bmark for more accurate timing 2017-02-20 17:47:01 -05:00
paboyle
aca7a3ef0a Optimisation control improvements 2017-02-10 18:22:31 -05:00
Guido Cossu
8b6a6c8236 Resolving small merge conflict 2017-02-09 16:20:24 +00:00
Guido Cossu
e0571c872b Merge branch 'develop' into feature/hmc_generalise 2017-02-09 16:12:00 +00:00
paboyle
2bf4688e83 Running on BNL KNL 2017-02-07 01:32:10 -05:00
paboyle
060da786e9 Comms benchmark improvements 2017-02-07 01:07:39 -05:00
Guido Cossu
17629b8d9e Merge branch 'develop' into feature/hmc_generalise 2017-01-25 11:33:53 +00:00
a37e71f362 New automatic implementation of gamma matrices, Meson and SeqGamma are broken 2017-01-23 19:13:43 -08:00
azusayamaguchi
05c1924819 Timing loop change 2017-01-23 10:43:45 +00:00
Peter Boyle
55cb22ad67 Z mobius bmark 2016-12-18 00:55:37 +00:00
Guido Cossu
0bd296dda4 Adding check of the Dag part in the benchmark 2016-12-14 03:15:09 +00:00
Peter Boyle
ff71a8e847 Ready for sim 2016-12-08 17:00:32 +00:00
Peter Boyle
e27c6b217c Updating 2016-12-01 12:42:53 +00:00
Peter Boyle
cd01c1dbe9 Ls 16 more relevant 2016-11-30 22:11:10 +00:00
paboyle
bd0430b34f Serialisation in malloc fixed 2016-11-29 22:27:55 +00:00
Azusa Yamaguchi
c097fd041a Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-11-29 13:44:17 +00:00
Azusa Yamaguchi
389e0a77bd Staggerd Fermion 5D 2016-11-29 13:13:56 +00:00
paboyle
2f92b4860b Test the full Mooee sector 2016-11-29 00:15:08 +00:00
Azusa Yamaguchi
95f43d27ae Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-11-22 13:49:22 +00:00
Azusa Yamaguchi
668ca57702 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-11-22 13:49:11 +00:00
433afd36f5 Makefile rule for simple_* objects 2016-11-19 01:33:13 +01:00
042ae5b87c generic 256bits SIMD 2016-11-15 12:16:15 +00:00
paboyle
33dc1f51b5 Final sign off commits from Cori-1 2016-11-09 04:11:03 -08:00
Azusa Yamaguchi
ee686a7d85 Compiles now 2016-11-03 16:58:23 +00:00
Azusa Yamaguchi
1c5b7a6be5 Staggered phases first cut, c1, c2, u0 2016-11-03 16:26:56 +00:00
paboyle
757a928f9a Improvement to use own SHM_OPEN call to avoid openmpi bug. 2016-11-02 12:37:46 +00:00
paboyle
bb94ddd0eb Tidy up of mpi3; also some cleaning of the dslash controls. 2016-11-02 08:07:09 +00:00
paboyle
791cb050c8 Comms improvements 2016-11-01 11:35:43 +00:00
azusayamaguchi
b6a65059a2 Update to use shared memory to contain the stencil comms buffers
Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions
2016-10-24 17:30:43 +01:00
azusayamaguchi
c190221fd3 Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
paboyle
a762b1fb71 MPI3 working with a bounce through shared memory on my laptop.
Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the
send between ranks on same node.
2016-10-21 09:03:26 +01:00
azusayamaguchi
81f2aeaece KNL streaming stores, and KNL performance coutners 2016-10-12 11:45:22 +01:00
Guido Cossu
2e453dfbf5 Added some instrumentation to benchmark the force computation 2016-10-06 17:52:45 +01:00
paboyle
4089984431 Timing hooks 2016-10-06 09:25:12 +01:00
Guido Cossu
0fd179fb33 Merge branch 'develop' into feature/hirep 2016-09-01 12:59:53 +01:00
Guido Cossu
fd5614738d Merge branch 'develop' into feature/hirep 2016-08-30 18:21:36 +01:00
paboyle
5a68715be3 Richards sweep test 2016-08-05 10:51:57 +01:00
paboyle
32bc7a6ab8 MPI back out of change that hangs
AVX2 for clang, gcc needs the -mfma flag.
2016-08-05 10:36:00 +01:00
b65e72e521 Merge pull request #43 from rprollins/bench/output-format
Benchmark_dwf_sweep and Benchmark_zmm output formats
2016-08-04 16:47:01 +01:00
629283726b build system: local Grid link flag moved to configure.ac 2016-08-03 15:07:42 +01:00
9e5b934d21 improved LAPACK configuration 2016-08-02 17:26:54 +01:00
e9f30cab2c first working version for the new build system 2016-07-30 17:53:18 +01:00
Richard Rollins
df6c9f55d1 Use common benchmark output format for dwf_sweep and zmm 2016-07-20 17:38:56 +01:00
paboyle
f4dd5062d7 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2016-07-15 19:26:06 +01:00
paboyle
9db2c6525d updating benchmarks for red black 4d for Ls vectorised code 2016-07-14 23:44:02 +01:00
paboyle
ef97e32152 Adding persistent communicators 2016-07-08 17:16:08 +01:00
Guido Cossu
5028969d4b Added generators for the adjoint representation 2016-07-08 15:40:11 +01:00
paboyle
a0676beeb1 Open up dependency on Eigen and FFTW 2016-07-07 22:31:07 +01:00
Guido Cossu
fdfbf11c6d Merge branch 'develop' into temporary-smearing 2016-07-04 18:45:10 +01:00
Guido Cossu
9cb90f714e Merge remote-tracking branch 'origin/develop' into temporary-smearing 2016-07-04 17:28:40 +01:00
paboyle
bfe14000a9 Double compile fix 2016-07-01 16:33:51 +01:00
paboyle
680645f849 Merge branch 'release/v0.5.0' 2016-06-30 15:15:03 -07:00
paboyle
2d8bb4c594 Tweaks 2016-06-30 14:35:01 -07:00
paboyle
51cb2d4328 update file lists 2016-06-30 14:35:01 -07:00
paboyle
6d58cb2a68 Enable reordering of the loops in the assembler for cache friendly.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
Guido Cossu
565e9329ba Changed the colouring classes 2016-06-30 16:51:03 +01:00
Guido Cossu
5e02392f9c Fixed compilation error for benchmark_dwf
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
paboyle
55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
paboyle
05acc22920 placeholder for non temporal loads optimisation 2016-06-07 13:18:21 -07:00
paboyle
8ac021de73 Added a test an fixed it for red black precon Ls innermost vectorised DWF 2016-06-07 13:16:56 -07:00
paboyle
786ca52c43 Problems remain in the red black preconditioning of the Ls vectorisation 2016-06-06 07:05:51 -07:00
paboyle
53d06046b0 Compiling updates for KNL 2016-06-03 03:47:54 -07:00
paboyle
139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
paboyle
f2ae9682ff Remove some timing hacks 2016-04-19 15:14:32 -07:00
paboyle
528eb773ad Merged.
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-19 22:24:34 +01:00
paboyle
c323425496 Small change 2016-04-11 10:38:43 +01:00