mirror of https://github.com/paboyle/Grid.git synced 2026-04-19 18:21:02 +01:00
Commit Graph

2322 Commits

Author SHA1 Message Date
paboyle 41009cc142 Move exchange into the stencil only; keep Cshift fully general 2017-02-20 17:48:04 -05:00
paboyle 37720c4db7 Count bytes off node only 2017-02-20 17:47:40 -05:00
paboyle 1a30455a10 1000 iters on bmark for more accurate timing 2017-02-20 17:47:01 -05:00
paboyle cd0da81196 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2017-02-16 18:52:30 -05:00
paboyle f246fe3304 Improvements to avx for invertible to avoid latent bug 2017-02-16 23:52:44 +00:00
paboyle 8a29c16bde Faster gather exchange 2017-02-16 23:52:22 +00:00
paboyle d68907fc3e Debug temp 2017-02-16 18:51:35 -05:00
paboyle 5c0adf7bf2 Make clang happy with parenthesis 2017-02-16 23:51:33 +00:00
paboyle be3a8249c6 Faster gather 2017-02-16 23:51:15 +00:00
paboyle bd600702cf Vectorise the XYZT face gathering better. Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent with efficiency. 2017-02-15 11:11:04 +00:00
paboyle aca7a3ef0a Optimisation control improvements 2017-02-10 18:22:31 -05:00
paboyle 2c246551d0 Overlap comms and compute options in wilson kernels 2017-02-07 01:37:10 -05:00
paboyle 71ac2e7940 Faster RNG init 2017-02-07 01:33:23 -05:00
paboyle 2bf4688e83 Running on BNL KNL 2017-02-07 01:32:10 -05:00
paboyle a48ee6f0f2 Don't use MPI3_leader any more. No real gain and complex 2017-02-07 01:31:24 -05:00
paboyle 73547cca66 MPI3 working i think 2017-02-07 01:30:02 -05:00
paboyle 123c673db7 Policy to control async or sync SendRecv 2017-02-07 01:24:54 -05:00
paboyle 61f82216e2 Communicator Policy, NodeCount distinct from Rank count 2017-02-07 01:22:53 -05:00
paboyle 8e7ca92278 Debugged cshift case 2017-02-07 01:21:32 -05:00
paboyle 485ad6fde0 Stencil working in SHM MPI3 2017-02-07 01:20:39 -05:00
paboyle 6ea2184e18 OMP define change 2017-02-07 01:17:16 -05:00
paboyle fdc170b8a3 Parallel fors in lattice transfer 2017-02-07 01:16:39 -05:00
paboyle 060da786e9 Comms benchmark improvements 2017-02-07 01:07:39 -05:00
paboyle 85c7bc4321 Bug fixes for cases that physics code couldn't hit but latent and discovered on KNL (long vector, y SIMD dir) and checker dir set to y. Remove the assertions on these code paths now they are tested. 2017-02-07 01:01:15 -05:00
paboyle 0883d6a7ce Overlap comms compute support; make reg naming consistent with bgq asm 2017-02-07 00:59:32 -05:00
paboyle 9ff97b4711 Improved stencil tests passing all on KNL multinode 2017-02-07 00:58:34 -05:00
paboyle b5e9c900a4 Better printing and signal handling options 2017-02-07 00:57:55 -05:00
paboyle 4bbdfb434c Overlap comms compute modifications 2017-02-07 00:57:01 -05:00
Peter Boyle c3b6d573b9 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2016-12-30 22:42:17 +00:00
Peter Boyle 1e179c903d Worried about integer; suspect where statements are broken 2016-12-27 17:46:38 +00:00
Peter Boyle 669cfca9b7 No inline 2016-12-27 17:45:40 +00:00
Peter Boyle ff2f559a57 Remove inline on gather optimised path 2016-12-27 17:45:19 +00:00
Peter Boyle 03c81bd902 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2016-12-27 11:25:35 +00:00
Peter Boyle a869addef1 Stats switch off 2016-12-27 11:25:22 +00:00
Peter Boyle 1caa3fbc2d LOCK UNLOCK only 2016-12-27 11:24:45 +00:00
Peter Boyle 3d21297bbb Call the fast path compressor for wilson kernels to avoid if else on projector 2016-12-27 11:23:13 +00:00
Peter Boyle 25efefc5b4 Back to original thread policy post test 2016-12-23 09:49:04 +00:00
Peter Boyle eabf316ed9 BGQ performance ASM 2016-12-22 21:56:08 +00:00
Peter Boyle 04ae7929a3 BGQ or KNL assembler now 2016-12-22 17:53:22 +00:00
Peter Boyle caba0d42a5 L1p controls 2016-12-22 17:52:55 +00:00
Peter Boyle 9ae81c06d2 L1p controls for BG/Q 2016-12-22 17:52:21 +00:00
Peter Boyle 0903c48caa Hot start SU3 2016-12-22 17:51:45 +00:00
Peter Boyle 7dc36628a1 QPX finishing 2016-12-22 17:50:48 +00:00
Peter Boyle b8cdb3e90a Debug hack; raises from 62GF/s to 72 GF/s per node on BG/Q 2016-12-22 17:50:14 +00:00
Peter Boyle 5241245534 Default to static scheduling 2016-12-22 17:49:21 +00:00
Dr Peter Boyle 960316e207 type conversion in printf 2016-12-22 17:27:01 +00:00
paboyle 3f2d53a994 BGQ assembler beginning 2016-12-20 10:21:26 +00:00
paboyle 8a337f3070 Move cayley into mainstream tests 2016-12-18 02:35:31 +00:00
paboyle a59f5374d7 Evade warning 2016-12-18 02:23:55 +00:00
paboyle 4b220972ac Warning fix 2016-12-18 02:14:17 +00:00