mirror of https://github.com/paboyle/Grid.git synced 2026-04-27 05:56:01 +01:00
Commit Graph

2407 Commits

Author SHA1 Message Date
paboyle fc93f0b2ec Save some code for static huge tlb's. It is ifdef'ed out but an interesting root only experiment.
No gain from it.
2017-03-21 22:30:29 -04:00
paboyle 8c8473998d Average over whole cluster the comm time. 2017-03-21 22:29:51 -04:00
paboyle e7c36771ed ZMobius prep for asm 2017-03-15 14:23:33 -04:00
paboyle 8dc57a1e25 Layout change 2017-03-13 11:11:46 +00:00
paboyle f57bd770b0 Merge branch 'bugfix/dminus' into feature/bgq-asm 2017-03-13 11:11:03 +00:00
paboyle 4ed10a3d06 Merge branch 'develop' into feature/bgq-asm 2017-03-13 11:10:10 +00:00
Peter Boyle dfefc70b57 Merge pull request #93 from Lanny91/hotfix/qpx
Some fixes for QPX and generic SIMD types.
2017-03-13 09:31:26 +00:00
Chulwoo Jung 0b61f75c9e Adding ZMobius CG test 2017-03-13 00:12:43 -04:00
Chulwoo Jung 33edde245d Changing Dminus(Dag) to use full vectors to work correctly 2017-03-12 23:02:42 -04:00
paboyle b64e004555 MPI run fail on macos 2017-03-13 01:59:01 +00:00
paboyle 447c5e6cd7 Z mobius hermiticity correction 2017-03-13 01:30:43 +00:00
paboyle 8b99d80d8c Merge branch 'bgq-asm-shmemfixes' into feature/bgq-asm 2017-03-12 23:30:09 +00:00
paboyle 3901b17ade timings from BNL 2017-02-28 17:06:45 -05:00
paboyle af230a1fb8 Average the time across the whole machine for outliers 2017-02-28 17:05:22 -05:00
Christopher Kelly 06a132e3f9 Fixes to SHMEM comms 2017-02-28 13:31:54 -08:00
paboyle 96d44d5c55 Header fix 2017-02-24 19:12:11 -05:00
Lanny91 7fe797daf8 SIMD vector length sanity checks 2017-02-23 16:49:44 +00:00
Lanny91 486a01294a Corrected QPX SIMD width 2017-02-23 16:47:56 +00:00
paboyle 586a7c90b7 Merge branch 'develop' into feature/bgq-asm 2017-02-23 00:26:59 +00:00
paboyle e099dcdae7 Merge branch 'develop' into feature/bgq-asm 2017-02-23 00:25:29 +00:00
paboyle 4e7ab3166f Refactoring header layout 2017-02-22 18:09:33 +00:00
paboyle aac80cbb44 Bug fix from Chris K 2017-02-22 12:19:09 -05:00
Lanny91 c80948411b Added tRotate function and MaddRealPart struct for generic SIMD, bugfix in MultRealPart and minor cosmetic changes. 2017-02-22 14:57:10 +00:00
Lanny91 95625a7bd1 Use Grid Integer type 2017-02-22 13:09:32 +00:00
Lanny91 0796696733 Emulated integer vector type for QPX and generic SIMD instruction sets. 2017-02-22 12:01:36 +00:00
Peter Boyle cc773ae70c Merge pull request #89 from sunpho84/prepend_package_with_grid
Prepending PACKAGE_ with GRID_ in Config.h
2017-02-22 00:52:10 +00:00
Peter Boyle d21c51b9be Merge pull request #88 from sunpho84/pickpoketting
now it is possible to pass {coords list} to a peek or poke
2017-02-22 00:51:33 +00:00
Peter Boyle 597a7b4b3a Merge pull request #81 from edbennett/develop
Fix misleading message: "doxygen-pdf requires doxygen-pdf"
2017-02-22 00:50:59 +00:00
Francesco Sanfilippo 041884acf0 Prepending PACKAGE_ with GRID_ in Config.h
Avoid polluting linking progr
2017-02-21 22:51:36 +01:00
Francesco Sanfilippo 15e668eef1 now it is possible to pass {coords list} to a peek or poke 2017-02-21 22:48:38 +01:00
paboyle 3ae92fa2e6 Global changes to parallel_for structure.
Move the comms flags to more sensible names
2017-02-21 05:24:27 -05:00
paboyle 3906cd2149 Stencil fix on BNL KNL system 2017-02-20 17:51:31 -05:00
paboyle 5a1fb29db7 Useful debug code info to preserve 2017-02-20 17:49:23 -05:00
paboyle 661fc4d3d1 Debug AVX512 exchange code paths 2017-02-20 17:48:36 -05:00
paboyle 41009cc142 Move exchange into the stencil only; keep Cshift fully general 2017-02-20 17:48:04 -05:00
paboyle 37720c4db7 Count bytes off node only 2017-02-20 17:47:40 -05:00
paboyle 1a30455a10 1000 iters on bmark for more accurate timing 2017-02-20 17:47:01 -05:00
paboyle cd0da81196 Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm 2017-02-16 18:52:30 -05:00
paboyle f246fe3304 Improvements to avx for invertible to avoid latent bug 2017-02-16 23:52:44 +00:00
paboyle 8a29c16bde Faster gather exchange 2017-02-16 23:52:22 +00:00
paboyle d68907fc3e Debug temp 2017-02-16 18:51:35 -05:00
paboyle 5c0adf7bf2 Make clang happy with parenthesis 2017-02-16 23:51:33 +00:00
paboyle be3a8249c6 Faster gather 2017-02-16 23:51:15 +00:00
paboyle bd600702cf Vectorise the XYZT face gathering better.
Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent
with efficiency.
2017-02-15 11:11:04 +00:00
paboyle aca7a3ef0a Optimisation control improvements 2017-02-10 18:22:31 -05:00
paboyle 2c246551d0 Overlap comms and compute options in wilson kernels 2017-02-07 01:37:10 -05:00
paboyle 71ac2e7940 Faster RNG init 2017-02-07 01:33:23 -05:00
paboyle 2bf4688e83 Running on BNL KNL 2017-02-07 01:32:10 -05:00
paboyle a48ee6f0f2 Don't use MPI3_leader any more. No real gain and complex 2017-02-07 01:31:24 -05:00
paboyle 73547cca66 MPI3 working I think 2017-02-07 01:30:02 -05:00