paboyle
|
8b99d80d8c
|
Merge branch 'bgq-asm-shmemfixes' into feature/bgq-asm
|
2017-03-12 23:30:09 +00:00 |
|
paboyle
|
3901b17ade
|
Timings from BNL
|
2017-02-28 17:06:45 -05:00 |
|
paboyle
|
af230a1fb8
|
Average the time across the whole machine for outliers
|
2017-02-28 17:05:22 -05:00 |
|
Christopher Kelly
|
06a132e3f9
|
Fixes to SHMEM comms
|
2017-02-28 13:31:54 -08:00 |
|
paboyle
|
96d44d5c55
|
Header fix
|
2017-02-24 19:12:11 -05:00 |
|
Lanny91
|
7fe797daf8
|
SIMD vector length sanity checks
|
2017-02-23 16:49:44 +00:00 |
|
Lanny91
|
486a01294a
|
Corrected QPX SIMD width
|
2017-02-23 16:47:56 +00:00 |
|
paboyle
|
586a7c90b7
|
Merge branch 'develop' into feature/bgq-asm
|
2017-02-23 00:26:59 +00:00 |
|
paboyle
|
e099dcdae7
|
Merge branch 'develop' into feature/bgq-asm
|
2017-02-23 00:25:29 +00:00 |
|
paboyle
|
4e7ab3166f
|
Refactoring header layout
|
2017-02-22 18:09:33 +00:00 |
|
paboyle
|
aac80cbb44
|
Bug fix from Chris K
|
2017-02-22 12:19:09 -05:00 |
|
Lanny91
|
c80948411b
|
Added tRotate function and MaddRealPart struct for generic SIMD, bugfix in MultRealPart and minor cosmetic changes.
|
2017-02-22 14:57:10 +00:00 |
|
Lanny91
|
95625a7bd1
|
Use Grid Integer type
|
2017-02-22 13:09:32 +00:00 |
|
Lanny91
|
0796696733
|
Emulated integer vector type for QPX and generic SIMD instruction sets.
|
2017-02-22 12:01:36 +00:00 |
|
Peter Boyle
|
cc773ae70c
|
Merge pull request #89 from sunpho84/prepend_package_with_grid
Prepending PACKAGE_ with GRID_ in Config.h
|
2017-02-22 00:52:10 +00:00 |
|
Peter Boyle
|
d21c51b9be
|
Merge pull request #88 from sunpho84/pickpoketting
now it is possible to pass {coords list} to a peek or poke
|
2017-02-22 00:51:33 +00:00 |
|
Peter Boyle
|
597a7b4b3a
|
Merge pull request #81 from edbennett/develop
Fix misleading message: "doxygen-pdf requires doxygen-pdf"
|
2017-02-22 00:50:59 +00:00 |
|
azusayamaguchi
|
1c30e9a961
|
Verified
|
2017-02-21 23:01:25 +00:00 |
|
Francesco Sanfilippo
|
041884acf0
|
Prepending PACKAGE_ with GRID_ in Config.h
Avoid polluting linking progr
|
2017-02-21 22:51:36 +01:00 |
|
Francesco Sanfilippo
|
15e668eef1
|
now it is possible to pass {coords list} to a peek or poke
|
2017-02-21 22:48:38 +01:00 |
|
azusayamaguchi
|
bf7e3f20d4
|
Staggered fermion optimised version
|
2017-02-21 14:35:42 +00:00 |
|
paboyle
|
3ae92fa2e6
|
Global changes to parallel_for structure.
Move the comms flags to more sensible names
|
2017-02-21 05:24:27 -05:00 |
|
paboyle
|
3906cd2149
|
Stencil fix on BNL KNL system
|
2017-02-20 17:51:31 -05:00 |
|
paboyle
|
5a1fb29db7
|
Useful debug code info to preserve
|
2017-02-20 17:49:23 -05:00 |
|
paboyle
|
661fc4d3d1
|
Debug AVX512 exchange code paths
|
2017-02-20 17:48:36 -05:00 |
|
paboyle
|
41009cc142
|
Move exchange into the stencil only; keep Cshift fully general
|
2017-02-20 17:48:04 -05:00 |
|
paboyle
|
37720c4db7
|
Count bytes off node only
|
2017-02-20 17:47:40 -05:00 |
|
paboyle
|
1a30455a10
|
1000 iters on bmark for more accurate timing
|
2017-02-20 17:47:01 -05:00 |
|
paboyle
|
cd0da81196
|
Merge branch 'feature/bgq-asm' of https://github.com/paboyle/Grid into feature/bgq-asm
|
2017-02-16 18:52:30 -05:00 |
|
paboyle
|
f246fe3304
|
Improvements to avx for invertible to avoid latent bug
|
2017-02-16 23:52:44 +00:00 |
|
paboyle
|
8a29c16bde
|
Faster gather exchange
|
2017-02-16 23:52:22 +00:00 |
|
paboyle
|
d68907fc3e
|
Debug temp
|
2017-02-16 18:51:35 -05:00 |
|
paboyle
|
5c0adf7bf2
|
Make clang happy with parentheses
|
2017-02-16 23:51:33 +00:00 |
|
paboyle
|
be3a8249c6
|
Faster gather
|
2017-02-16 23:51:15 +00:00 |
|
paboyle
|
bd600702cf
|
Vectorise the XYZT face gathering better.
Hard coded for simd_layout <= 2 in any given spread out direction; full generality is inconsistent
with efficiency.
|
2017-02-15 11:11:04 +00:00 |
|
paboyle
|
aca7a3ef0a
|
Optimisation control improvements
|
2017-02-10 18:22:31 -05:00 |
|
paboyle
|
2c246551d0
|
Overlap comms and compute options in wilson kernels
|
2017-02-07 01:37:10 -05:00 |
|
paboyle
|
71ac2e7940
|
Faster RNG init
|
2017-02-07 01:33:23 -05:00 |
|
paboyle
|
2bf4688e83
|
Running on BNL KNL
|
2017-02-07 01:32:10 -05:00 |
|
paboyle
|
a48ee6f0f2
|
Don't use MPI3_leader any more. No real gain and complex
|
2017-02-07 01:31:24 -05:00 |
|
paboyle
|
73547cca66
|
MPI3 working, I think
|
2017-02-07 01:30:02 -05:00 |
|
paboyle
|
123c673db7
|
Policy to control async or sync SendRecv
|
2017-02-07 01:24:54 -05:00 |
|
paboyle
|
61f82216e2
|
Communicator Policy, NodeCount distinct from Rank count
|
2017-02-07 01:22:53 -05:00 |
|
paboyle
|
8e7ca92278
|
Debugged cshift case
|
2017-02-07 01:21:32 -05:00 |
|
paboyle
|
485ad6fde0
|
Stencil working in SHM MPI3
|
2017-02-07 01:20:39 -05:00 |
|
paboyle
|
6ea2184e18
|
OMP define change
|
2017-02-07 01:17:16 -05:00 |
|
paboyle
|
fdc170b8a3
|
Parallel fors in lattice transfer
|
2017-02-07 01:16:39 -05:00 |
|
paboyle
|
060da786e9
|
Comms benchmark improvements
|
2017-02-07 01:07:39 -05:00 |
|
paboyle
|
85c7bc4321
|
Bug fixes for latent cases that physics code couldn't hit,
discovered on KNL (long vector, y SIMD dir) with checker dir set to y.
Remove the assertions on these code paths now that they are tested.
|
2017-02-07 01:01:15 -05:00 |
|
paboyle
|
0883d6a7ce
|
Overlap comms compute support; make reg naming consistent with bgq asm
|
2017-02-07 00:59:32 -05:00 |
|