679ae98b14
Merge branch 'feature/better-external-library' into develop
2017-05-04 15:42:12 +01:00
paboyle
90f6bc16bb
No compile clang fix
2017-05-04 12:15:06 +01:00
Peter Boyle
9b5b639546
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-05-03 20:51:40 -04:00
Peter Boyle
945767c6d8
More info
2017-05-03 20:26:35 -04:00
Peter Boyle
422cdf4979
Some checks
2017-05-03 18:37:38 -04:00
Peter Boyle
38db174f3b
Print statement
2017-05-03 18:25:26 -04:00
Peter Boyle
92e364a35f
Better reporting in benchmark for MPI3
2017-05-03 15:43:36 -04:00
58299b8ba2
Git info separated from version in git-config
2017-05-02 20:04:41 +01:00
124bf4d829
git ref in config summary
2017-05-02 19:41:01 +01:00
e8e56b3414
Config summary saved in git-config
2017-05-02 19:40:47 +01:00
89c430136d
grid-config program
2017-05-02 19:13:13 +01:00
ea9aef7baa
New header for standard headers (was an issue with Remez.h and external compilation)
2017-05-02 18:26:11 +01:00
c9e9e8061d
Merge branch 'feature/hadrons' into develop
2017-05-02 18:23:47 +01:00
dda8d77c87
Merge branch 'feature/hadrons' into feature/rare_kaon
2017-05-01 17:50:57 +01:00
aa29f4346a
Hadrons: weird bus error with recent macOS clang
2017-05-01 17:49:08 +01:00
Peter Boyle
99220f6531
Fixes and better timing
2017-04-26 17:24:11 -04:00
paboyle
2a6d093749
move the sudo: required to match location on Guido's branch
2017-04-26 09:15:34 +01:00
paboyle
c947947fad
sudo required suggested by guido
2017-04-26 08:45:36 +01:00
paboyle
f555b50547
Merge branch 'feature/half-prec-comms' into develop
2017-04-26 08:43:40 +01:00
paboyle
738c1a11c2
longer nloop
2017-04-26 08:43:20 +01:00
Peter Boyle
f8797e1e3e
bug fix. works now and great face performance
2017-04-26 03:14:02 -04:00
Peter Boyle
fd1eb7de13
Clean implementation of the exterior faces listing only those points on the boundary
2017-04-26 02:34:52 -04:00
Peter Boyle
2ce898efa3
Pretty code
2017-04-26 02:34:25 -04:00
paboyle
ab66bac4e6
Think I'm getting on top of the reduced cost exterior precomputed list of links
2017-04-25 08:50:26 +01:00
paboyle
56277a11c8
Build a list of what's on the surface
2017-04-24 17:06:15 +01:00
paboyle
916e9e1d3e
Merge branch 'feature/half-prec-comms' of https://github.com/paboyle/Grid into feature/half-prec-comms
2017-04-24 10:39:19 +01:00
Peter Boyle
5b55867a7a
Slightly cheaper Ext assembly
2017-04-24 05:36:11 -04:00
Peter Boyle
3accb1ef89
Debugged assembly split phase with interior suppression
2017-04-23 19:30:19 -04:00
Peter Boyle
e3d0e31525
Debugged assembly split phase with interior suppression
2017-04-23 19:29:27 -04:00
Peter Boyle
5812eb8a8c
Partially fixed. But the comms-overlap does not work yet.
2017-04-22 18:50:25 -04:00
paboyle
4dd3763294
Use OMP as much as possible
2017-04-22 20:35:20 +01:00
paboyle
c429ace748
Cleaner OpenMP use
2017-04-22 20:28:42 +01:00
paboyle
ac58565d0a
Dangerous rewrite of the assembly. If I make a mistake the debug will be painful.
2017-04-22 19:31:04 +01:00
paboyle
3703b718aa
Mark up a table if a given site only receives from itself; including MPI3 splitting info.
2017-04-22 19:28:37 +01:00
paboyle
b722889234
Try a better load balancing loop
2017-04-22 19:27:41 +01:00
paboyle
abba44a837
Hand unrolled for overlapped comms
2017-04-22 17:45:17 +01:00
paboyle
f301be94ce
Fixed
2017-04-22 17:42:31 +01:00
Peter Boyle
1d1b225497
Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node).
2017-04-22 09:05:28 -04:00
Peter Boyle
53a785a3dd
Fixing the KNL compile
2017-04-22 08:11:51 -04:00
paboyle
736bf3c866
Major rework of stencil. Half precision and MPI3 now working.
2017-04-22 11:33:50 +01:00
paboyle
b9bbe5d188
L1p config bg/q
2017-04-22 11:33:09 +01:00
paboyle
3844bcf800
If no f16c instructions supported must use software half precision conversion.
...
This will also become useful on BG/Q, so will move out from SSE4 into a general area.
Lifted the Eigen half precision from the web. Looks sensible, but not extensively regressed
against the intrinsics implementation yet.
2017-04-20 15:30:52 +01:00
paboyle
e1a2319d01
Simple compressor moved out of cshift into stencil
2017-04-20 13:18:15 +01:00
paboyle
180c732b4c
Move compressors out of Cshift.
...
Slice iterators would help
2017-04-20 13:17:55 +01:00
paboyle
957a706d0b
Useful script
2017-04-20 13:17:44 +01:00
paboyle
d2312e9874
Drop compressor entirely from Cshift to only Stencil.
2017-04-20 13:16:55 +01:00
paboyle
fc4ab9ccd5
Working half precision comms
2017-04-20 11:20:26 +01:00
paboyle
4a340aa5ca
Massive compressor rework to support reduced precision comms
2017-04-20 09:28:27 +01:00
paboyle
3b7de792d5
Type comparison in the traits work
2017-04-18 13:28:04 +01:00
paboyle
557c3fa109
Pretty change
2017-04-18 13:27:38 +01:00