Peter Boyle
8d77d758c3
Parallel for replace
2015-05-15 11:48:04 +01:00
Peter Boyle
0e7945fe54
Forces inlining upon icpc
2015-05-15 11:43:49 +01:00
Peter Boyle
bd721ce1c8
Force inlining upon icpc
2015-05-15 11:43:20 +01:00
Peter Boyle
a852d13f03
More elegant enable_if
2015-05-15 11:42:51 +01:00
Peter Boyle
a26fdab719
More elegant to do boolean logic inside the enable_if construct
Should have done that from the beginning and should move this into
a global edit
2015-05-15 11:42:03 +01:00
Peter Boyle
af6e8f7829
Force inlining on ICPC because inline apparently is not enough
2015-05-15 11:41:31 +01:00
Peter Boyle
cbfa4097b4
strong_inline forces ICPC to do it.
2015-05-15 11:40:59 +01:00
Peter Boyle
8c40dd9c4f
Force strong_inline to force icpc's hand
2015-05-15 11:40:31 +01:00
Peter Boyle
b38bf82d48
Switch to strong_inline macro to force icpc's hand
2015-05-15 11:40:00 +01:00
Peter Boyle
adc4f86020
Promote to strong inline to force ICPC's hand. Annoying.
2015-05-15 11:39:25 +01:00
Peter Boyle
5b46992a15
Formatting change
2015-05-15 11:38:54 +01:00
Peter Boyle
e7d25647e6
Filed bug report Bug 66153 on GCC-5.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66153
2015-05-15 11:38:04 +01:00
Peter Boyle
c28551f40f
Silly formatting change
2015-05-15 11:37:07 +01:00
Peter Boyle
6c7eb60d6f
gcc doesn't like collapse(2) for some reason I can't figure out
2015-05-15 11:36:22 +01:00
Peter Boyle
051b23fe10
ICPC and GCC5 fixes
2015-05-15 11:35:02 +01:00
Peter Boyle
4e462209c7
Using boolean logic inside enable_if is more elegant
2015-05-15 11:32:45 +01:00
Peter Boyle
8d1b26dd4b
Key of mm_malloc.h
2015-05-15 11:32:11 +01:00
Peter Boyle
cc6218a692
strong inline required to force icpc
2015-05-15 11:31:41 +01:00
Peter Boyle
5166888c0a
Linear op added
2015-05-13 11:25:34 +01:00
Peter Boyle
0097b81778
OMP dslash working
2015-05-13 10:59:22 +01:00
Peter Boyle
add4495a4a
cout IO for all types
2015-05-13 09:24:10 +01:00
Peter Boyle
541d52ab97
I have made the Cshift work successfully with OpenMP threading in
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
556befaaaa
Enhanced SIMD interfacing
2015-05-12 20:41:44 +01:00
Peter Boyle
c6baa3e657
Threading support rework.
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
Peter Boyle
6e6843ac69
Moving some things around for pretty
2015-05-11 19:09:49 +01:00
Peter Boyle
c8dc8ff891
Adding a better controlled threading class, preparing to
force in deterministic reduction.
2015-05-11 18:59:03 +01:00
Peter Boyle
b613ed0bb8
Got command line args working
2015-05-11 14:36:48 +01:00
paboyle
b42453d1fd
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
Peter Boyle
2203c6e597
Lots of changes required to compile for MIC under ICPC
2015-05-10 23:29:21 +01:00
Peter Boyle
4da2c2ea00
Merge branch 'master' of https://github.com/paboyle/Grid
Conflicts:
lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
Peter Boyle
1ec1b4ee44
Expression template hack
2015-05-10 15:35:30 +01:00
Peter Boyle
1ab92563b9
Expression template engine
2015-05-10 15:34:20 +01:00
Peter Boyle
dc7132af71
Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.
This is a short term hack while I benchmark.
2015-05-10 15:25:23 +01:00
Peter Boyle
961fbb2718
Assertion should never hit, but did due to a bug
2015-05-10 15:24:37 +01:00
Peter Boyle
4a8fd55f52
Moving operator stuff into separate file so that we can switch on/off replacement with
expression templates
2015-05-10 15:23:49 +01:00
Peter Boyle
e02cbaa016
Fixing breakage in the Comms non compile
2015-05-10 15:23:09 +01:00
Peter Boyle
463c31ae09
Bringing expression templates for faster vector loops
2015-05-10 15:22:31 +01:00
Peter Boyle
52403d587c
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
Peter Boyle
cdd5cdeda2
Cleaned up for Linux
2015-05-05 22:09:22 +01:00
Peter Boyle
cb4b82b09f
streaming store cases
2015-05-05 18:14:09 +01:00
Peter Boyle
cd990ba13d
Streaming store option
2015-05-05 18:13:06 +01:00
Peter Boyle
249165d1b2
Added streaming stores
2015-05-05 18:09:28 +01:00
Peter Boyle
2b46ad38e2
Back to vector for now; cost of init loop is clear in the a*x + y
loop in memory benchmark and must move to better container class.
2015-05-03 09:48:13 +01:00
Peter Boyle
9d93d1e6d4
Comms and memory benchmarks added
2015-05-03 09:44:47 +01:00
Peter Boyle
ea52562527
Added a comms benchmark
2015-05-02 23:42:30 +01:00
Peter Boyle
bdf18941a2
Improving the byte swap support for portability
2015-05-01 10:57:33 +01:00
Peter Boyle
d904e2b9ac
Merge branch 'master' of https://github.com/paboyle/Grid
2015-04-30 16:40:13 +01:00
Peter Boyle
c0ead94791
Integrated Lebesgue code and have been playing with alternate implementations of the wilson dop without
any particular success in increasing the performance.
2015-04-30 16:39:06 +01:00
mspraggs
24fc71b2e9
Added <map> include to GridNerscIO.h
Adding this allows clang to compile Grid to completion.
2015-04-29 23:44:03 +01:00
Peter Boyle
dcc23faa4a
Fixed the stencil sector and Wilson now agrees between stencil based implementation
and the cshift based implementation. Managed to reduce the volume of code in this
sector a little, but consolidation would be good, perhaps taking common
logic out into simple helper functions
2015-04-29 06:23:56 +01:00