Peter Boyle
48f425d31c
I have made the Cshift work successfully with open mp threading in
...
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
6cec662ac5
Enhanced SIMD interfacing
2015-05-12 20:41:44 +01:00
Peter Boyle
6103c29ee3
Threading support rework.
...
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of
BFM.
2015-05-12 07:51:41 +01:00
Peter Boyle
b1d2c60d07
Moving some things around for pretty
2015-05-11 19:09:49 +01:00
Peter Boyle
22d384b07d
Adding a better controlled threading class, preparing to
...
force in deterministic reduction.
2015-05-11 18:59:03 +01:00
Peter Boyle
f5dcca7b1b
Got command line args working
2015-05-11 14:36:48 +01:00
paboyle
43e71ff28c
CML parse
2015-05-11 12:56:27 +01:00
paboyle
379943abf5
Command line args and a general clean up
2015-05-11 12:43:10 +01:00
paboyle
9f9796b888
Updated to do list
2015-05-11 09:44:50 +01:00
Peter Boyle
5555a852be
Lots of changes required to compile for MIC under ICPC
2015-05-10 23:29:21 +01:00
Peter Boyle
48b9692845
Merge branch 'master' of https://github.com/paboyle/Grid
...
Conflicts:
lib/qcd/Grid_qcd_wilson_dop.cc
2015-05-10 15:37:47 +01:00
Peter Boyle
b802abc83f
Expression template hack
2015-05-10 15:35:30 +01:00
Peter Boyle
14591c72d6
Expression template engin
2015-05-10 15:34:20 +01:00
Peter Boyle
933ccfdc4f
Updated TODO list
2015-05-10 15:32:56 +01:00
Peter Boyle
4e596da589
Hack; must bring norm2 into the unary operator list.
...
ET's are still incomplete.
2015-05-10 15:30:29 +01:00
Peter Boyle
41c9785f3b
Default to single node. Move to command line args.
2015-05-10 15:27:38 +01:00
Peter Boyle
443efd875e
Single node default. Should expose this as command line args, but haven't sorted out
...
Grid_initialize to handle this. Should put this on the TODO list.
2015-05-10 15:26:06 +01:00
Peter Boyle
02ae26d091
Small tweak to enable benchmarking to suppress gauge field bandwidth as a test.
...
This is a short term hack while I benchmark.
2015-05-10 15:25:23 +01:00
Peter Boyle
2ffd941d67
Assertion should never hit, but did due to a bug
2015-05-10 15:24:37 +01:00
Peter Boyle
ca554f661b
Moving operator stuff into separate file so that we can switch on/off replacement with
...
expression templates
2015-05-10 15:23:49 +01:00
Peter Boyle
29be76f958
Fixing breakage in the Comms non compile
2015-05-10 15:23:09 +01:00
Peter Boyle
e3acb36de6
Bringing expression templates for faster vector loops
2015-05-10 15:22:31 +01:00
Peter Boyle
b2e0f72a7e
ET ready benchmark with bytes counted assuming loop interchange
2015-05-10 15:18:04 +01:00
Peter Boyle
ebed239e49
Updated todo list
2015-05-10 15:13:50 +01:00
Peter Boyle
55ccb8ccf4
Wilson perf improvements with Gauge prefetching
2015-05-06 06:37:21 +01:00
Peter Boyle
35d949cc17
Cleaned up for Linux
2015-05-05 22:09:22 +01:00
Peter Boyle
b9d16a7191
streaming store cases
2015-05-05 18:14:09 +01:00
Peter Boyle
07d57b6d55
Streaming store option
2015-05-05 18:13:06 +01:00
Peter Boyle
5ebc7a1756
Added streaming stores
2015-05-05 18:09:28 +01:00
Peter Boyle
bf60764e4b
Updated bandwidth test
2015-05-05 18:08:53 +01:00
Peter Boyle
890b13dd5b
Added a makefile
2015-05-05 17:56:42 +01:00
Peter Boyle
aeda7b923d
Back to vector for now; cost of init loop is clear in the a*x + y
...
loop in memory benchmark and must move to better container class.
2015-05-03 09:48:13 +01:00
Peter Boyle
193860dbc8
Comms and memory benchmarks added
2015-05-03 09:44:47 +01:00
Peter Boyle
99a1ff423d
Added a comms benchmark
2015-05-02 23:51:43 +01:00
Peter Boyle
f663be2a6c
Added a comms benchmark
2015-05-02 23:42:30 +01:00
Peter Boyle
4a1d4f1b3c
Starting a benchmarking sub dir
2015-05-02 17:52:36 +01:00
Peter Boyle
31fd146cc0
Improving the byte swap support for portability
2015-05-01 10:57:33 +01:00
Peter Boyle
c770f96be7
Merge branch 'master' of https://github.com/paboyle/Grid
2015-04-30 16:40:13 +01:00
Peter Boyle
a98c01c86a
Integrated Lebesgue code and been playing with alternate implementations of the wilson dop without
...
any particular success in increasing the performance.
2015-04-30 16:39:06 +01:00
Peter Boyle
d5b1bfb4bb
Merge pull request #1 from mspraggs/patch-1
...
Added <map> include to GridNerscIO.h
2015-04-30 09:46:48 +01:00
mspraggs
6f05404cb8
Added <map> include to GridNerscIO.h
...
Adding this allows clang to compile Grid to completion.
2015-04-29 23:44:03 +01:00
Peter Boyle
b7090ebba4
Benchmark wilson dhop now; 14.6GF on one core, not as fast as SU(3)xSU(3) [23GF] but still not too shabby.
...
Disassembling output shows ugly sequences in the permute sector. Could comparatively benchmark with and without
the if-else structure to see how much I'm losing.
Drops to 9GF as it falls out of cache. Moving to Lebesgue ordering should help there. Substantive progress.
2015-04-29 06:50:18 +01:00
Peter Boyle
c72db6c6f6
Fixed the stencil sector and Wilson now agrees between stencil based implementation
...
and the cshift based implementation. Managed to reduce the volume of code in this
sector a little, but consolidation would be good, perhaps taking common
logic out into simple helper functions
2015-04-29 06:23:56 +01:00
Peter Boyle
25d523c0f4
Shaken out stencil to the point where I think wilson dslash is correct.
...
Need to audit code carefully, consolidate between stencil and cshift,
and then benchmark and optimise.
2015-04-28 08:11:59 +01:00
Peter Boyle
f159495a9d
Reworking CSHIFT and Stencil. Implementing Wilson and discovered rework is required
2015-04-27 13:45:07 +01:00
Peter Boyle
94f728bee4
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00
Peter Boyle
51f0da7b93
Starting the implementation of wilson; incomplete and committing non-functional code which
...
is not yet included from elsewhere or linked to the build system.
2015-04-25 14:33:02 +01:00
Peter Boyle
9dacdc947d
Update to TODO list
2015-04-25 13:04:26 +01:00
Peter Boyle
c5fa18eb20
Added two spinor functionality required to support the Wilson hopping term.
2015-04-25 12:54:06 +01:00
Peter Boyle
8b4073d84c
Dirac done ; remove from TODO
2015-04-24 22:56:37 +01:00