Peter Boyle
aeda7b923d
Back to vector for now; cost of init loop is clear in the a*x + y
loop in memory benchmark and must move to better container class.
2015-05-03 09:48:13 +01:00
Peter Boyle
193860dbc8
Comms and memory benchmarks added
2015-05-03 09:44:47 +01:00
Peter Boyle
99a1ff423d
Added a comms benchmark
2015-05-02 23:51:43 +01:00
Peter Boyle
f663be2a6c
Added a comms benchmark
2015-05-02 23:42:30 +01:00
Peter Boyle
4a1d4f1b3c
Starting a benchmarking sub dir
2015-05-02 17:52:36 +01:00
Peter Boyle
31fd146cc0
Improving the byte swap support for portability
2015-05-01 10:57:33 +01:00
Peter Boyle
c770f96be7
Merge branch 'master' of https://github.com/paboyle/Grid
2015-04-30 16:40:13 +01:00
Peter Boyle
a98c01c86a
Integrated Lebesgue code and been playing with alternate implementations of the wilson dop without
any particular success in increasing the performance.
2015-04-30 16:39:06 +01:00
Peter Boyle
d5b1bfb4bb
Merge pull request #1 from mspraggs/patch-1
Added <map> include to GridNerscIO.h
2015-04-30 09:46:48 +01:00
mspraggs
6f05404cb8
Added <map> include to GridNerscIO.h
Adding this allows clang to compile Grid to completion.
2015-04-29 23:44:03 +01:00
Peter Boyle
b7090ebba4
Benchmark wilson dhop now; 14.6GF on one core, not as fast as SU(3)xSU(3) [23GF] but still not too shabby.
Disassembling output shows ugly sequences in the permute sector. Could comparatively benchmark with and without
the if-else structure to see how much I'm losing.
Drops to 9GF as it falls out of cache. Moving to Lebesgue ordering should help there. Substantive progress.
2015-04-29 06:50:18 +01:00
Peter Boyle
c72db6c6f6
Fixed the stencil sector and Wilson now agrees between stencil based implementation
and the cshift based implementation. Managed to reduce the volume of code in this
sector a little, but consolidation would be good, perhaps taking common
logic out into simple helper functions
2015-04-29 06:23:56 +01:00
Peter Boyle
25d523c0f4
Shaken out stencil to the point where I think wilson dslash is correct.
Need to audit code carefully, consolidate between stencil and cshift,
and then benchmark and optimise.
2015-04-28 08:11:59 +01:00
Peter Boyle
f159495a9d
Reworking CSHIFT and Stencil. Implementing Wilson and discovered rework is required
2015-04-27 13:45:07 +01:00
Peter Boyle
94f728bee4
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00
Peter Boyle
51f0da7b93
Starting the implementation of wilson; incomplete and committing non-functional code which
is not yet included from elsewhere or linked to the build system.
2015-04-25 14:33:02 +01:00
Peter Boyle
9dacdc947d
Update to TODO list
2015-04-25 13:04:26 +01:00
Peter Boyle
c5fa18eb20
Added two spinor functionality required to support the Wilson hopping term.
2015-04-25 12:54:06 +01:00
Peter Boyle
8b4073d84c
Dirac done; remove from TODO
2015-04-24 22:56:37 +01:00
Peter Boyle
9ec3529864
Improved the gamma quite a bit.
Serial RNGs which are set on node zero and broadcast
2015-04-24 20:21:40 +01:00
Peter Boyle
42eac283e2
static names and enum list
2015-04-24 19:12:14 +01:00
Peter Boyle
38598190c3
Vectors now too, and right multiplication of matrix by gamma
2015-04-24 19:08:29 +01:00
Peter Boyle
2e275e1e65
Removed summation
2015-04-24 18:42:44 +01:00
Peter Boyle
80463ecaea
Cleared the code out from Grid_summation to lattice/Grid_lattice_transfer.h
2015-04-24 18:41:34 +01:00
Peter Boyle
128ad0999f
Moved code from summation into transfer and reduction
2015-04-24 18:40:44 +01:00
Peter Boyle
d707c4e0a3
First implementation of Dirac matrices as a Gamma class.
2015-04-24 18:20:03 +01:00
Peter Boyle
b9939e3974
Reorganised the TODO. Really getting somewhere
2015-04-23 20:42:30 +01:00
Peter Boyle
3083d2e908
Rename Grid_QCD
2015-04-23 20:42:09 +01:00
Peter Boyle
898f64cdd7
move
2015-04-23 20:41:22 +01:00
Peter Boyle
52a6ba9767
Slice summation working. May move this into lattice/Grid_lattice_reduction however
2015-04-23 15:13:00 +01:00
Peter Boyle
4d2198ea56
Beginnings of slice summation and subblocking
2015-04-23 11:04:59 +01:00
Peter Boyle
7007d6a176
Consolidate index to coor in a single routine
2015-04-23 11:04:19 +01:00
Peter Boyle
a37a9789c9
Snippets from Guido to optimise Reduce
2015-04-23 08:31:40 +01:00
Peter Boyle
5c8858f31b
Better description of Intel's many ISA targets
2015-04-23 08:02:51 +01:00
Peter Boyle
47292de769
Fixing endian on linux I hope
2015-04-23 07:51:15 +01:00
Peter Boyle
b32c14b433
Got the NERSC IO working and fixed a bug in cshift.
2015-04-22 22:46:48 +01:00
Peter Boyle
42f167ea37
Rework of RNG to use C++11 random. Should work correctly maintaining parallel RNG across
a machine. If a "fixedSeed" is used, randoms should be reproducible across different machine
decomposition since the generators are physically indexed and assigned in lexico ordering.
2015-04-19 14:55:58 +01:00
Peter Boyle
f6ab726cef
Update to task list
2015-04-19 14:55:16 +01:00
Peter Boyle
5483ed641e
Split all OMP directives into lattice subdir for easy maintenance of
parallelism and future OMP 4.0 offload.
2015-04-18 22:17:01 +01:00
Peter Boyle
d929f88421
Update
2015-04-18 22:16:31 +01:00
Peter Boyle
6bd11d920a
Finishing the reorg
2015-04-18 21:24:10 +01:00
Peter Boyle
8ddfa7e6b0
Reorganisation
2015-04-18 21:23:32 +01:00
Peter Boyle
e5a25dfcb1
Build reorg with which I am a bit happier
2015-04-18 21:22:50 +01:00
Peter Boyle
c94b7cc43c
Clean up
2015-04-18 20:52:40 +01:00
Peter Boyle
25a8266638
More files, shorter each.
2015-04-18 20:45:00 +01:00
Peter Boyle
6eae2c1083
Shrinking and organising the files
2015-04-18 20:44:19 +01:00
Peter Boyle
354347ce91
Split up into multiple files
2015-04-18 18:54:30 +01:00
Peter Boyle
2eb5ab26bf
splitting into smaller, multiple files for readability and easy find.
2015-04-18 18:47:43 +01:00
Peter Boyle
af72ade26a
Cleanup
2015-04-18 18:37:56 +01:00
Peter Boyle
e7661d3b12
Reorg
2015-04-18 18:37:22 +01:00