mirror of https://github.com/paboyle/Grid.git synced 2024-11-10 07:55:35 +00:00
Commit Graph

18 Commits

Author SHA1 Message Date
paboyle
340428a1fe Eigen fixes and HDCR work 2016-03-30 00:16:02 -07:00
azusa
f54e0ec9bd Try lanczos to set up hdcr subspace 2016-03-17 10:36:16 +00:00
paboyle
a155a362da Update from HDCR tuning 2016-03-16 02:31:04 -07:00
paboyle
2dce9c3cff HDCR running on 16^3 with 2x-3x speed up. 2016-03-08 01:01:50 -08:00
paboyle
dc72293398 More timing info 2016-03-06 10:46:55 -08:00
paboyle
aae8bf31a7 Global edit adding copyright and license info to every source file. 2016-01-02 14:51:32 +00:00
200de272ed IO: serialisable enums 2015-12-08 13:54:00 +00:00
Peter Boyle
f35fc4b76c No compile fixes 2015-11-29 10:59:11 +00:00
paboyle
b8a38f292d Domain decomposition SAP precon implemented and working but not as fast as I hoped. 2015-11-28 17:01:51 -08:00
Peter Boyle
dc814f30da Binary IO file for generic Grid array parallel I/O.
Number of IO MPI tasks can be varied by selecting which
dimensions use parallel IO and which dimensions use Serial send to boss
I/O.

Thus can neck down from, say 1024 nodes = 4x4x8x8 to {1,8,32,64,128,256,1024} nodes
doing the I/O.

Interpolates nicely between ALL nodes writing their own data, a single boss per time-plane
in processor space [the old UKQCD Fortran code did this], and a single node doing all I/O.

I am not sure the transfer sizes are big enough, and I am not convinced that fstream
is guaranteed to avoid buffer inconsistencies unless I set the streambuf size to zero.

In practice it has worked on 8 tasks, 2x1x2x2, writing/cloning NERSC configurations
on my MacOS + OpenMPI and Clang environment.

It is VERY easy to switch to pwrite at a later date, and also easy to send x-strips around from
each node in order to gather bigger chunks at the syscall level.

That would push us up to the circa 8x 18*4*8 == 4KB size write chunk, and by taking, say, x/y non
parallel we get to 16MB contiguous chunks written in multi 4KB transactions
per IOnode in 64^3 lattices for configuration I/O.

I suspect this is fine for system performance.
2015-08-26 13:40:29 +01:00
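The node-count arithmetic in the commit message above can be sketched as follows. This is a minimal illustration, not Grid's code: the function names (`io_node_count`, `reachable_io_counts`, `gauge_site_bytes`) are hypothetical. It assumes the number of I/O tasks is the product of the processor-grid extents along the dimensions chosen for parallel I/O. It also assumes `18*4*8` means 18 reals per SU(3) link, 4 links per site and 8 bytes per real, with the leading factor of 8 read as a strip of 8 sites. Enumerating every subset of dimensions for 4x4x8x8 also yields 4 and 16, which the message does not list.

```python
from itertools import combinations
from math import prod


def io_node_count(proc_grid, parallel_dims):
    """Ranks doing I/O when only `parallel_dims` use parallel I/O.

    Assumption: every other dimension is collapsed onto a single
    "boss" rank, so the count is the product of the chosen extents.
    """
    return prod(proc_grid[d] for d in parallel_dims)


def reachable_io_counts(proc_grid):
    """All I/O task counts obtainable by choosing any subset of dimensions."""
    ndim = len(proc_grid)
    counts = set()
    for r in range(ndim + 1):
        for dims in combinations(range(ndim), r):
            counts.add(io_node_count(proc_grid, dims))
    return sorted(counts)


def gauge_site_bytes(reals_per_link=18, links_per_site=4, bytes_per_real=8):
    """Bytes per lattice site under the assumed reading of 18*4*8."""
    return reals_per_link * links_per_site * bytes_per_real


proc_grid = (4, 4, 8, 8)  # 1024 ranks, as in the commit message
print(reachable_io_counts(proc_grid))
# [1, 4, 8, 16, 32, 64, 128, 256, 1024]

print(8 * gauge_site_bytes())
# 4608 bytes, i.e. the "circa 4KB" write chunk
```

An empty selection gives a single node doing all I/O, and selecting all four dimensions gives every node writing its own data, which matches the two extremes described in the message.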
Peter Boyle
84a66476ab Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BCs and the like,
allowing, for example, G-parity, which is important for the K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle
1d67d29183 Jackson-smoothed Chebyshev and (untested) completion of force terms
for Cayley, partial fraction and continued fraction DWF and overlap.
Have even-odd and unprec forces.
2015-08-01 05:58:35 +09:00
Peter Boyle
d1afebf71e Sizable improvement in multigrid for unsquared.
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01

Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
Peter Boyle
11c99d5e66 5x speed up now 2015-07-22 00:30:05 +09:00
Peter Boyle
487fde8496 This file is being developed and will remain hacky until the new algorithm
is complete
2015-07-21 13:52:23 +09:00
paboyle
61c3491b8b Remove dependency on wrong file 2015-07-01 13:04:02 +01:00
Peter Boyle
cd2fb68905 Big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). Started getting too
near the bleeding edge, I guess
2015-06-30 15:01:26 +01:00
Peter Boyle
a17684ebe2 Some small steps towards a multigrid 2015-06-22 12:49:44 +01:00