portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2024-11-10 07:55:35 +00:00

Author	SHA1	Message	Date
paboyle	340428a1fe	Eigen fixes and HDCR work	2016-03-30 00:16:02 -07:00
azusa	f54e0ec9bd	Try lanczos to set up hdcr subspace	2016-03-17 10:36:16 +00:00
paboyle	a155a362da	Update from HDCR tuning	2016-03-16 02:31:04 -07:00
paboyle	2dce9c3cff	HDCR running on 16^3 with 2x-3x speed up.	2016-03-08 01:01:50 -08:00
paboyle	dc72293398	More timing info	2016-03-06 10:46:55 -08:00
paboyle	aae8bf31a7	Global edit adding copyright and license info to every source file.	2016-01-02 14:51:32 +00:00
Antonin Portelli	200de272ed	IO: serialisable enums	2015-12-08 13:54:00 +00:00
Peter Boyle	f35fc4b76c	No compile fixes	2015-11-29 10:59:11 +00:00
paboyle	b8a38f292d	Domain decomposition SAP precon implemented and working but not as fast as I hoped.	2015-11-28 17:01:51 -08:00
Peter Boyle	dc814f30da	Binary IO file for generic Grid array parallel I/O. Number of IO MPI tasks can be varied by selecting which dimensions use parallel IO and which dimensions use Serial send to boss I/O. Thus can neck down from, say 1024 nodes = 4x4x8x8 to {1,8,32,64,128,256,1024} nodes doing the I/O. Interpolates nicely between ALL nodes write their data, a single boss per time-plane in processor space [old UKQCD fortran code did this], and a single node doing all I/O. Not sure I have the transfer sizes big enough and am not overly convinced fstream is guaranteed to not give buffer inconsistencies unless I set streambuf size to zero. Practically it has worked on 8 tasks, 2x1x2x2 writing/cloning NERSC configurations on my MacOS + OpenMPI and Clang environment. It is VERY easy to switch to pwrite at a later date, and also easy to send x-strips around from each node in order to gather bigger chunks at the syscall level. That would push us up to the circa 8x 18*4*8 == 4KB size write chunk, and by taking, say, x/y non parallel we get to 16MB contiguous chunks written in multi 4KB transactions per IOnode in 64^3 lattices for configuration I/O. I suspect this is fine for system performance.	2015-08-26 13:40:29 +01:00
Peter Boyle	84a66476ab	Rework/global edit to enforce type templating of fermion operators. Allows multi-precision work and paves the way for alternate BC's and such like allowing for example G-parity which is important for K pipi programme. In particular, can drive an extra flavour index into the fermion fields using template types.	2015-08-10 20:47:44 +01:00
Peter Boyle	1d67d29183	Jackson smoothed chebyshev and (untested) completion of force terms for Cayley, Partial and Cont fraction dwf and overlap. have even odd and unprec forces.	2015-08-01 05:58:35 +09:00
Peter Boyle	d1afebf71e	Sizable improvement in multigrid for unsquared. 6000 matmuls CG unprec 2000 matmuls CG prec (4000 eo muls) 1050 matmuls PGCR on 16^3 x 32 x 8 m=.01 Substantial effort on timing and logging infrastructure	2015-07-24 01:31:13 +09:00
Peter Boyle	11c99d5e66	5x speed up now	2015-07-22 00:30:05 +09:00
Peter Boyle	487fde8496	This file is being developed and will remain hacky until the new algorithm is complete	2015-07-21 13:52:23 +09:00
paboyle	61c3491b8b	Remove dependency on wrong file	2015-07-01 13:04:02 +01:00
Peter Boyle	cd2fb68905	big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). started getting to near the bleeding edge I guess	2015-06-30 15:01:26 +01:00
Peter Boyle	a17684ebe2	Some small steps towards a multigrid	2015-06-22 12:49:44 +01:00

18 Commits