Mirror of https://github.com/paboyle/Grid.git, synced 2024-11-14 01:35:36 +00:00
Commit Graph

26 Commits

Author SHA1 Message Date
Guido Cossu
8309f2364b Solving again the MPI comm bug with FFTs 2017-10-25 10:24:14 +01:00
paboyle
07009c569a Comms splitting improvements 2017-10-09 23:16:51 +01:00
paboyle
1feddf4ba6 const fixes 2017-06-22 19:32:41 +01:00
paboyle
6ebf9f15b7 Splitting communicators first cut 2017-06-22 08:14:34 +01:00
paboyle
3bfd1f13e6 I/O improvements 2017-06-11 23:14:10 +01:00
paboyle
4b17e8eba8 Merge branch 'develop' into feature/bgq-asm
Conflicts:
	lib/qcd/action/fermion/Fermion.h
	lib/qcd/action/fermion/WilsonFermion.cc
	lib/util/Init.cc
	tests/Test_cayley_even_odd_vec.cc
2017-03-28 04:49:30 -04:00
paboyle
18bde08d1b Merge branch 'feature/staggering' into develop 2017-03-28 15:25:55 +09:00
paboyle
4e7ab3166f Refactoring header layout 2017-02-22 18:09:33 +00:00
fad743fbb1 Build system sanity check: corrected several headers not in the <Grid/*> format 2017-01-26 17:00:41 -08:00
Azusa Yamaguchi
668ca57702 Merge branch 'develop' of https://github.com/paboyle/Grid into feature/staggering 2016-11-22 13:49:11 +00:00
azusayamaguchi
f85b35314d Fix a routine for single node processor coor from rank 2016-11-08 11:49:13 +00:00
Azusa Yamaguchi
ee686a7d85 Compiles now 2016-11-03 16:58:23 +00:00
paboyle
791cb050c8 Comms improvements 2016-11-01 11:35:43 +00:00
azusayamaguchi
7c3363b91e Compiles all comms targets 2016-10-25 00:04:17 +01:00
azusayamaguchi
b94478fa51 mpi, mpi3, shmem all compile.
mpi, mpi3 pass single node multi-rank
2016-10-24 23:45:31 +01:00
azusayamaguchi
b6a65059a2 Update to use shared memory to contain the stencil comms buffers
Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions
2016-10-24 17:30:43 +01:00
azusayamaguchi
c190221fd3 Internal SHM comms in non-simd directions working
Need to fix simd directions
2016-10-22 18:14:27 +01:00
paboyle
32bc7a6ab8 MPI back out of change that hangs
AVX2 for clang, gcc needs the -mfma flag.
2016-08-05 10:36:00 +01:00
paboyle
ef97e32152 Adding persistent communicators 2016-07-08 17:16:08 +01:00
paboyle
d6b64f47d9 Uint64 sum for IO rates 2016-03-16 02:27:22 -07:00
paboyle
e55c35734b Fix a nocompile 2016-03-03 20:33:28 +00:00
paboyle
a3fbabf404 Bug fix 2016-02-18 18:08:24 +00:00
Peter Boyle
41c2b09184 Shmem comms [NO MPI] target added. The dwf test runs and passes.
Not really shaken out to my satisfaction though as I want more tests done, so don't declare as working.
But committing my current while I try a few experimentals.
2016-02-14 14:24:38 -06:00
paboyle
aae8bf31a7 Global edit adding copyright and license info to every source file. 2016-01-02 14:51:32 +00:00
Peter Boyle
dc814f30da Binary IO file for generic Grid array parallel I/O.
Number of IO MPI tasks can be varied by selecting which
dimensions use parallel IO and which dimensions use Serial send to boss
I/O.

Thus can neck down from, say 1024 nodes = 4x4x8x8 to {1,8,32,64,128,256,1024} nodes
doing the I/O.

Interpolates nicely between ALL nodes write their data, a single boss per time-plane
in processor space [old UKQCD fortran code did this], and a single node doing all I/O.

Not sure I have the transfer sizes big enough and am not overly convinced fstream
is guaranteed to not give buffer inconsistencies unless I set streambuf size to zero.

Practically it has worked on 8 tasks, 2x1x2x2 writing /cloning NERSC configurations
on my MacOS + OpenMPI and Clang environment.

It is VERY easy to switch to pwrite at a later date, and also easy to send x-strips around from
each node in order to gather bigger chunks at the syscall level.

That would push us up to the circa 8x 18*4*8 == 4KB size write chunk, and by taking, say, x/y non
parallel we get to 16MB contiguous chunks written in multi 4KB transactions
per IOnode in 64^3 lattices for configuration I/O.

I suspect this is fine for system performance.
2015-08-26 13:40:29 +01:00
Peter Boyle
1d0df449e8 Reorganise of file naming 2015-06-03 12:47:05 +01:00