portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2024-11-14 01:35:36 +00:00

Author	SHA1	Message	Date
Antonin Portelli	7bb405e790	Merge branch 'develop' into feature/hadrons # Conflicts: # lib/communicator/Communicator_mpi3_leader.cc # lib/communicator/Communicator_shmem.cc	2018-01-11 18:50:15 +00:00
paboyle	0b2162f375	Clean up	2018-01-08 14:06:53 +00:00
paboyle	0091eec23a	Simplify communicator cases	2018-01-08 11:31:32 +00:00
Antonin Portelli	6718fa8c4f	Merge branch 'feature/scalar_adjointFT' into feature/hadrons	2017-12-26 12:59:33 +01:00
paboyle	a14038051f	Improved AllToAll asserts	2017-12-05 11:43:25 +00:00
Antonin Portelli	074d17429f	Merge branch 'develop' into feature/scalar_adjointFT # Conflicts: # lib/communicator/Communicator_mpi3.cc	2017-11-11 18:09:55 +00:00
paboyle	69929f20bb	Destructor fix. Split Grid and MPI3 will not yet work without more effort from me.	2017-11-06 23:45:00 +00:00
paboyle	501fa1614a	Communicator updates for split grid	2017-10-30 00:16:12 +00:00
paboyle	1ef424b139	Split grid Y2K bug fix attempt	2017-10-27 14:20:35 +01:00
Guido Cossu	8309f2364b	Solving again the MPI comm bug with FFTs	2017-10-25 10:24:14 +01:00
Antonin Portelli	5c392a6ecc	Merge commit 'bf58557fb1ec710c766e19c9a8809b0a352de239' into feature/scalar_adjointFT	2017-10-10 17:14:56 +01:00
paboyle	07009c569a	Comms splitting improvements	2017-10-09 23:16:51 +01:00
Guido Cossu	27caff92c6	Merge branch 'feature/scalar_adjointFT' of https://github.com/paboyle/Grid into feature/scalar_adjointFT	2017-10-04 09:44:27 +01:00
paboyle	d54807b8c0	MPIT works with split grid now	2017-10-02 23:14:56 +01:00
Guido Cossu	f6ba2b95ce	Merge branch 'develop' into feature/scalar_adjointFT	2017-10-02 15:19:20 +01:00
paboyle	4f8b6f26b4	Merge branch 'develop' into feature/dwf-multirhs	2017-10-02 11:41:49 +01:00
Guido Cossu	999c623590	Solving a memory leak in Communicator_mpi	2017-09-18 14:39:04 +01:00
paboyle	1cdf999668	Moving multicommunicator into mpi3 also for threading	2017-08-20 02:39:10 +01:00
paboyle	a446d95c33	Trying to pass TeamCity and Travis	2017-08-20 01:10:50 +01:00
Peter Boyle	14d53e1c9e	Threaded MPI calls patches	2017-07-29 13:08:10 -04:00
paboyle	54e94360ad	Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit	2017-06-24 23:10:24 +01:00
paboyle	6ebf9f15b7	Splitting communicators first cut	2017-06-22 08:14:34 +01:00
paboyle	3bfd1f13e6	I/O improvements	2017-06-11 23:14:10 +01:00
paboyle	e30fa9f4b8	RankCount; need to clean up ambiguous ProcessCount	2017-05-30 23:39:16 +01:00
paboyle	3ae92fa2e6	Global changes to parallel_for structure. Move the comms flags to more sensible names	2017-02-21 05:24:27 -05:00
paboyle	37720c4db7	Count bytes off node only	2017-02-20 17:47:40 -05:00
paboyle	61f82216e2	Communicator Policy, NodeCount distinct from Rank count	2017-02-07 01:22:53 -05:00
paboyle	bb94ddd0eb	Tidy up of mpi3; also some cleaning of the dslash controls.	2016-11-02 08:07:09 +00:00
paboyle	791cb050c8	Comms improvements	2016-11-01 11:35:43 +00:00
azusayamaguchi	d7d92af09d	Travis fail fix attempt	2016-10-25 01:45:53 +01:00
azusayamaguchi	b6a65059a2	Update to use shared memory to contain the stencil comms buffers Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions	2016-10-24 17:30:43 +01:00
azusayamaguchi	c190221fd3	Internal SHM comms in non-simd directions working Need to fix simd directions	2016-10-22 18:14:27 +01:00
azusayamaguchi	fad96cf250	StencilBufs	2016-10-21 13:36:00 +01:00
paboyle	a762b1fb71	MPI3 working with a bounce through shared memory on my laptop. Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the send between ranks on same node.	2016-10-21 09:03:26 +01:00
paboyle	5fe2b85cbd	MPI3 and shared memory support	2016-10-20 16:58:01 +01:00
paboyle	32bc7a6ab8	MPI back out of change that hangs AVX2 for clang, gcc needs the -mfma flag.	2016-08-05 10:36:00 +01:00
paboyle	ef97e32152	Adding persistent communicators	2016-07-08 17:16:08 +01:00
paboyle	d6b64f47d9	Uint64 sum for IO rates	2016-03-16 02:27:22 -07:00
Peter Boyle	6aeaf6f568	Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then turned up problems on the BlueWaters Cray. Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel, and also to look at aggregating bigger writes for the parallel write. Not sure what the home filesystem is.	2016-02-21 08:03:21 -06:00
Peter Boyle	41c2b09184	Shmem comms [NO MPI] target added. The dwf test runs and passes. Not really shaken out to my satisfaction though as I want more tests done, so don't declare as working. But committing my current while I try a few experimentals.	2016-02-14 14:24:38 -06:00
paboyle	e2f73e3ead	Updates for shmem	2016-02-10 16:50:32 -08:00
paboyle	aae8bf31a7	Global edit adding copyright and license info to every source file.	2016-01-02 14:51:32 +00:00
Peter Boyle	dc814f30da	Binary IO file for generic Grid array parallel I/O. Number of IO MPI tasks can be varied by selecting which dimensions use parallel IO and which dimensions use Serial send to boss I/O. Thus can neck down from, say 1024 nodes = 4x4x8x8 to {1,8,32,64,128,256,1024} nodes doing the I/O. Interpolates nicely between ALL nodes write their data, a single boss per time-plane in processor space [old UKQCD fortran code did this], and a single node doing all I/O. Not sure I have the transfer sizes big enough and am not overly convinced fstream is guaranteed to not give buffer inconsistencies unless I set streambuf size to zero. Practically it has worked on 8 tasks, 2x1x2x2 writing /cloning NERSC configurations on my MacOS + OpenMPI and Clang environment. It is VERY easy to switch to pwrite at a later date, and also easy to send x-strips around from each node in order to gather bigger chunks at the syscall level. That would push us up to the circa 8x 1848 == 4KB size write chunk, and by taking, say, x/y non parallel we get to 16MB contiguous chunks written in multi 4KB transactions per IOnode in 64^3 lattices for configuration I/O. I suspect this is fine for system performance.	2015-08-26 13:40:29 +01:00
Peter Boyle	1d0df449e8	Reorganise of file naming	2015-06-03 12:47:05 +01:00

44 Commits