portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-07-08 03:13:29 +01:00

Author	SHA1	Message	Date
portelli	997fd882ff	Merge branch 'develop' into feature/feynman-rules # Conflicts: # lib/Threads.h # lib/qcd/action/fermion/WilsonFermion.cc # lib/qcd/action/fermion/WilsonFermion.h # lib/qcd/utils/SUn.h # lib/simd/Grid_avx.h # lib/simd/Intel512common.h	2016-10-19 18:35:18 +01:00
paboyle	a123dcd7e9	Static required for shmem. Reading same object twice requires csum reset	2016-10-12 00:29:57 +01:00
Guido Cossu	f76f281e58	Cleaning files after fix	2016-09-09 11:34:25 +01:00
portelli	64bf6fe54e	macro to dump NERSC header to a stream	2016-05-04 12:14:38 -07:00
paboyle	d4e57f4bc6	IO Bandwidth reporting	2016-03-16 02:30:16 -07:00
Peter Boyle	6aeaf6f568	Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then turned up problems on the BlueWaters Cray. Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel, and also to look at aggregating bigger writes for the parallel write. Not sure what the home filesystem is.	2016-02-21 08:03:21 -06:00
Peter Boyle	7f927a541c	Shmem related fixes for shmem compile	2016-02-11 07:37:39 -06:00
paboyle	aae8bf31a7	Global edit adding copyright and license info to every source file.	2016-01-02 14:51:32 +00:00
paboyle	5a80930dd2	Charge conjugation boundary conditions for gauge fields implemented as a policy class, changing the nature of covariant Cshifts used in plaquettes, rectangles and staples. As a result same code is used for the plaq and rect action independent of the BC type. Should probably isolate the BC in a separate class that Gimpl takes as a template param. Do the same with smearing policies. This would then allow composition of BC with smearing etc....	2016-01-02 13:37:25 +00:00
paboyle	31ca609d12	HMC checkpointing. Need a general HMC framework to work in restart.	2015-12-20 02:29:51 +00:00
paboyle	5710966324	Options to use mersenne twister OR ranlux48 via --enable-rng flag at configure time. Can save and restore RNG state via new (serial) I/O routines in a NERSC header style file. Store a Parallel (one per site) and a single serial RNG file.	2015-12-19 18:32:25 +00:00
Peter Boyle	96608c70d1	chrono causing some problems on Cray systems. Suspend use for now	2015-11-04 04:28:31 -06:00
Peter Boyle	d35d63b171	Algorithm in	2015-11-04 04:27:44 -06:00
Peter Boyle	aa52fdadcc	Global edit on HMC sector -- making GaugeField a template parameter and preparing to pass integrator, smearing, bc's as policy classes to hmc. Propose to unify "integrator" and integrator algorithm in a base/derived way to override step. Want to read through ForceGradient to ensure that abstraction covers the force gradient case.	2015-08-30 12:18:34 +01:00
Peter Boyle	76d752585b	Started a tidy up in the HMC sector. Now comfortable with the two level integrators; took a little to figure out what Guido had done & why -- but there is a neat saving of force evaluations across the nesting time boundary making use of linearity of the leapP in dt. I cleaned up the printing, reduced the volume of code, in the process sharing printing between all integrators. Placed an assert that the total integration time for all integrators must match at end of trajectory. Have now verified exp(-dH) = 1 for nested integrators in Wilson/Wilson runs with both Omelyan and with Leapfrog so substantial confidence gained.	2015-08-29 17:18:43 +01:00
Peter Boyle	dc814f30da	Binary IO file for generic Grid array parallel I/O. Number of IO MPI tasks can be varied by selecting which dimensions use parallel IO and which dimensions use Serial send to boss I/O. Thus can neck down from, say 1024 nodes = 4x4x8x8 to {1,8,32,64,128,256,1024} nodes doing the I/O. Interpolates nicely between ALL nodes write their data, a single boss per time-plane in processor space [old UKQCD fortran code did this], and a single node doing all I/O. Not sure I have the transfer sizes big enough and am not overly convinced fstream is guaranteed to not give buffer inconsistencies unless I set streambuf size to zero. Practically it has worked on 8 tasks, 2x1x2x2 writing/cloning NERSC configurations on my MacOS + OpenMPI and Clang environment. It is VERY easy to switch to pwrite at a later date, and also easy to send x-strips around from each node in order to gather bigger chunks at the syscall level. That would push us up to the circa 8x 1848 == 4KB size write chunk, and by taking, say, x/y non parallel we get to 16MB contiguous chunks written in multi 4KB transactions per IOnode in 64^3 lattices for configuration I/O. I suspect this is fine for system performance.	2015-08-26 13:40:29 +01:00
Peter Boyle	35818fdf6c	Text and Binary readers	2015-08-20 23:04:38 +01:00
Matt Spraggs	cff84f09ba	Removed std::string calls from NerscIO map indexing	2015-06-07 17:06:25 +01:00
Peter Boyle	1d0df449e8	Reorganise of file naming	2015-06-03 12:47:05 +01:00
Peter Boyle	17835c6f42	Remez tested	2015-05-18 12:09:25 +01:00
Peter Boyle	31fd146cc0	Improving the byte swap support for portability	2015-05-01 10:57:33 +01:00
mspraggs	6f05404cb8	Added <map> include to GridNerscIO.h. Adding this allows clang to compile Grid to completion.	2015-04-29 23:44:03 +01:00
Peter Boyle	47292de769	Fixing endian on linux I hope	2015-04-23 07:51:15 +01:00
Peter Boyle	b32c14b433	Got the NERSC IO working and fixed a bug in cshift.	2015-04-22 22:46:48 +01:00

24 Commits