portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-05-15 22:54:30 +01:00

Author	SHA1	Message	Date
Jung	5c57d4f403	Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2 Conflicts: lib/qcd/action/fermion/WilsonKernels.h	2016-01-11 11:36:45 -05:00
Jung	5924e5a562	Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2 Conflicts: configure lib/qcd/action/Actions.h lib/qcd/action/fermion/WilsonKernels.h	2016-01-06 03:44:57 -05:00
paboyle	aae8bf31a7	Global edit adding copyright and license info to every source file.	2016-01-02 14:51:32 +00:00
Peter Boyle	dc814f30da	Binary IO file for generic Grid array parallel I/O. Number of IO MPI tasks can be varied by selecting which dimensions use parallel IO and which dimensions use Serial send to boss I/O. Thus can neck down from, say 1024 nodes = 4x4x8x8 to {1,8,32,64,128,256,1024} nodes doing the I/O. Interpolates nicely between ALL nodes write their data, a single boss per time-plane in processor space [old UKQCD fortran code did this], and a single node doing all I/O. Not sure I have the transfer sizes big enough and am not overly convinced fstream is guaranteed to not give buffer inconsistencies unless I set streambuf size to zero. Practically it has worked on 8 tasks, 2x1x2x2 writing/cloning NERSC configurations on my MacOS + OpenMPI and Clang environment. It is VERY easy to switch to pwrite at a later date, and also easy to send x-strips around from each node in order to gather bigger chunks at the syscall level. That would push us up to the circa 8x 1848 == 4KB size write chunk, and by taking, say, x/y non-parallel we get to 16MB contiguous chunks written in multi 4KB transactions per IOnode in 64^3 lattices for configuration I/O. I suspect this is fine for system performance.	2015-08-26 13:40:29 +01:00
neo	48bf4878c1	Experimental support for ARM	2015-06-09 15:46:21 +09:00
Azusa Yamaguchi	58a4f32298	merge to the head	2015-06-05 10:15:31 +01:00
Peter Boyle	1d0df449e8	Reorganise of file naming	2015-06-03 12:47:05 +01:00
Peter Boyle	3845f267cb	Domain wall fermions now invert; have the basis set up for Tanh/Zolo * (Cayley/PartFrac/ContFrac) * (Mobius/Shamir/Wilson) Approx Representation Kernel. All are done with space-time taking part in checkerboarding, Ls uncheckerboarded. Have only so far tested the Domain Wall limit of Mobius, and at that only checked that it i) Inverts ii) 5dim DW == Ls copies of 4dim D2 iii) MeeInv Mee == 1 iv) Meo+Mee+Moe+Moo == M unprec. v) MpcDagMpc is Hermitian vi) Mdag is the adjoint of M between stochastic vectors. That said, the RB Schur solve, RB MpcDagMpc solve, Unprec solve all converge and the true residual becomes small; so pretty good tests.	2015-06-02 16:57:12 +01:00
neo	74e91cd925	Partial implementation of the vector types SIMD. Implementing SSE4 now. A systematic series of tests must be written.	2015-05-19 17:21:17 +09:00
neo	baa382f055	Added check of mpfr and gmp at configure time. It automatically generates the linker flags or complains if they are not found.	2015-05-19 13:54:55 +09:00
neo	b4cd37276b	Corrected some compilation errors (zolotarev.h) and SSE4 vsplat and conj to make cshift test pass.	2015-05-18 16:48:14 +09:00
Peter Boyle	b1d2c60d07	Moving some things around for pretty	2015-05-11 19:09:49 +01:00
paboyle	379943abf5	Command line args and a general clean up	2015-05-11 12:43:10 +01:00
Peter Boyle	29be76f958	Fixing breakage in the Comms non compile	2015-05-10 15:23:09 +01:00
Peter Boyle	193860dbc8	Comms and memory benchmarks added	2015-05-03 09:44:47 +01:00
Peter Boyle	f663be2a6c	Added a comms benchmark	2015-05-02 23:42:30 +01:00
Peter Boyle	9ec3529864	Improved the gamma quite a bit. Serial rng's which are set on node zero and broadcaste	2015-04-24 20:21:40 +01:00
Peter Boyle	b32c14b433	Got the NERSC IO working and fixed a bug in cshift.	2015-04-22 22:46:48 +01:00
Peter Boyle	e5a25dfcb1	Build reorg with which I am a bit happier	2015-04-18 21:22:50 +01:00

119 Commits