portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-07-07 02:43:29 +01:00

Author	SHA1	Message	Date
portelli	e9f30cab2c	first working version for the new build system	2016-07-30 17:53:18 +01:00
paboyle	f4dd5062d7	Merge branch 'develop' of https://github.com/paboyle/Grid into develop	2016-07-15 19:26:06 +01:00
paboyle	9db2c6525d	updating benchmarks for red black 4d for Ls vectorised code	2016-07-14 23:44:02 +01:00
paboyle	ef97e32152	Adding persistent communicators	2016-07-08 17:16:08 +01:00
paboyle	a0676beeb1	Open up dependency on Eigen and FFTW	2016-07-07 22:31:07 +01:00
Guido Cossu	fdfbf11c6d	Merge branch 'develop' into temporary-smearing	2016-07-04 18:45:10 +01:00
Guido Cossu	9cb90f714e	Merge remote-tracking branch 'origin/develop' into temporary-smearing	2016-07-04 17:28:40 +01:00
paboyle	bfe14000a9	Double compile fix	2016-07-01 16:33:51 +01:00
paboyle	680645f849	Merge branch 'release/v0.5.0'	2016-06-30 15:15:03 -07:00
paboyle	2d8bb4c594	Tweaks	2016-06-30 14:35:01 -07:00
paboyle	51cb2d4328	update file lists	2016-06-30 14:35:01 -07:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
Guido Cossu	565e9329ba	Changed the colouring classes	2016-06-30 16:51:03 +01:00
Guido Cossu	5e02392f9c	Fixed compilation error for benchmark_dwf Some parts were assuming floating point precision	2016-06-20 12:30:51 +01:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
paboyle	05acc22920	placeholder for non temporal loads optimisation	2016-06-07 13:18:21 -07:00
paboyle	8ac021de73	Added a test and fixed it for red black precon Ls innermost vectorised DWF	2016-06-07 13:16:56 -07:00
paboyle	786ca52c43	Problems remain in the red black preconditioning of the Ls vectorisation	2016-06-06 07:05:51 -07:00
paboyle	53d06046b0	Compiling updates for KNL	2016-06-03 03:47:54 -07:00
paboyle	139cc5f1ae	Large change with KNL preparation	2016-06-03 03:24:26 -07:00
paboyle	f2ae9682ff	Remove some timing hacks	2016-04-19 15:14:32 -07:00
paboyle	528eb773ad	Merged. Merge branch 'master' of https://github.com/paboyle/Grid	2016-04-19 22:24:34 +01:00
paboyle	c323425496	Small change	2016-04-11 10:38:43 +01:00
paboyle	650e02b344	Smaller vols too	2016-04-06 06:52:09 -07:00
paboyle	a524ca2a4b	New benchmark update	2016-04-06 03:35:56 -07:00
paboyle	23a7176b71	Loop over volumes	2016-04-06 03:22:11 -07:00
paboyle	b1192a8908	Benchmark_zmm added	2016-04-06 03:00:07 -07:00
paboyle	e8dddb1596	Adding extra benchmark	2016-04-06 10:32:54 +01:00
paboyle	c77b7ee897	AddSub based alternate SU3 routine	2016-03-28 17:55:22 -06:00
paboyle	e17c773a0b	Longer runs for vtune	2016-03-16 02:29:13 -07:00
Peter Boyle	f7be108e35	100 iters faster	2016-02-15 16:03:04 -06:00
paboyle	fc6ad65751	Pushed the overlap comms tweaks	2016-01-11 06:34:22 -08:00
paboyle	02452afd36	Optional overlap of comms with compute	2016-01-04 14:18:40 +00:00
paboyle	331768dcff	Added overlap comms compute mode	2016-01-03 01:38:11 +00:00
paboyle	aae8bf31a7	Global edit adding copyright and license info to every source file.	2016-01-02 14:51:32 +00:00
paboyle	3ce10aa975	Fix a regression failure on Mobius; chroma regression added	2015-12-10 22:55:00 +00:00
paboyle	1cc0d7b811	Bigger ncall as timing loops got small on cori	2015-11-07 00:04:40 -08:00
Peter Boyle	27813cf518	More timing detail reported	2015-11-06 05:27:13 -06:00
paboyle	16c7993434	Merge branch 'master' of github.com:paboyle/Grid Conflicts: lib/simd/Grid_avx512.h lib/simd/Grid_imci.h	2015-11-04 03:32:10 -08:00
paboyle	32762346ad	Better run time on KNC	2015-11-04 03:25:34 -08:00
paboyle	0f48658a27	Update minor	2015-11-04 03:23:46 -08:00
Peter Boyle	dfc1de6f60	Merge branch 'master' of github.com:paboyle/Grid	2015-11-04 05:14:26 -06:00
Peter Boyle	b3d70a3bb2	Ncall change	2015-11-04 09:55:21 +00:00
Peter Boyle	c26220e9ab	EO benchmark as well as non-eo	2015-11-04 09:54:48 +00:00
Peter Boyle	3726fe7481	Bigger vec length	2015-10-09 00:42:54 +02:00
paboyle	af89c40462	Better timing tweaks to give sensible results on 24 threads on Edison dual ivybridge nodes.	2015-09-28 16:09:04 -07:00
Peter Boyle	9f4f65cb46	Added a decoupled memory system benchmark to remove thread synch overhead	2015-09-26 18:23:57 -07:00
Peter Boyle	9183380946	Gparity test added; partial implementation -- this is Chris K's doubled lattice only and have to regress this with the 2 flavour implementation.	2015-08-12 09:49:33 +01:00
Peter Boyle	84a66476ab	Rework/global edit to enforce type templating of fermion operators. Allows multi-precision work and paves the way for alternate BC's and such like allowing for example G-parity which is important for K pipi programme. In particular, can drive an extra flavour index into the fermion fields using template types.	2015-08-10 20:47:44 +01:00
Peter Boyle	d1afebf71e	Sizable improvement in multigrid for unsquared. 6000 matmuls CG unprec 2000 matmuls CG prec (4000 eo muls) 1050 matmuls PGCR on 16^3 x 32 x 8 m=.01 Substantial effort on timing and logging infrastructure	2015-07-24 01:31:13 +09:00

97 Commits