mirror of https://github.com/paboyle/Grid.git synced 2026-04-20 02:31:01 +01:00
Commit Graph

93 Commits

Author SHA1 Message Date
Guido Cossu 5028969d4b Added generators for the adjoint representation 2016-07-08 15:40:11 +01:00
Guido Cossu fdfbf11c6d Merge branch 'develop' into temporary-smearing 2016-07-04 18:45:10 +01:00
Guido Cossu 9cb90f714e Merge remote-tracking branch 'origin/develop' into temporary-smearing 2016-07-04 17:28:40 +01:00
paboyle bfe14000a9 Double compile fix 2016-07-01 16:33:51 +01:00
paboyle 680645f849 Merge branch 'release/v0.5.0' 2016-06-30 15:15:03 -07:00
paboyle 2d8bb4c594 Tweaks 2016-06-30 14:35:01 -07:00
paboyle 51cb2d4328 update file lists 2016-06-30 14:35:01 -07:00
paboyle 6d58cb2a68 Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
Guido Cossu 565e9329ba Changed the colouring classes 2016-06-30 16:51:03 +01:00
Guido Cossu 5e02392f9c Fixed compilation error for benchmark_dwf
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
paboyle 55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
paboyle 05acc22920 placeholder for non temporal loads optimisation 2016-06-07 13:18:21 -07:00
paboyle 8ac021de73 Added a test and fixed it for red black precon Ls innermost vectorised DWF 2016-06-07 13:16:56 -07:00
paboyle 786ca52c43 Problems remain in the red black preconditioning of the Ls vectorisation 2016-06-06 07:05:51 -07:00
paboyle 53d06046b0 Compiling updates for KNL 2016-06-03 03:47:54 -07:00
paboyle 139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
paboyle f2ae9682ff Remove some timing hacks 2016-04-19 15:14:32 -07:00
paboyle 528eb773ad Merged.
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-19 22:24:34 +01:00
paboyle c323425496 Small change 2016-04-11 10:38:43 +01:00
paboyle 650e02b344 Smaller vols too 2016-04-06 06:52:09 -07:00
paboyle a524ca2a4b New benchmark update 2016-04-06 03:35:56 -07:00
paboyle 23a7176b71 Loop over volumes 2016-04-06 03:22:11 -07:00
paboyle b1192a8908 Benchmark_zmm added 2016-04-06 03:00:07 -07:00
paboyle e8dddb1596 Adding extra benchmark 2016-04-06 10:32:54 +01:00
paboyle c77b7ee897 AddSub based alternate SU3 routine 2016-03-28 17:55:22 -06:00
paboyle e17c773a0b Longer runs for vtune 2016-03-16 02:29:13 -07:00
Peter Boyle f7be108e35 100 iters faster 2016-02-15 16:03:04 -06:00
paboyle fc6ad65751 Pushed the overlap comms tweaks 2016-01-11 06:34:22 -08:00
paboyle 02452afd36 Optional overlap of comms with compute 2016-01-04 14:18:40 +00:00
paboyle 331768dcff Added overlap comms compute mode 2016-01-03 01:38:11 +00:00
paboyle aae8bf31a7 Global edit adding copyright and license info to every source file. 2016-01-02 14:51:32 +00:00
paboyle 3ce10aa975 Fix a regression failure on Mobius; chroma regression added 2015-12-10 22:55:00 +00:00
paboyle 1cc0d7b811 Bigger ncall as timing loops got small on cori 2015-11-07 00:04:40 -08:00
Peter Boyle 27813cf518 More timing detail reported 2015-11-06 05:27:13 -06:00
paboyle 16c7993434 Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
	lib/simd/Grid_avx512.h
	lib/simd/Grid_imci.h
2015-11-04 03:32:10 -08:00
paboyle 32762346ad Better run time on KNC 2015-11-04 03:25:34 -08:00
paboyle 0f48658a27 Update minor 2015-11-04 03:23:46 -08:00
Peter Boyle dfc1de6f60 Merge branch 'master' of github.com:paboyle/Grid 2015-11-04 05:14:26 -06:00
Peter Boyle b3d70a3bb2 Ncall change 2015-11-04 09:55:21 +00:00
Peter Boyle c26220e9ab EO benchmark as well as non-eo 2015-11-04 09:54:48 +00:00
Peter Boyle 3726fe7481 Bigger vec length 2015-10-09 00:42:54 +02:00
paboyle af89c40462 Better timing tweaks to give sensible results on 24 threads on Edison dual ivybridge nodes. 2015-09-28 16:09:04 -07:00
Peter Boyle 9f4f65cb46 Added a decoupled memory system benchmark to remove thread synch overhead 2015-09-26 18:23:57 -07:00
Peter Boyle 9183380946 Gparity test added; partial implementation -- this is Chris K's doubled lattice only
and have to regress this with the 2 flavour implementation.
2015-08-12 09:49:33 +01:00
Peter Boyle 84a66476ab Rework/global edit to enforce type templating of fermion operators.
Allows multi-precision work and paves the way for alternate BC's and such like
allowing for example G-parity which is important for K pipi programme.
In particular, can drive an extra flavour index into the fermion fields
using template types.
2015-08-10 20:47:44 +01:00
Peter Boyle d1afebf71e Sizable improvement in multigrid for unsquared.
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01

Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
Peter Boyle 31a0c8d783 Merge branch 'master' of https://github.com/paboyle/Grid 2015-07-01 22:51:04 +01:00
paboyle 39271b02dd Modified memory bw test to display word size 2015-07-01 22:46:53 +01:00
Peter Boyle 638d2cda11 Change the SIMD command correctly with precision = double vs. single and
connect the "Real" default precision to a configure flag.
Have RealF, RealD and Real types, where Real is compile target dependent single/double,
RealF is single and RealD is double, etc.
2015-07-01 22:45:15 +01:00
Peter Boyle 9143f071d7 Merge branch 'master' of https://github.com/paboyle/Grid 2015-06-30 15:17:46 +01:00