portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-07-30 23:43:29 +01:00

Author	SHA1	Message	Date
paboyle	3493b51879	Modest updates	2016-07-14 23:52:13 +01:00
paboyle	de3e79d300	red black for Ls vectorised is 4d red black. Update accordingly now I've made this choice	2016-07-14 23:49:42 +01:00
paboyle	dd62a61c5c	Added broadcast and rotation of simd vectors	2016-07-14 23:49:00 +01:00
paboyle	8f47d0b5ab	Rotation needed for hopping term in fifth dim with Ls vectorised fields	2016-07-14 23:45:36 +01:00
paboyle	42af132dab	Fix for chris kellys request to peek poke on checkerboarded fields	2016-07-14 23:44:48 +01:00
paboyle	9db2c6525d	updating benchmarks for red black 4d for Ls vectorised code	2016-07-14 23:44:02 +01:00
paboyle	adbc7c1188	Adding files for multiple implementations (cache opt) and Ls vectorisation of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely in the vector direction, s-hopping terms involve rotations. The serial dependence of the LDU inversion for Mobius and 4d even odd checkerboarding is removed by simply applying Ls^2 operations (vectorised many ways) as a dense matrix operation. This should give similar throughput but high flops (non-compulsory flops) but enable use of the KNL cache friendly kernels throughout the code. Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512 with single precision.	2016-07-14 22:59:21 +01:00
paboyle	62601bb649	Bug fix	2016-07-08 20:46:29 +01:00
paboyle	ef97e32152	Adding persistent communicators	2016-07-08 17:16:08 +01:00
paboyle	c667d9fdcc	Trying to make compile clean on travis; seem to have a make -j 4 problem with fftw	2016-07-07 23:26:39 +01:00
paboyle	7dbb94bab2	Update	2016-07-07 22:51:37 +01:00
paboyle	236dcc820b	typo fix	2016-07-07 22:46:11 +01:00
paboyle	a42a441a6a	Rename the reconfigure script to ./autogen.sh	2016-07-07 22:35:45 +01:00
paboyle	a0676beeb1	Open up dependency on Eigen and FFTW	2016-07-07 22:31:07 +01:00
paboyle	fc4a043663	Colors and banner clean up	2016-07-02 16:15:38 +01:00
paboyle	61ba50665e	Merge branch 'hotfix/v0.5.1' into develop	2016-07-01 16:34:30 +01:00
paboyle	bfe14000a9	Double compile fix	2016-07-01 16:33:51 +01:00
paboyle	1ceff48133	Merge branch 'release/v0.5.0' into develop	2016-06-30 15:15:59 -07:00
paboyle	680645f849	Merge branch 'release/v0.5.0'	2016-06-30 15:15:03 -07:00
paboyle	3fc6e03ad1	Version file v0.5.0	2016-06-30 14:44:09 -07:00
paboyle	2d6614f3a1	Merge branch 'feature/knl-cache-opt' into develop	2016-06-30 14:36:20 -07:00
paboyle	4e041b5103	Merge branch 'feature/knl-cache-opt' of https://github.com/paboyle/Grid into feature/knl-cache-opt	2016-06-30 14:36:08 -07:00
paboyle	712b9a3489	Asm only for avx512	2016-06-30 14:35:02 -07:00
paboyle	bdaa5b1767	Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.	2016-06-30 14:35:02 -07:00
paboyle	8fcefc021a	Improved the prefetching when using cache blocking codes	2016-06-30 14:35:02 -07:00
paboyle	1445189361	COntrol the prefetch strategy	2016-06-30 14:35:02 -07:00
paboyle	05c884a62a	Prefetch change	2016-06-30 14:35:01 -07:00
paboyle	a25bec87d9	Prefetch during save	2016-06-30 14:35:01 -07:00
paboyle	2d8bb4c594	Tweaks	2016-06-30 14:35:01 -07:00
paboyle	51cb2d4328	update file lists	2016-06-30 14:35:01 -07:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
paboyle	c8b35d960c	Merge branch 'develop' of https://github.com/paboyle/Grid into feature/knl-cache-opt	2016-06-30 14:30:49 -07:00
paboyle	532f41dd61	Asm only for avx512	2016-06-30 14:00:34 -07:00
paboyle	661b0ab45d	Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.	2016-06-30 13:07:42 -07:00
paboyle	4bc08ed995	Improved the prefetching when using cache blocking codes	2016-06-26 12:54:14 -07:00
paboyle	b2933a0557	COntrol the prefetch strategy	2016-06-25 12:55:25 -07:00
paboyle	db057cc276	Prefetch change	2016-06-25 12:54:50 -07:00
paboyle	22e88eaf54	Prefetch during save	2016-06-25 12:54:14 -07:00
paboyle	09fe3caebd	Tweaks	2016-06-25 11:08:05 -07:00
Guido Cossu	5e02392f9c	Fixed compilation error for benchmark_dwf Some parts were assuming floating point precision	2016-06-20 12:30:51 +01:00
paboyle	17a8f51a9b	update file lists	2016-06-19 11:59:10 -07:00
paboyle	1b7f88dd00	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-19 11:45:58 -07:00
portelli	d6737e4bd8	Travis fix for Linux clang builds	2016-06-14 19:15:08 +01:00
portelli and GitHub	d539888e57	Merge pull request #37 from rprollins/fix/mpi_communicator Removed write to stdout in constructor for MPI CartesianCommunicator	2016-06-14 17:25:40 +01:00
Richard Rollins	86187d7cca	Removed write to stdout in constructor for MPI CartesianCommunicator	2016-06-14 15:34:20 +01:00
paboyle	87418e7df1	Slightly faster prefetching perf.	2016-06-13 02:32:52 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
Azusa Yamaguchi	d9408893b3	Prefetching in the normal kernel implementation.	2016-06-08 05:43:48 -07:00
paboyle	05acc22920	placeholder for non temporal loads optimisation	2016-06-07 13:18:21 -07:00
paboyle	8ac021de73	Added a test an fixed it for red black precon Ls innermost vectorised DWF	2016-06-07 13:16:56 -07:00
