portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-06-16 17:13:11 +01:00

Author	SHA1	Message	Date
Guido Cossu	e1042aef77	First version of the double prec for testing purposes. It does not compile single and double version at the same time	2016-10-28 17:20:04 +01:00
azusayamaguchi	460d0753a1	Merge branch 'develop' into feature/mpi3 Conflicts: lib/simd/Grid_avx512.h	2016-10-25 01:08:51 +01:00
azusayamaguchi	75ebd3a0d1	Typo fixes and rotate for CLANG	2016-10-21 22:34:29 +01:00
azusayamaguchi	20a091c3ed	Intel vs. Clang intrinsics differences absorbed	2016-10-21 09:08:36 +01:00
paboyle	811ca45473	GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support	2016-10-17 16:23:21 +01:00
azusayamaguchi	81f2aeaece	KNL streaming stores, and KNL performance counters	2016-10-12 11:45:22 +01:00
Guido Cossu	611b5d74ba	Fix for AVX+FMA3 compilation	2016-10-10 15:26:17 +01:00
Antonin Portelli	0724f7af75	QPX single precision implementation	2016-09-19 18:09:12 +01:00
portelli	4d11a6f5f2	first commit for QPX intrinsics	2016-08-23 14:41:44 +01:00
paboyle	17097a93ec	FFTW test ran over 4 mpi processes.	2016-08-17 01:33:55 +01:00
portelli	93d29bb699	build system improvements after discussion with Peter	2016-08-04 16:19:59 +01:00
portelli	e9f30cab2c	first working version for the new build system	2016-07-30 17:53:18 +01:00
paboyle	4908b77d46	Fixed conflicts. PLEASE avoid making wholesale cosmetic-only changes; this created a HUGE amount of difficult to resolve and understand conflicts. Wholesale formatting, reordering functions etc. in a central file like Tensor_class or Grid_vector_types, while others are also editing, without making substantial functionality changes creates pain.	2016-07-15 20:59:07 +01:00
paboyle	f4dd5062d7	Merge branch 'develop' of https://github.com/paboyle/Grid into develop	2016-07-15 19:26:06 +01:00
paboyle	8f47d0b5ab	Rotation needed for hopping term in fifth dim with Ls vectorised fields	2016-07-14 23:45:36 +01:00
paboyle	a0676beeb1	Open up dependency on Eigen and FFTW	2016-07-07 22:31:07 +01:00
Guido Cossu	e3d5319470	Debugged the real() and imag() functions and added tests to Test_Simd	2016-07-06 14:16:03 +01:00
Guido Cossu	fdfbf11c6d	Merge branch 'develop' into temporary-smearing	2016-07-04 18:45:10 +01:00
Guido Cossu	9cb90f714e	Merge remote-tracking branch 'origin/develop' into temporary-smearing	2016-07-04 17:28:40 +01:00
Guido Cossu	1a6d65c6a4	Converted set_uw and set_fj to all complex functions	2016-07-03 10:27:43 +01:00
paboyle	bdaa5b1767	Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.	2016-06-30 14:35:02 -07:00
paboyle	8fcefc021a	Improved the prefetching when using cache blocking codes	2016-06-30 14:35:02 -07:00
paboyle	1445189361	Control the prefetch strategy	2016-06-30 14:35:02 -07:00
paboyle	a25bec87d9	Prefetch during save	2016-06-30 14:35:01 -07:00
paboyle	2d8bb4c594	Tweaks	2016-06-30 14:35:01 -07:00
paboyle	6d58cb2a68	Enable reordering of the loops in the assembler for cache friendly. This gets in the way of L2 prefetching however. Do next next link in stencil prefetching.	2016-06-30 14:35:01 -07:00
paboyle	87418e7df1	Slightly faster prefetching perf.	2016-06-13 02:32:52 -07:00
paboyle	55f65b81b5	Improvements to the assembler interface that let us move chunks of the site and s loop into the kernels. This will save on function call overhead and guarantee L2 prefetching strategy is right since OMP can't distribute the sub-chunks of work.	2016-06-09 01:12:36 -07:00
Azusa Yamaguchi	d9408893b3	Prefetching in the normal kernel implementation.	2016-06-08 05:43:48 -07:00
paboyle	139cc5f1ae	Large change with KNL preparation	2016-06-03 03:24:26 -07:00
portelli	9d5f693cbe	empty SIMD fix	2016-05-24 10:56:27 +01:00
portelli	91e04056f9	fix of the empty SIMD	2016-05-12 19:24:10 +01:00
paboyle	c23375cd65	Testing travis CI integration	2016-04-30 06:30:56 -07:00
paboyle	c79ea0dcef	Fixing IMCI	2016-04-22 21:52:54 -07:00
paboyle	e3f141f82f	Fixed SSE compile with typecasts	2016-04-22 10:30:30 -07:00
paboyle	a6dfa2386b	GCC choked on intrinsics calls that ICPC did not	2016-04-22 06:33:41 -07:00
paboyle	587f80cd93	Updated to compile and pass under intel SDE	2016-04-19 15:13:54 -07:00
paboyle	528eb773ad	Merged. Merge branch 'master' of https://github.com/paboyle/Grid	2016-04-19 22:24:34 +01:00
paboyle	e5657510b0	Rotate support for Ls simd-ized	2016-04-19 22:24:18 +01:00
paboyle	f473919526	Rotate support	2016-04-19 22:23:51 +01:00
Christopher Kelly	ab56ccdd25	-Complete and working implementation of Grid_empty	2016-04-15 13:17:42 -04:00
paboyle	f473ef7591	Fixing the compile	2016-03-31 07:47:42 -07:00
paboyle	8052556275	Cleaning up the single/double kernel implementation switch	2016-03-31 14:51:32 +01:00
paboyle	83b15bfcdd	Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign	2016-03-30 08:39:39 +01:00
paboyle	c77b7ee897	AddSub based alternate SU3 routine	2016-03-28 17:55:22 -06:00
paboyle	b6c3bc574b	Moving to a more coherent organisation of the inline assembly and arch dependencies.	2016-03-28 16:24:37 +01:00
paboyle	ad80f61fba	AVX512 shaken out	2016-03-28 00:38:05 -06:00
paboyle	165bffc2e7	Avx512 changes for assembler kernels	2016-03-26 22:25:45 -06:00
paboyle	644fd6d32e	Build avx512 clean	2016-03-25 09:35:33 -07:00
coppolachan	2d8bb356e3	Smearing routines compile (still untested)	2016-02-25 02:43:59 +09:00

120 Commits