portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-03-11 23:16:14 +00:00

Author	SHA1	Message	Date
azusayamaguchi	202078eb1b	Cray / OpenSHMEM ordering differs	2016-10-21 09:07:20 +01:00
paboyle	a762b1fb71	MPI3 working with a bounce through shared memory on my laptop. Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the send between ranks on same node.	2016-10-21 09:03:26 +01:00
paboyle	5b5925b8e5	Forgot to add	2016-10-20 17:09:40 +01:00
paboyle	b58adc6a4b	commVector	2016-10-20 17:00:15 +01:00
paboyle	f9d5e95d72	allocator template typedefs moved to AlignedAllocator	2016-10-20 16:59:39 +01:00
paboyle	4f8e636a43	commVector	2016-10-20 16:59:16 +01:00
paboyle	9b39f35ae6	commVector different for SHMEM compat	2016-10-20 16:58:53 +01:00
paboyle	5fe2b85cbd	MPI3 and shared memory support	2016-10-20 16:58:01 +01:00
paboyle	c7cccaaa69	Comm vector for shmem	2016-10-20 16:57:31 +01:00
paboyle	cbcfea466f	MPI3	2016-10-20 16:57:14 +01:00
paboyle	4955672fc3	MPI3	2016-10-20 16:57:00 +01:00
paboyle	8c043da5b7	SHMEM and comms allocator made different	2016-10-20 16:56:05 +01:00
paboyle	3cbe974eb4	Layout	2016-10-20 16:55:21 +01:00
Antonin Portelli	997fd882ff	Merge branch 'develop' into feature/feynman-rules # Conflicts: # lib/Threads.h # lib/qcd/action/fermion/WilsonFermion.cc # lib/qcd/action/fermion/WilsonFermion.h # lib/qcd/utils/SUn.h # lib/simd/Grid_avx.h # lib/simd/Intel512common.h	2016-10-19 18:35:18 +01:00
paboyle	7af9b87318	Cache face tables to improve performance. Extract merge now looking poor.	2016-10-18 09:51:37 +01:00
paboyle	811ca45473	GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support	2016-10-17 16:23:21 +01:00
paboyle	bc1a4d40ba	Faster integer handling avoid push_back	2016-10-17 16:16:44 +01:00
paboyle	c8079e6621	Time the face gather in x-dir more carefully	2016-10-13 22:28:50 +01:00
azusayamaguchi	8b0d171c9a	32bit issue on the KNL code variant where byte offsets were stored	2016-10-12 17:49:32 +01:00
azusayamaguchi	8bbd9ebc27	Reversing changes to Stencil class	2016-10-12 13:47:20 +01:00
azusayamaguchi	6472b431f0	__rdpmc needed for gcc, clang++	2016-10-12 12:29:08 +01:00
azusayamaguchi	bd205a3293	Fixing for non x86 and non KNL	2016-10-12 12:09:15 +01:00
azusayamaguchi	496beffa88	Fix non-KNL build	2016-10-12 12:06:08 +01:00
azusayamaguchi	9b63e97108	align not absolutely required and confuses clang++	2016-10-12 11:51:21 +01:00
azusayamaguchi	81f2aeaece	KNL streaming stores, and KNL performance counters	2016-10-12 11:45:22 +01:00
paboyle	2d4a45c758	Typecast pointer	2016-10-12 09:14:15 +01:00
paboyle	a123dcd7e9	Static required for shmem. Reading same object twice requires csum reset	2016-10-12 00:29:57 +01:00
paboyle	6b27c42dfe	Cosmetic	2016-10-12 00:29:39 +01:00
paboyle	f7c2aa3ba5	runtime by default	2016-10-12 00:29:13 +01:00
paboyle	7240d73184	Parallelise the x faces; fix the segv on KNL with comms	2016-10-11 22:21:07 +01:00
paboyle	42cd148f5e	Base pointer for comms buffer under AVX512 assembly	2016-10-11 16:06:06 +01:00
paboyle	6e01264bb7	don't use static by default	2016-10-11 10:03:39 +01:00
paboyle	6f408256bc	FMA4 option moved on the align	2016-10-11 10:03:01 +01:00
paboyle	8d11681aac	verbose remove	2016-10-10 23:50:42 +01:00
paboyle	3d5c9a1ee9	No compile fix on clang++ 3.9	2016-10-10 23:50:13 +01:00
paboyle	dc389e467c	axpy_ssp for any coeff type via template	2016-10-10 23:48:05 +01:00
paboyle	3619167d62	Mass parameter	2016-10-10 23:47:33 +01:00
paboyle	96f1d1b828	Debugged Domain wall and Overlap feynman rules (infinite Ls, finite mass).	2016-10-10 23:46:45 +01:00
paboyle	657e0a8f4d	Mass parameter	2016-10-10 23:46:10 +01:00
paboyle	616e7cd83e	Mass parameter	2016-10-10 23:45:48 +01:00
paboyle	6f26d2e8d4	Overlap tree level feynman rule	2016-10-10 23:45:18 +01:00
paboyle	c014574504	A "please implement me" feynman rule. If this were abstract virtual it would require/force implementation	2016-10-10 23:44:00 +01:00
paboyle	d7ce164e6e	Feynman rule for DWF	2016-10-10 23:43:36 +01:00
paboyle	c0d5b99016	Dminus	2016-10-10 23:43:19 +01:00
paboyle	09ca32d678	Dminus added for Cayley	2016-10-10 23:42:55 +01:00
paboyle	082ae350c6	static schedule by default	2016-10-10 23:42:30 +01:00
Guido Cossu	611b5d74ba	Fix for AVX+FMA3 compilation	2016-10-10 15:26:17 +01:00
Guido Cossu	b56c9ffa52	Fix for AVXFMA	2016-10-10 14:43:37 +01:00
Guido Cossu	2e453dfbf5	Added some instrumentation to benchmark the force computation	2016-10-06 17:52:45 +01:00
paboyle	4089984431	Timing hooks	2016-10-06 09:25:12 +01:00

1416 Commits