portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-07-04 09:23:30 +01:00

Author	SHA1	Message	Date
Peter Boyle	f9b2fce93b	Changing whole stencil class to be template and not just single functions	2015-11-06 05:25:10 -06:00
Peter Boyle	473fa28a6c	Partial optimisation; comms in x-dir for red black dslash will be slow as the checker skipping block strided loops are non threadable. Will need to write a kernel for these instead and drive them with a lookup table to make a look sufficiently simple to thread.	2015-11-06 05:23:23 -06:00
Peter Boyle	5d854c869c	Stencil interface changes	2015-11-06 05:22:33 -06:00
Peter Boyle	880ff88362	Comms optimisation	2015-11-06 05:22:18 -06:00
Azusa Yamaguchi	4690acc3c8	Don't know why peter committed these as they didn't compile	2015-11-06 10:31:48 +00:00
Azusa Yamaguchi	3281745fde	Exec info and linux check to stop non-portable code breaking	2015-11-06 10:31:24 +00:00
paboyle	1159de165c	Asm option for AVX512	2015-11-05 22:04:51 -08:00
paboyle	16c7993434	Merge branch 'master' of github.com:paboyle/Grid Conflicts: lib/simd/Grid_avx512.h lib/simd/Grid_imci.h	2015-11-04 03:32:10 -08:00
paboyle	6be9716e6f	New file	2015-11-04 03:26:28 -08:00
paboyle	4a41c885ed	Use Linux kernel interface to hardware performance counters. Dead useful.	2015-11-04 03:24:19 -08:00
paboyle	757b31ed42	Threading for KNC mods.	2015-11-04 03:22:14 -08:00
paboyle	ac7d1f26ad	Either blocking or lebesgue curve	2015-11-04 03:19:16 -08:00
paboyle	1a8bf938b3	Use either sub-blocking or lebesgue	2015-11-04 03:18:51 -08:00
paboyle	63a2993827	Exec info an cache blocking	2015-11-04 03:16:56 -08:00
paboyle	4e65ad21ac	Adding a routine for AVX512 / IMCI with explicit assembly implementations	2015-11-04 03:15:08 -08:00
Peter Boyle	dfc1de6f60	Merge branch 'master' of github.com:paboyle/Grid	2015-11-04 05:14:26 -06:00
Peter Boyle	3b7576ad53	Switch off for now	2015-11-04 05:13:29 -06:00
paboyle	9b5d31ffc1	mac , mult routines Lines# with '#' will be ignored, and an empty message aborts the commit.	2015-11-04 03:10:34 -08:00
paboyle	a38762159c	Inline assembly hooks for AVX 512. Better way in some ways than BAGEL to generate assembly. Updated Grid_avx512.h	2015-11-04 03:09:06 -08:00
Peter Boyle	ffc5dab17f	AMD FMA4 support added for Interlagos/BlueWaters	2015-11-04 04:29:58 -06:00
Peter Boyle	96608c70d1	chrono causing some problems on Cray systems. Suspend use for now	2015-11-04 04:28:31 -06:00
Peter Boyle	d35d63b171	Algorithm in	2015-11-04 04:27:44 -06:00
Peter Boyle	24044dbc56	Debugged a problem with checkerboarded cshift in the checker dimension which arose only when mpi spread out in the checker dimension. Added a test that trapped and helped debug this	2015-11-04 10:00:55 +00:00
Peter Boyle	abb23df83f	formatting only	2015-11-04 10:00:27 +00:00
Peter Boyle	12c5ec813c	Useful debug messages (commented out) are included for preservation in case I need to revisit this	2015-11-04 09:59:27 +00:00
Peter Boyle	1271508ca2	Bug fix for spread out in x (EO) direction. This is really annoying -- it is very hard to thread the loops with the index recursion on buffer offset in the red-black case. Must think of a good threading solution here.	2015-11-04 09:57:57 +00:00
Peter Boyle	ec5af35166	EO bug fix when spread out in x-direction	2015-11-04 09:56:58 +00:00
Peter Boyle	0f59356e86	Problem in comms fixed	2015-11-02 00:00:15 +00:00
portelli	8709117aea	Log: generalised Logger class to allow separate logs in Grid-based applications	2015-10-27 17:31:13 +00:00
portelli	e6b9aa9076	Config.h removed form repository	2015-10-27 10:47:07 +00:00
Peter Boyle	8889af45ca	FMA4 added	2015-10-09 01:00:53 +02:00
Peter Boyle	83afb2e26a	Poly support for lanczos	2015-10-09 00:43:21 +02:00
Peter Boyle	6d06bd9493	Minor change in commented out code	2015-10-09 00:42:21 +02:00
Peter Boyle	6ee23f409e	Lanczos addition	2015-10-09 00:41:00 +02:00
Peter Boyle	2d95dac6b6	Lanczos untested/partially tested additions. In middle of shake out but at least compiles	2015-10-09 00:40:25 +02:00
Peter Boyle	814c79f38d	SIMD improvements for mac and madd use in complex for avx, sse	2015-10-09 00:38:52 +02:00
paboyle	1878bf97d0	Babbage fix	2015-09-30 16:04:01 -07:00
paboyle	a660ce716b	No compile babbage fix	2015-09-30 16:02:44 -07:00
paboyle	f4b6d1dfea	NGO stores reenabled	2015-09-30 16:02:14 -07:00
paboyle	23813ac798	No compile on babbage fix	2015-09-30 16:01:28 -07:00
Peter Boyle	9f4f65cb46	Added a decoupled memory system benchmark to remove thread synch overhead	2015-09-26 18:23:57 -07:00
Peter Boyle	64d64d1ab6	Updating to modify non-inlining permute routines and hopefully get better reg use and enhance performance.	2015-09-25 08:55:04 -07:00
Peter Boyle	5ef42add2d	Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly and drop swizzles in AVX512. Don't know why these compiled.	2015-09-23 05:23:45 -07:00
Peter Boyle	2f38ebc446	Reintroducing the hand unrolled loops	2015-09-08 17:45:30 +01:00
Peter Boyle	638d6675ee	Tested rms dH is ~ dt^4 numerically, so believe the ForceGradient is correct now. Paranoia makes me want to diddle with the FG step to ensure dt^2 reappears.	2015-08-31 16:33:20 +01:00
Peter Boyle	357c6ab46d	Reunitarise. Complete the HMC and integrator changes.	2015-08-31 16:32:04 +01:00
Peter Boyle	755dca9533	Added ForceGradient integrator. dH dropped so seems to work. Will only believe it is right once I have pulled a dt^4 error scaling plot out.	2015-08-31 06:23:02 +01:00
Peter Boyle	29fd004d54	Unified integrator and integrator algorithm into virtual class used as a policy for the HMC.	2015-08-30 13:39:19 +01:00
Peter Boyle	aa52fdadcc	Global edit on HMC sector -- making GaugeField a template parameter and preparing to pass integrator, smearing, bc's as policy classes to hmc. Propose to unify "integrator" and integrator algorithm in a base/derived way to override step. Want to read through ForceGradient to ensure that abstraction covers the force gradient case.	2015-08-30 12:18:34 +01:00
Peter Boyle	76d752585b	Started a tidy up in the HMC sector. Now comfortable with the two level integrators; to a little figure out what Guido had done & why -- but there is a neat saving of force evaluations across the nesting time boundary making use of linearity of the leapP in dt. I cleaned up the printing, reduced the volume of code, in the process sharing printing between all integrators. Placed an assert that the total integration time for all integrators must match at end of trajectory. Have now verified e-dH = 1 for nested integrators in Wilson/Wilson runs with both Omelyan and with Leapfrog so substantial confidence gained.	2015-08-29 17:18:43 +01:00

1115 Commits