mirror of https://github.com/paboyle/Grid.git synced 2026-04-20 02:31:01 +01:00
Commit Graph

245 Commits

Author SHA1 Message Date
paboyle 6d58cb2a68 Enable reordering of the loops in the assembler for cache friendliness.
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
paboyle 87418e7df1 Slightly faster prefetching perf. 2016-06-13 02:32:52 -07:00
paboyle 55f65b81b5 Improvements to the assembler interface that let us move chunks of the
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
Azusa Yamaguchi d9408893b3 Prefetching in the normal kernel implementation. 2016-06-08 05:43:48 -07:00
paboyle 139cc5f1ae Large change with KNL preparation 2016-06-03 03:24:26 -07:00
portelli 9d5f693cbe empty SIMD fix 2016-05-24 10:56:27 +01:00
portelli 91e04056f9 fix of the empty SIMD 2016-05-12 19:24:10 +01:00
paboyle c23375cd65 Testing Travis CI integration 2016-04-30 06:30:56 -07:00
paboyle c79ea0dcef Fixing IMCI 2016-04-22 21:52:54 -07:00
paboyle e3f141f82f Fixed SSE compile with typecasts 2016-04-22 10:30:30 -07:00
paboyle a6dfa2386b GCC choked on intrinsics calls that ICPC did not 2016-04-22 06:33:41 -07:00
paboyle 587f80cd93 Updated to compile and pass under Intel SDE 2016-04-19 15:13:54 -07:00
paboyle 528eb773ad Merged.
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-19 22:24:34 +01:00
paboyle e5657510b0 Rotate support for Ls simd-ized 2016-04-19 22:24:18 +01:00
paboyle f473919526 Rotate support 2016-04-19 22:23:51 +01:00
Christopher Kelly ab56ccdd25 -Complete and working implementation of Grid_empty 2016-04-15 13:17:42 -04:00
paboyle f473ef7591 Fixing the compile 2016-03-31 07:47:42 -07:00
paboyle 8052556275 Cleaning up the single/double kernel implementation switch 2016-03-31 14:51:32 +01:00
paboyle 83b15bfcdd Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign 2016-03-30 08:39:39 +01:00
paboyle c77b7ee897 AddSub based alternate SU3 routine 2016-03-28 17:55:22 -06:00
paboyle b6c3bc574b Moving to a more coherent organisation of the inline assembly and arch dependencies. 2016-03-28 16:24:37 +01:00
paboyle ad80f61fba AVX512 shaken out 2016-03-28 00:38:05 -06:00
paboyle 165bffc2e7 Avx512 changes for assembler kernels 2016-03-26 22:25:45 -06:00
paboyle 644fd6d32e Build avx512 clean 2016-03-25 09:35:33 -07:00
coppolachan 2d8bb356e3 Smearing routines compile (still untested) 2016-02-25 02:43:59 +09:00
coppolachan a7251f28c7 Stout smearing compiles (untested) 2016-02-24 03:16:50 +09:00
Antonin Portelli 497e7e4c53 BG/Q compatibility fix 2016-02-23 15:57:38 +00:00
paboyle aae8bf31a7 Global edit adding copyright and license info to every source file. 2016-01-02 14:51:32 +00:00
Azusa Yamaguchi 24a5a81c53 SSE compile fix 2015-12-16 09:09:37 +00:00
paboyle 3ce10aa975 Fix a regression failure on Mobius; chroma regression added 2015-12-10 22:55:00 +00:00
paboyle fa01ae5980 integer divide 2015-11-28 17:00:34 -08:00
Azusa Yamaguchi 4690acc3c8 Don't know why Peter committed these as they didn't compile 2015-11-06 10:31:48 +00:00
paboyle 16c7993434 Merge branch 'master' of github.com:paboyle/Grid
Conflicts:
	lib/simd/Grid_avx512.h
	lib/simd/Grid_imci.h
2015-11-04 03:32:10 -08:00
Peter Boyle dfc1de6f60 Merge branch 'master' of github.com:paboyle/Grid 2015-11-04 05:14:26 -06:00
paboyle 9b5d31ffc1 mac, mult routines
2015-11-04 03:10:34 -08:00
paboyle a38762159c Inline assembly hooks for AVX 512. Better way in some ways than BAGEL to generate assembly.
Updated Grid_avx512.h
2015-11-04 03:09:06 -08:00
Peter Boyle ffc5dab17f AMD FMA4 support added for Interlagos/BlueWaters 2015-11-04 04:29:58 -06:00
Peter Boyle 814c79f38d SIMD improvements for mac and madd use in complex for avx, sse 2015-10-09 00:38:52 +02:00
paboyle f4b6d1dfea NGO stores reenabled 2015-09-30 16:02:14 -07:00
Peter Boyle 64d64d1ab6 Updating to modify non-inlining permute routines and hopefully get better reg use and
enhance performance.
2015-09-25 08:55:04 -07:00
Peter Boyle 5ef42add2d Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly
and drop swizzles in AVX512. Don't know why these compiled.
2015-09-23 05:23:45 -07:00
neo 490009745c Small change in the HMC interface.
Example of multiple levels in the WilsonFermion hmc test.

Merge remote-tracking branch 'upstream/master'

Conflicts:
	lib/qcd/hmc/HMC.h
	lib/qcd/hmc/integrators/Integrator.h
	lib/qcd/hmc/integrators/Integrator_algorithm.h
	tests/Test_simd.cc
2015-07-30 17:16:57 +09:00
Peter Boyle d9d4c5916a Elemental force term for Wilson dslash added and tests thereof passing.
Now need to construct pseudofermion two flavour, ratio, one flavour, ratio
action fragments.
2015-07-26 10:54:38 +09:00
neo 9adaeb061a More NEON functionalities 2015-07-21 11:52:15 +09:00
neo 0ffcdf6204 Debugged vector version of ProjectOnGroup 2015-07-06 02:24:58 +09:00
Peter Boyle 4deffd1ccb No compile fix 2015-07-02 02:03:09 +01:00
neo e31dfa79d1 Merge remote-tracking branch 'upstream/master' 2015-06-17 02:02:51 +09:00
neo 6e5db0b1da Corrected bug in integer multiplications for SSE4 and AVX2
Merge remote-tracking branch 'upstream/master'

Conflicts:
	tests/Make.inc
2015-06-16 23:34:45 +09:00
Azusa Yamaguchi 20fe866651 Critical bug fix of sin/cos typo 2015-06-16 14:17:45 +01:00
Azusa Yamaguchi 22c8185caa Binop assist and real/complex improvements 2015-06-14 00:59:07 +01:00