1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-11-10 15:55:37 +00:00
Commit Graph

1634 Commits

Author SHA1 Message Date
paboyle
ba427abde9 simd 5d 2016-04-19 15:38:39 -07:00
paboyle
9b6ab6db16 simd in 5th dimension support 2016-04-19 15:38:01 -07:00
paboyle
806a83d38b simd in fifth dim support for dwf 2016-04-19 15:36:19 -07:00
paboyle
7223753355 Rotate in a direction > 2 for simd_layout 2016-04-19 15:35:15 -07:00
paboyle
b27bac4669 Updates for simd in one dir 2016-04-19 15:34:10 -07:00
paboyle
c8a93d6a93 Cartesian changes to allow all simd in one direction 2016-04-19 15:18:12 -07:00
paboyle
04072a5e1f Rotate is a temporary hack. Would like to merge ALL
permutes as rotates of length 2, and make any rotate active
over any subset of lane bits. This is hard, and requires general
permute; current intrinsics mean this is only really possible for specific
case by case encodings as presently performed. Intel could produce a general
permute.. would help. IBM did it in VMX.
2016-04-19 15:15:34 -07:00
paboyle
574ea4f843 const safety 2016-04-19 15:15:11 -07:00
paboyle
f2ae9682ff Remove some timing hacks 2016-04-19 15:14:32 -07:00
paboyle
587f80cd93 Updated to compile and pass under intel SDE 2016-04-19 15:13:54 -07:00
paboyle
528eb773ad Merged.
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-19 22:24:34 +01:00
paboyle
e5657510b0 Rotate support for Ls simd-ized 2016-04-19 22:24:18 +01:00
paboyle
f473919526 Rotate support 2016-04-19 22:23:51 +01:00
Peter Boyle
8f1b0afc2a Merge pull request #28 from aportelli/master
Build system fix
2016-04-16 09:55:45 +01:00
Peter Boyle
1494b0f397 Merge pull request #29 from giltirn/master
Grid_empty implementation and Lanzcos checkerboard fix
2016-04-16 09:55:24 +01:00
Christopher Kelly
ab56ccdd25 -Complete and working implementation of Grid_empty 2016-04-15 13:17:42 -04:00
cf2f69812b build system fix 2016-04-14 15:13:55 +01:00
paboyle
c323425496 Small change 2016-04-11 10:38:43 +01:00
Christopher Kelly
a646260e82 Merge remote-tracking branch 'origin/master' into ckelly-dec12-2015 2016-04-06 13:57:28 -04:00
Christopher Kelly
af9c8d1372 -Checkerboard fixes for Lanczos 2016-04-06 13:50:56 -04:00
paboyle
650e02b344 Smaller vols too 2016-04-06 06:52:09 -07:00
paboyle
a524ca2a4b New benchmark update 2016-04-06 03:35:56 -07:00
paboyle
23a7176b71 Loop over volumes 2016-04-06 03:22:11 -07:00
paboyle
b1192a8908 Benchmark_zmm added 2016-04-06 03:00:07 -07:00
paboyle
e8dddb1596 Adding extra benchmark 2016-04-06 10:32:54 +01:00
paboyle
c7ba47bdc7 Merge branch 'master' of https://github.com/paboyle/Grid 2016-04-06 02:56:28 +01:00
paboyle
e67fc2be18 Adding a trial for openmp overhead minimisation 2016-03-31 16:00:37 +01:00
paboyle
f473ef7591 Fixing the compile 2016-03-31 07:47:42 -07:00
paboyle
f7b1060aed Use headers to clear macros and sub precision 2016-03-31 14:52:37 +01:00
paboyle
8052556275 Cleaning up the single/double kernel implementation switch 2016-03-31 14:51:32 +01:00
paboyle
60d965f79e AVX512 improvements; sigfpe trapping too 2016-03-30 08:42:34 +01:00
paboyle
83b15bfcdd Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign 2016-03-30 08:39:39 +01:00
paboyle
1ecbf9794d Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-30 08:37:55 +01:00
paboyle
2ded354403 configure 2016-03-30 00:17:43 -07:00
paboyle
340428a1fe Eigen fixes and HDCR work 2016-03-30 00:16:02 -07:00
paboyle
c77b7ee897 AddSub based alternate SU3 routine 2016-03-28 17:55:22 -06:00
paboyle
b6c3bc574b Moving to a more coherent organisation of the inline assembly and arch dependencies. 2016-03-28 16:24:37 +01:00
paboyle
1e355a51e1 Interface change 2016-03-27 23:46:55 -07:00
paboyle
ad80f61fba AVX512 shaken out 2016-03-28 00:38:05 -06:00
paboyle
61469252fe AVX512 shaken out under SDE 2016-03-28 00:37:12 -06:00
paboyle
02198ac5b5 Tolerance and more coverage 2016-03-28 00:36:17 -06:00
paboyle
21abaf7e91 Gamma sign change 2016-03-28 00:35:45 -06:00
paboyle
165bffc2e7 Avx512 changes for assembler kernels 2016-03-26 22:25:45 -06:00
paboyle
644fd6d32e Build avx512 clean 2016-03-25 09:35:33 -07:00
azusa
f54e0ec9bd Try lanczos to set up hdcr subspace 2016-03-17 10:36:16 +00:00
paboyle
a155a362da Update from HDCR tuning 2016-03-16 02:31:04 -07:00
paboyle
60d4564151 ICC no compile fix 2016-03-16 02:30:40 -07:00
paboyle
d4e57f4bc6 IO Bandwidth reporting 2016-03-16 02:30:16 -07:00
paboyle
3920b2c0ab HDCR updates 2016-03-16 02:29:58 -07:00
paboyle
2733c4b93c hdcr updates 2016-03-16 02:29:37 -07:00