mirror of
https://github.com/paboyle/Grid.git
synced 2026-05-03 17:04:12 +01:00
04072a5e1f
permutes as rotates of length 2, and make any rotate active over any subset of lane bits. This is hard and requires a general permute; with current intrinsics it is only really possible through specific, case-by-case encodings, as presently performed. A general permute from Intel would help; IBM did it in VMX.
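
The idea can be sketched in scalar C++ under one reading of the note: a permute on one lane bit is a rotate of length 2 (lane i exchanges with lane i ^ (1 << bit)), and a rotate over a subset of lane bits rotates only the sub-index formed by those bits. The names `kLanes`, `Vec`, `general_permute`, `permute_bit` and `rotate_over_bits` are hypothetical and are not taken from Grid; the table-driven `general_permute` stands in for the hardware general permute the note asks for.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical lane count for illustration (e.g. 8 lanes in one SIMD register).
constexpr std::size_t kLanes = 8;
using Vec = std::array<int, kLanes>;
using Idx = std::array<std::size_t, kLanes>;

// General permute: out[i] = in[idx[i]] for an arbitrary index table.
Vec general_permute(const Vec& in, const Idx& idx) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    assert(idx[i] < kLanes);
    out[i] = in[idx[i]];
  }
  return out;
}

// Rotate of length 2 on a single lane bit: lane i takes lane i ^ (1 << bit).
Vec permute_bit(const Vec& in, unsigned bit) {
  assert((std::size_t{1} << bit) < kLanes);
  Idx idx{};
  for (std::size_t i = 0; i < kLanes; ++i) idx[i] = i ^ (std::size_t{1} << bit);
  return general_permute(in, idx);
}

// Rotate by r acting only on the lane bits selected by mask:
// the selected bits form a sub-index that is advanced by r (mod 2^n),
// while the unselected bits of the lane index are left unchanged.
Vec rotate_over_bits(const Vec& in, std::size_t mask, std::size_t r) {
  assert(mask < kLanes);
  std::array<unsigned, 8> pos{};  // positions of the set bits in mask
  unsigned n = 0;
  for (unsigned b = 0; (std::size_t{1} << b) < kLanes; ++b)
    if ((mask >> b) & 1) pos[n++] = b;
  const std::size_t len = std::size_t{1} << n;

  Idx idx{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    std::size_t sub = 0;  // gather the selected bits of i into a compact index
    for (unsigned k = 0; k < n; ++k) sub |= ((i >> pos[k]) & 1) << k;
    sub = (sub + r) % len;
    std::size_t src = i & ~mask;  // scatter the rotated sub-index back
    for (unsigned k = 0; k < n; ++k) src |= ((sub >> k) & 1) << pos[k];
    idx[i] = src;
  }
  return general_permute(in, idx);
}
```

With these definitions, `rotate_over_bits(v, 1 << b, 1)` gives the same result as `permute_bit(v, b)`, and a mask covering all lane bits gives an ordinary full-width rotate. A real SIMD version would replace the index loop with a single permute instruction taking `idx` as a runtime operand, which is the capability the note says is missing.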