
I'm trying to port some ARM NEON code to AltiVec. Our NEON code has two loads, one rotate, one XOR, and a store, so it seems like a simple test case. According to IBM's vec_rl documentation:

Each element of the result is obtained by rotating the corresponding element of a left by the number of bits specified by the corresponding element of b.

The docs go on to say vector unsigned int is the largest data type unless -qarch=power8, in which case vector unsigned long long applies.

I'd like to perform a 128-bit rotate, and not 32-bit or 64-bit rotation of individual elements. The bit positions are 19, 31, 67, 97, and 109. They are not byte aligned. (The constants arise from the ARIA block cipher).

Are 4x32 and 2x64 the largest AltiVec data arrangements? Is it possible to rotate a 128-bit value in AltiVec?

If the packed rotate is the only operation available, is it best practice to do the bit twiddling in C or in AltiVec?

jww

1 Answer


You can rotate by a multiple of 8 bits using vsldoi (vec_sld), passing the same vector as both operands. To handle any remaining rotation of fewer than 8 bits, you'll probably need vsl + vsr + vsel (vec_sll + vec_srl + vec_sel).
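As a rough illustration of that decomposition, here is a portable scalar model in C. It is not AltiVec code and has not been run on POWER hardware. It only shows the arithmetic: split the rotate count into whole bytes plus 0 to 7 leftover bits, rotate by bytes first, then patch up the leftover bits. The names `u128`, `rotl_bytes`, `rotl_bits` and `rotl128` are hypothetical, and the byte order (byte 0 most significant) is an assumption chosen to match the big-endian register view.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit container; b[0] is the most significant byte
   (assumed big-endian view of a vector register). */
typedef struct { uint8_t b[16]; } u128;

/* Rotate left by whole bytes. This models what vec_sld(v, v, nbytes)
   does when both operands are the same vector. */
static u128 rotl_bytes(u128 v, unsigned nbytes)
{
    u128 r;
    for (unsigned i = 0; i < 16; i++)
        r.b[i] = v.b[(i + nbytes) % 16];
    return r;
}

/* Rotate left by 0..7 bits. Each result byte takes its own bits shifted
   left, plus the top bits of the next less significant byte. On AltiVec
   the two halves would come from a left shift and a right shift that are
   then merged (the vec_sll / vec_srl / vec_sel step, as an assumption). */
static u128 rotl_bits(u128 v, unsigned nbits)
{
    assert(nbits < 8);
    u128 r;
    for (unsigned i = 0; i < 16; i++) {
        unsigned hi = (unsigned)v.b[i] << nbits;
        unsigned lo = (unsigned)v.b[(i + 1) % 16] >> (8 - nbits);
        r.b[i] = (uint8_t)(hi | lo);
    }
    return r;
}

/* Full 128-bit rotate left by n bits, 0 <= n < 128. */
static u128 rotl128(u128 v, unsigned n)
{
    assert(n < 128);
    return rotl_bits(rotl_bytes(v, n / 8), n % 8);
}
```

For the ARIA constants in the question, the split works out to 19 = 2 bytes + 3 bits, 31 = 3 bytes + 7 bits, 67 = 8 bytes + 3 bits, 97 = 12 bytes + 1 bit, and 109 = 13 bytes + 5 bits, so `rotl128(v, 19)` is `rotl_bits(rotl_bytes(v, 2), 3)`.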

Paul R
  • Thanks Paul. I'm having trouble finding information on `vec_sll` and `vec_srl`. They are not documented at IBM's [Vector built-in functions](https://www.ibm.com/support/knowledgecenter/en/SSGH2K_13.1.2/com.ibm.xlc131.aix.doc/compiler_ref/vec_intrin_cpp.html). Would you be able to share more information? – jww Sep 02 '17 at 18:59
  • On a mobile device just now, but Google "AltiVec PIM" and you should see a PDF in the top few hits which documents all the intrinsics etc. The companion manual is "AltiVec PEM" which documents the actual instructions. – Paul R Sep 02 '17 at 20:15
  • Note: your question is tagged `PowerPC` - are you actually working with PowerPC/AltiVec, or is it IBM POWER/VMX (similar but different)? – Paul R Sep 02 '17 at 20:18
  • Thanks Paul. I'm working on [GCC112 from the compile farm](https://gcc.gnu.org/wiki/CompileFarm), which is described as IBM POWER8. – jww Sep 02 '17 at 20:24
  • Right, but what hardware platform are you targeting? POWER or PowerPC? – Paul R Sep 02 '17 at 20:25
  • Thanks again Paul. The command line I am using is `/opt/cfarm/gcc-latest/bin/g++ -g3 -O1 -m64 -maltivec -mabi=altivec -mcrypto -pthread -c rijndael-simd.cpp` (where rijndael is a similar problem). My apologies if I should be further along. I have not found a good article or tutorial explaining things, so I'm cobbling things together. As I understand things, both AltiVec and IBM offer the extensions under different trade names. They should be the same instructions. Please correct me if I am wrong. – jww Sep 02 '17 at 20:55
  • OK - well, if you're targeting PowerPC/AltiVec then those 128-bit shift intrinsics should be available in altivec.h. Did you find the AltiVec PIM PDF? – Paul R Sep 02 '17 at 20:57