Please forgive my ignorance with the question below. We are supporting GCC 4.8 (and above) and IBM XL C/C++ 12 (and above). We are also supporting big- and little-endian targets on AIX and Linux. Handling that mix of compilers and platforms has made the code fairly messy.

We want to load the constant 1 into a VSX register. This is the code we've been able to craft, but it seems wrong because it's so complex. The macros XLC_VERSION, GCC_VERSION and LITTLE_ENDIAN have their customary meanings, so the additional preprocessor macros that lead to them have been omitted.

typedef __vector unsigned char      uint8x16_p8;
typedef __vector unsigned long long uint64x2_p8;

#if defined(XLC_VERSION)
    typedef uint8x16_p8 VectorType;
#elif defined(GCC_VERSION)
    typedef uint64x2_p8 VectorType;
#endif

#if defined(LITTLE_ENDIAN)
    const VectorType one = {1};
#else
    const VectorType one = (VectorType)((uint64x2_p8){0,1});
#endif

What's not apparent is, XL C/C++ supports all data arrangements and has a rich API set. The IBM compiler is a breeze to work with (when it's not producing warnings and errors that are hard to understand).

GCC going back to 4.8 only supports the 64x2 arrangement and it only has a subset of the APIs. For example, GCC is missing IBM APIs for the 8x16 arrangement and GCC does not have vec_reve (which would make endian reversal easy).

What I really want to do is something like this and have it "just work" everywhere, but it fails to compile:

VectorType one = 1;

Is there a less complex way to load a small constant into a vector register?

jww
  • oh, are you trying to load 1 into each vector slot? or only one? – Jeremy Kerr Sep 13 '17 at 02:27
  • @Jeremy - Just one; not one splat'd. The problem I am having is, we have to handle the endian conversions manually. See, for example, [`rijndael.cpp` : 1152](https://github.com/weidai11/cryptopp/blob/master/rijndael-simd.cpp#L1152). I'd like something less complex and more intuitive, like the statement that does not compile. – jww Sep 13 '17 at 03:55
  • @Jeremy - Believe it or not, I spent 6 hours tracking down a single failed self test in AES/CTR mode. It was because I wanted `VectorType one = 1;` so I wrote `VectorType one = {1};`. It worked fine on a little endian Linux machine, but failed on a big endian AIX machine. Now I want to add a control that avoids the problem. This question is trying to find the control I can place so the problem does not surface in the future. – jww Sep 13 '17 at 04:14
  • Don't laugh at the value one. It was the real value that caused the problem. And don't laugh about the six hours, either. We've got half-working debuggers on AIX and Linux. And AIX seems to be completely missing some tools, like a disassembler, so I can examine the object files offline. It's a very miserable experience. – jww Sep 13 '17 at 04:14
  • So are you looking to treat the entire vector as a 128-bit value, or are you just needing a specific slot initialised? – Jeremy Kerr Sep 13 '17 at 04:54
  • @jww Regarding disassemblers on AIX: The XL compilers can generate assembly files instead of object files via the [-S](https://www.ibm.com/support/knowledgecenter/SSGH3R_13.1.0/com.ibm.xlcpp131.aix.doc/compiler_ref/opt_s_upper.html) option. You can also get pseudo assembly in a listing file (*.lst) via the -qlist option. Finally, the XL compilers ship with a `dis` tool in the `/opt/IBM/*/exe/` directory. For example, `dis a.out` creates `a.out.s`. – Rafik Zurob Sep 14 '17 at 02:57
  • @jww Thanks for your kind comments. Re: "What's not apparent is, XL C/C++ supports all data arrangements and has a rich API set.", if we could ask a favour, what could we improve in our documentation to help others realize this in the future? – Nicole Trudeau Sep 15 '17 at 21:59

2 Answers

Your example of

VectorType one = 1;

is trying to assign a scalar to a vector. Try using a vector instead. For a 16-char vector, this would be:

vector char one = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};

gcc-4.8 seems to compile this okay; I don't have an LE 4.8 handy, but it works for big-endian at least:

   0:   10 41 03 0c     vspltisb v2,1

LE with gcc-5 works fine too.

   0:   0c 03 41 10     vspltisb v2,1
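
Note that the initializer above fills all sixteen lanes, which is why both endiannesses produce the same `vspltisb`. The question's `{1}` only sets element 0, and where element 0 sits inside the 128-bit value depends on the byte order. Here is a minimal sketch that contrasts the two by looking at the memory image. It assumes GCC or Clang; `CountOneBytes`, `FirstOneByte`, `splat_one` and `lane0_one` are hypothetical names for illustration, and the generic `vector_size` types are only a stand-in so the sketch also builds off-POWER.

```c
#include <string.h>

/* Assumption: GCC or Clang. With VSX enabled the question's types are used;
   elsewhere generic 16-byte vectors stand in so the sketch still compiles. */
#if defined(__VSX__)
typedef __vector unsigned char      uint8x16_p8;
typedef __vector unsigned long long uint64x2_p8;
#else
typedef unsigned char      uint8x16_p8 __attribute__((vector_size(16)));
typedef unsigned long long uint64x2_p8 __attribute__((vector_size(16)));
#endif

/* Every lane is 1, so the result does not depend on lane order. */
static const uint8x16_p8 splat_one = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};

/* Only element 0 is 1; element 1 is implicitly zero. */
static const uint64x2_p8 lane0_one = {1};

/* Hypothetical helper: how many bytes of the 16-byte image equal 1. */
static int CountOneBytes(const void *vec)
{
    unsigned char b[16];
    int i, n = 0;
    memcpy(b, vec, sizeof b);
    for (i = 0; i < 16; ++i)
        n += (b[i] == 1);
    return n;
}

/* Hypothetical helper: offset of the first byte equal to 1, or -1 if none. */
static int FirstOneByte(const void *vec)
{
    unsigned char b[16];
    int i;
    memcpy(b, vec, sizeof b);
    for (i = 0; i < 16; ++i)
        if (b[i] == 1)
            return i;
    return -1;
}
```

`CountOneBytes(&splat_one)` is 16 everywhere. For `lane0_one`, the single 1 should land at offset 0 on a little-endian host (the least significant byte of the 128-bit value) and at offset 7 on a big-endian host (the low byte of the upper doubleword), which matches the AES/CTR failure described in the comments.
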
Jeremy Kerr

You might want to check the BCD_INIT example here. It uses a macro to reverse the vector initialization.
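
As a rough sketch of the same idea applied to the 64x2 arrangement from the question (this is not the IBM macro itself): the macro takes the high and low doublewords in a fixed order and emits them in whichever element order the target needs. `VEC128_INIT`, `SKETCH_LITTLE_ENDIAN`, `Low64` and `High64` are hypothetical names, and the byte-order test assumes the predefined macros that GCC and Clang provide; substitute your own `LITTLE_ENDIAN` macro as needed.

```c
#include <string.h>

/* Assumption: GCC or Clang. Off-POWER, a generic vector stands in. */
#if defined(__VSX__)
typedef __vector unsigned long long uint64x2_p8;
#else
typedef unsigned long long uint64x2_p8 __attribute__((vector_size(16)));
#endif

/* Assumption: the compiler predefines one of these byte-order macros. */
#if defined(__LITTLE_ENDIAN__) || \
    (defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__))
# define SKETCH_LITTLE_ENDIAN 1
#else
# define SKETCH_LITTLE_ENDIAN 0
#endif

/* Hypothetical macro: the caller always writes (high, low); the macro
   reverses the initializer on little-endian targets. */
#if SKETCH_LITTLE_ENDIAN
# define VEC128_INIT(hi, lo) {(lo), (hi)}
#else
# define VEC128_INIT(hi, lo) {(hi), (lo)}
#endif

/* The 128-bit value 1, written the same way for every target. */
static const uint64x2_p8 one = VEC128_INIT(0, 1);

/* Hypothetical helpers: read the low and high halves back from memory. */
static unsigned long long Low64(uint64x2_p8 v)
{
    unsigned long long w[2];
    memcpy(w, &v, sizeof w);
    return w[SKETCH_LITTLE_ENDIAN ? 0 : 1];
}

static unsigned long long High64(uint64x2_p8 v)
{
    unsigned long long w[2];
    memcpy(w, &v, sizeof w);
    return w[SKETCH_LITTLE_ENDIAN ? 1 : 0];
}
```

With this in place the endian `#if` lives in one spot, and a call site such as `VEC128_INIT(0, 1)` reads the same on AIX and on little-endian Linux.
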

Regarding vec_reve: It's syntactic sugar for vec_perm. You can implement it as an inline function in a header or as a library function and use it for compilers that don't have it.
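
A minimal sketch of such a replacement for the 8x16 arrangement, assuming GCC or Clang with `<altivec.h>`. `VecReverse` and `VecByte` are hypothetical names, not part of any compiler's API, and the non-AltiVec branch exists only so the sketch can be built and checked on other machines.

```c
#include <string.h>

#if defined(__ALTIVEC__)
# include <altivec.h>
typedef __vector unsigned char uint8x16_p8;

/* Hypothetical stand-in for vec_reve: result element i is input
   element 15 - i, selected through a constant permute mask. */
static inline uint8x16_p8 VecReverse(uint8x16_p8 v)
{
    const uint8x16_p8 mask = {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0};
    return vec_perm(v, v, mask);
}
#else
/* Assumption: off-POWER fallback with a generic 16-byte vector. */
typedef unsigned char uint8x16_p8 __attribute__((vector_size(16)));

static inline uint8x16_p8 VecReverse(uint8x16_p8 v)
{
    unsigned char in[16], out[16];
    uint8x16_p8 r;
    int i;
    memcpy(in, &v, sizeof in);
    for (i = 0; i < 16; ++i)
        out[i] = in[15 - i];   /* same element reversal, done in scalar code */
    memcpy(&r, out, sizeof out);
    return r;
}
#endif

/* Hypothetical helper: read element i of the vector through memory. */
static unsigned char VecByte(uint8x16_p8 v, int i)
{
    unsigned char b[16];
    memcpy(b, &v, sizeof b);
    return b[i];
}
```

Because `vec_perm` indexes by element number, the same mask should work on both big- and little-endian targets. Wider arrangements can reuse the idea with a mask that reverses whole elements rather than single bytes.
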

Rafik Zurob