I have some SIMD code in AltiVec that processes 32-bit integer values in parallel. In some cases I want to load the integers as little-endian, in others as big-endian (note: this choice is independent of the native CPU endianness; it depends on which algorithm is running). Doing the actual byte swap is very easy using AltiVec's permute operations, as documented by Apple.
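For concreteness, the per-lane swap can be modeled in portable C roughly as below. This is only a sketch: `permute16`, `SWAP32_MASK` and `bswap32x4` are made-up names that imitate a single-source 16-byte permute. They are not AltiVec intrinsics, and the real code would use a vector permute with an equivalent control vector.

```c
#include <string.h>

/* Portable model of a single-source 16-byte permute: out[i] = in[mask[i]].
   A temporary buffer is used so that out and in may alias. */
static void permute16(unsigned char out[16], const unsigned char in[16],
                      const unsigned char mask[16])
{
    unsigned char tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = in[mask[i] & 0x0F];
    memcpy(out, tmp, 16);
}

/* Control bytes that reverse the byte order inside each of the
   four 32-bit lanes of a 16-byte block. */
static const unsigned char SWAP32_MASK[16] = {
    3, 2, 1, 0,  7, 6, 5, 4,  11, 10, 9, 8,  15, 14, 13, 12
};

/* Byte swap four packed 32-bit integers in place. */
static void bswap32x4(unsigned char block[16])
{
    permute16(block, block, SWAP32_MASK);
}
```

Applying `bswap32x4` twice restores the original block, since the mask is its own inverse.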

The part I'm worried about is that PowerPC allows either big- or little-endian operation, so I don't know whether I need to byte swap on little-endian loads/stores or on big-endian loads/stores. (Currently my code always swaps for little-endian memory ops and never swaps for big-endian ones, which works fine on the 970 I'm currently using, since of course it's running big-endian.)

From what I can find, PPCs in little-endian mode are relatively rare, but they do exist, and ideally I'd like to have my code work correctly and quickly regardless of mode.

Is there a way of handling big and little endian loads to AltiVec registers regardless of CPU endianness? Are there other issues related to this I should know about? Wikipedia has the (uncited, naturally) statement:

"AltiVec operations, despite being 128-bit, are treated as if they were 64-bit. This allows for compatibility with little-endian motherboards that were designed prior to AltiVec."

which makes me think there may be other nastiness specific to AltiVec in little-endian mode.

Jack Lloyd
  • I'm not familiar with AltiVec, but typically in code that must work with binary I/O, I try to start by programmatically detecting the byte order, then swapping if necessary. Most likely not the answer you were seeking. – JR Lawhorne Oct 28 '09 at 21:11
  • That is about the best I can come up with too, but apparently PPC can actually change endianness at runtime by setting or clearing a bit in an MSR. And doing an endian check before every load or store is, if nothing else, really ugly, and probably also kind of slow, so I'm hoping there is some better method. – Jack Lloyd Oct 29 '09 at 00:17

1 Answer

Pretty much all PowerPC code out there will assume big-endian and all ARM code out there will assume little endian.

There are a few specialized cases where endian-swapping is used — apparently VirtualPC relied on little endian mode and thus initially didn't work on the G5 (which doesn't include it) — but I wouldn't worry about these.

ARM has a similar problem in big-endian mode: doubles are mixed-endian. The "pseudo-endianness" is achieved by XORing the low-order address bits with 0x2 (for halfword accesses) and 0x3 (for byte accesses) so that the effective order within a 32-bit word is swapped, but this breaks for 64-bit accesses. I suspect the same trick is used on PowerPC except done 64 bits at a time.
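The address-XOR trick can be modeled in a few lines of portable C. This is a sketch of the idea only, not a description of any particular CPU: `store_le`, `munged_read_byte` and `munged_read_be` are hypothetical helpers, memory is assumed to hold true little-endian data, and the `0x7` mask stands in for the suspected 64-bit variant.

```c
#include <stddef.h>
#include <stdint.h>

/* Store the low nbytes of value into mem in true little-endian order. */
static void store_le(unsigned char *mem, uint64_t value, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        mem[i] = (unsigned char)(value >> (8 * i));
}

/* Byte read through address munging: the low address bits are XORed
   with xor_mask (0x3 for a 32-bit swap unit, 0x7 for a 64-bit one). */
static unsigned char munged_read_byte(const unsigned char *mem,
                                      size_t addr, size_t xor_mask)
{
    return mem[addr ^ xor_mask];
}

/* Read nbytes at ascending munged addresses and assemble them
   most-significant byte first, i.e. what a big-endian program would see. */
static uint64_t munged_read_be(const unsigned char *mem, size_t addr,
                               size_t nbytes, size_t xor_mask)
{
    uint64_t v = 0;
    for (size_t i = 0; i < nbytes; i++)
        v = (v << 8) | munged_read_byte(mem, addr + i, xor_mask);
    return v;
}
```

With `0x0102030405060708` stored little-endian, reading 8 bytes with mask `0x3` yields `0x0506070801020304`: each 32-bit word looks big-endian, but the two words are in the wrong order, which is the mixed-endian double described above. With mask `0x7` the same read yields `0x0102030405060708`.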

tc.
  • Things have changed a lot since '11; most POWER machines will now be running little-endian (and there'll still be those that are BE too). Fortunately, you know the endianness at compile time, so if you need to support both, you can do a little `BYTE_ORDER` testing in the preprocessor. – Jeremy Kerr Sep 13 '17 at 03:17
  • That's completely unfactual and incorrect. All AIX machines are still BE, and all IBM i machines are BE. BE Linux is growing in popularity again and multiple distributions are (re-)adding support for it. You can never assume endianness on POWER and LE is just one part of a much larger story. – A. Wilcox Aug 14 '18 at 05:08
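The compile-time test suggested in the comments might look something like the sketch below. It assumes a GCC- or Clang-style compiler that predefines `__BYTE_ORDER__`, `__ORDER_LITTLE_ENDIAN__` and `__ORDER_BIG_ENDIAN__`. `HOST_IS_LITTLE_ENDIAN`, `bswap32`, `load_le32` and `load_be32` are illustrative names, and the scalar loads stand in for whatever vector load plus permute the real code would use.

```c
#include <stdint.h>
#include <string.h>

/* Decide the host byte order once, at compile time. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#  define HOST_IS_LITTLE_ENDIAN 1
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#  define HOST_IS_LITTLE_ENDIAN 0
#else
#  error "unknown byte order; define HOST_IS_LITTLE_ENDIAN manually"
#endif

/* Reverse the four bytes of a 32-bit value. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Load a little-endian 32-bit value; swaps only on big-endian hosts. */
static uint32_t load_le32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* native-order load, alignment-safe */
    return HOST_IS_LITTLE_ENDIAN ? v : bswap32(v);
}

/* Load a big-endian 32-bit value; swaps only on little-endian hosts. */
static uint32_t load_be32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return HOST_IS_LITTLE_ENDIAN ? bswap32(v) : v;
}
```

Because the condition is a compile-time constant, the compiler can drop the unused branch, so there is no per-load endianness check. This does not cover a CPU whose endian mode is flipped at runtime.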