Questions tagged [altivec]

AltiVec is a floating-point and integer SIMD instruction set designed and owned by Apple, IBM and Freescale Semiconductor (formerly the Semiconductor Products Sector of Motorola), collectively the AIM alliance. It is implemented on versions of the PowerPC, including Motorola's G4, IBM's G5 and POWER6 processors, and P.A. Semi's PWRficient PA6T.

52 questions
2 votes, 1 answer

AltiVec vec_msum equivalent for float values

Is anybody aware of a method to achieve vec_msum functionality against a vector of float values? I'm quite new to SIMD, and although I think I'm starting to make sense of it - there are still a few puzzles. My end goal is to rewrite the function…
Tim Kane
2 votes, 1 answer

Altivec: analogue of _mm_sad_epu8()

I am trying to port an SSE function that gets the absolute difference of two arrays of 8-bit unsigned integers. It looks like: uint64_t AbsDiffSum(const uint8_t * a, const uint8_t * b, size_t size) { assert(size%16 == 0); __m128i _sum =…
Georg
2 votes, 2 answers

On PowerPC, is there any equivalent of Intel's movemask intrinsics?

I'd like to merge all elements of a __vector bool long long into a single int, in which each bit is set to the most significant bit of an element of the input vector. Example: __vector bool long long vcmp = vec_cmplt(a, b); int packedmask = /*SOME FUNCTION GOES…
Regis Portalez
2 votes, 3 answers

Altivec -- load of const variable

What is the best way to load from a const pointer using AltiVec? According to the documentation (and my results) vec_ld doesn't take a const pointer as an…
user1829358
2 votes, 1 answer

Is there a masked-blend instruction on PowerPC?

I'm trying to perform a masked blend (on __vector types) on a PowerPC (POWER8). Looking at the intrinsics (list available here), I can see a vector select, but nothing for the merge. On x86 processors I know the intrinsic _mm256_blendv_ps, and…
Regis Portalez
2 votes, 1 answer

x264 library speed - AltiVec vs SSE4

I have a simple, cheap dual-core 3 GHz Intel machine running Debian, and access to a super-expensive POWER7 machine running AIX. After a few days of struggle, I compiled libx264 and tested it on both computers: GCC: library x264 on Intel (with SSE2 capabilities) and GCC on 16 core…
Asain Kujovic
2 votes, 2 answers

Avoiding invalid memory load with SIMD instructions

I am loading elements from memory using SIMD load instructions, let's say using AltiVec, assuming aligned addresses: float X[SIZE]; vector float V0; unsigned FLOAT_VEC_SIZE = sizeof(vector float); for (int load_index = 0; load_index < SIZE;…
fsheikh
1 vote, 0 answers

How to add an immediate to a VSX register in the OpenPOWER ISA?

I want to increment the value of a VSX register by one, but there is no instruction in the OpenPOWER ISA to add an immediate to a VSX register. Does anybody have an idea?
shb8086
1 vote, 1 answer

SIMD extensions on Power: compiler flags and processor support

I am looking into porting a generic library of abstractions on top of SIMD to the Power architecture. However, the information about which extensions are supported on which Power processors, and how to compile for them, is confusing. At the moment only looking at 64…
Denis Yaroshevskiy
1 vote, 4 answers

Assigning same memory to class member variables using unions

I am trying to vectorize an existing Vector class: class Vector { public: float X,Y,Z; }; I want to vectorize the class members without affecting other classes that access these member variables: class Vector { public: union{ float…
Rinesh
1 vote, 1 answer

Error: matching constraint not valid in output operand

I'm having trouble getting the GCC inline assembler to accept some inline assembly for Power9. The regular assembly I am trying to get GCC to accept is darn 3, 1, where 3 is r3 and 1 is a parameter called L in the docs. It disassembles to this on…
jww
1 vote, 2 answers

Clang equivalent of GCC's __builtin_darn()

I'm trying to find Clang's equivalent of GCC's __builtin_darn() on Power9. Grepping the Clang 7.0 sources, it looks like LLVM supports it: llvm_source$ cat llvm/test/MC/PowerPC/ppc64-encoding.s | grep darn -B 1 -A 1 # CHECK-BE: darn 2, 3 …
jww
1 vote, 1 answer

How to initialize an AltiVec register from scalars without using compound literals

I have some code like this: void op(uint32_t B0, uint32_t B1, uint32_t B2, uint32_t B3) { auto v = (__vector unsigned int){B0, B1, B2, B3}; ... } When I compile it, GCC warns that "ISO C++ forbids compound-literals". Is there any other way to…
Jack Lloyd
1 vote, 1 answer

How to print a vector variable as its 128-bit VSX value?

I'm trying to track down an endianness issue when running on a PowerPC with Power8. Big endian is OK; little endian is having some trouble. Below, uint8x16_p8 is a typedef for __vector unsigned char. On a big endian machine I see: 1110 …
jww
1 vote, 1 answer

Power8 vsldoi built-in or replacement

I'm trying to port some ASM code into C/C++ using built-ins. The ASM code has: + # Unpack a-h data from the packed vector to a vector register each + + vsldoi 10, 9, 9, 12 + vsldoi 11, 9, 9, 8 + vsldoi 12, 9, 9, 4 I can't find a built-in for…
jww