I need to add, multiply, and subtract two `__m128` (float) variables using the Accelerate framework, but I can't find functions to do that. All Accelerate framework functions take the `int__vector__` type instead of the `float__vector__` type. I found a function for division, `vdivf`, but I need to add, multiply, and subtract too.
Can anyone tell me how to add/mul/sub two `__m128` (float) variables using the Accelerate framework? Something like `_mm_add_ps`, `_mm_sub_ps`, `_mm_mul_ps`, but using the Accelerate framework API.

-
Why do you feel you need to use the Accelerate framework for this? Why not just use the intrinsics directly? – Paul R Apr 20 '12 at 19:47
-
I think that if Apple provides an API for using MMX, SSE, etc., it is better to use that API. I need to support both the PPC and Intel processor families, and the Accelerate framework will handle CPU instruction support automatically. In addition, if something changes in the future, I think using the Accelerate framework will mean fewer changes for me. – Lexandr Apr 21 '12 at 06:07
-
OK, but you don't want to call Accelerate functions just for single vectors - that would be hopelessly inefficient and pointless - you need to process reasonably sized arrays otherwise the function call overhead will wipe out any gains from using SIMD. – Paul R Apr 21 '12 at 06:48
-
Thank you, Paul, but the problem is this: I have an existing C++ library for the Windows platform, and I need to use SSE2 with minimum(!) code changes. The code uses `_mm_add_ps` and similar functions, and I need to replace those calls with equivalents from the Accelerate framework, for the reasons mentioned above. Maybe you can tell me what to use instead? – Lexandr Apr 21 '12 at 07:11
-
Thank you, Paul. I was hoping that I had missed something and just had not found those functions. – Lexandr Apr 21 '12 at 07:56
-
No problem - I have converted that last comment to an answer now (see below) and expanded on it a little further. – Paul R Apr 21 '12 at 07:58
2 Answers
The problem is that Accelerate is a higher-level API than the SSE2 intrinsics. SSE intrinsics map to single instructions which operate on one vector at a time. Accelerate provides functions which operate at a much larger granularity, typically on arrays of a reasonable size. To port your existing code you should just stick with SSE intrinsics, and if you really do need PowerPC support then you'll need to `#ifdef` the SSE code and write an equivalent AltiVec implementation for the ppc build. I doubt this will be worth the effort, however: Apple stopped selling PowerPC Macs around 7 years ago, so the market for PowerPC apps must be very small by now.
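As a rough illustration of the `#ifdef` approach described above, here is a minimal sketch in C. The `vec4f` type and the `vec4f_*` helper names are hypothetical and only for illustration; the code assumes a GCC- or Clang-style compiler that defines `__SSE__` or `__ALTIVEC__`. The AltiVec branch is untested here and relies on `vec_madd` with a zero addend for multiplication; the scalar branch is only a fallback so the file builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__SSE__)
#include <xmmintrin.h>
typedef __m128 vec4f; /* hypothetical portable alias */
static inline vec4f vec4f_add(vec4f a, vec4f b) { return _mm_add_ps(a, b); }
static inline vec4f vec4f_sub(vec4f a, vec4f b) { return _mm_sub_ps(a, b); }
static inline vec4f vec4f_mul(vec4f a, vec4f b) { return _mm_mul_ps(a, b); }
#elif defined(__ALTIVEC__)
#include <altivec.h>
typedef __vector float vec4f;
static inline vec4f vec4f_add(vec4f a, vec4f b) { return vec_add(a, b); }
static inline vec4f vec4f_sub(vec4f a, vec4f b) { return vec_sub(a, b); }
static inline vec4f vec4f_mul(vec4f a, vec4f b)
{
    /* Multiply-add with a zero addend stands in for a plain multiply. */
    const vec4f zero = (vec4f)vec_splat_u32(0);
    return vec_madd(a, b, zero);
}
#else
/* Scalar fallback so the code still compiles on other targets. */
typedef struct { float f[4]; } vec4f;
static inline vec4f vec4f_add(vec4f a, vec4f b)
{
    for (int i = 0; i < 4; ++i) a.f[i] += b.f[i];
    return a;
}
static inline vec4f vec4f_sub(vec4f a, vec4f b)
{
    for (int i = 0; i < 4; ++i) a.f[i] -= b.f[i];
    return a;
}
static inline vec4f vec4f_mul(vec4f a, vec4f b)
{
    for (int i = 0; i < 4; ++i) a.f[i] *= b.f[i];
    return a;
}
#endif

/* memcpy-based load/store avoids alignment requirements on every target. */
static inline vec4f vec4f_load(const float *p)
{
    vec4f v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

static inline void vec4f_store(float *p, vec4f v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

With wrappers like these, calling code would use `vec4f_add(x, y)` in place of `_mm_add_ps(x, y)`. That is already more of a change than simply keeping the intrinsics on Intel builds, which is part of why it may not be worth the effort.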

You don't need an API for basic arithmetic:
__m128 x, y;
__m128 z = x + y;
__m128 w = x - y;
__m128 t = x * y;
An API would be totally unnecessary for these operations, so Accelerate doesn't have one.
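To make the operator form concrete, here is a small sketch. It assumes GCC or Clang (the compilers used on OS X), where `__m128` is declared as a vector-extension type so `+`, `-`, and `*` work element-wise; as far as I know this does not carry over to MSVC. The helper name `add_sub_mul4` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h> /* __m128, _mm_loadu_ps, _mm_storeu_ps */

/* Hypothetical helper: element-wise sum, difference, and product of two
   4-float arrays, written with plain operators instead of _mm_*_ps calls. */
static void add_sub_mul4(const float *a, const float *b,
                         float *sum, float *diff, float *prod)
{
    assert(a != NULL && b != NULL);
    assert(sum != NULL && diff != NULL && prod != NULL);

    __m128 x = _mm_loadu_ps(a); /* unaligned loads: no alignment assumed */
    __m128 y = _mm_loadu_ps(b);

    _mm_storeu_ps(sum,  x + y); /* same result as _mm_add_ps(x, y) */
    _mm_storeu_ps(diff, x - y); /* same result as _mm_sub_ps(x, y) */
    _mm_storeu_ps(prod, x * y); /* same result as _mm_mul_ps(x, y) */
}
```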
That said, if you have existing code which uses the SSE intrinsics (`_mm_add_ps`, etc.), and you're really trying to make "minimum code changes", why are you changing anything at all? The SSE intrinsics work just fine on OS X as well.

-
"You don't need an API for basic arithmetic" What do you mean? That this operations will be automatically use SSE instructions? – Lexandr Apr 23 '12 at 08:15
-
@Lexandr: yes, they will. However, as I noted, if you already have working code that uses the intrinsics, *don't change it*. – Stephen Canon Apr 23 '12 at 13:55