Adding two __m128 types via Accelerate framework

Question

I need to add/mul/sub two __m128 (float) variables using the Accelerate framework, but I can't find a function to do that. All of the Accelerate framework functions I have found take an int__vector__ type instead of a float__vector__ type. I found a function for division, 'vdivf', but I need add/mul/sub too.

Can anyone tell me how to add/mul/sub two __m128 (float) variables using the Accelerate framework? I am looking for something like _mm_add_ps, _mm_sub_ps, and _mm_mul_ps, but using the Accelerate framework API.

Why do you feel you need to use the Accelerate framework for this? Why not just use the intrinsics directly? – Paul R, Apr 20 '12 at 19:47
I think that if Apple provides an API for MMX, SSE, etc., it is better to use that API. I need to support both the PPC and Intel processor families, and the Accelerate framework handles CPU instruction support automatically. In addition, if something changes in the future, I think I will need to make fewer changes if I use the Accelerate framework. – Lexandr, Apr 21 '12 at 06:07
OK, but you don't want to call Accelerate functions just for single vectors. That would be hopelessly inefficient and pointless: you need to process reasonably sized arrays, otherwise the function call overhead will wipe out any gains from using SIMD. – Paul R, Apr 21 '12 at 06:48
Thank you, Paul, but the problem is this: I have an existing C++ library for the Windows platform, and I need to use SSE2 with minimal code changes. This code uses `_mm_add_ps` and similar functions, and I need to replace those calls with their equivalents from the Accelerate framework, for the reasons mentioned above. Maybe you can tell me what to use instead? – Lexandr, Apr 21 '12 at 07:11
Thank you, Paul. I was hoping that I had missed something and just had not found those functions. – Lexandr, Apr 21 '12 at 07:56
No problem. I have converted that last comment to an answer now (see below) and expanded on it a little further. – Paul R, Apr 21 '12 at 07:58

Paul R · Answer 1 · 2012-04-21T08:12:37.910

The problem is that Accelerate is a higher-level API than the SSE2 intrinsics. SSE intrinsics map to single instructions which operate on one vector at a time. Accelerate provides functions which operate at a much larger granularity, typically on arrays of a reasonable size. To port your existing code you should just stick with SSE intrinsics, and if you really do need PowerPC support then you'll need to #ifdef the SSE code and write an equivalent AltiVec implementation for the ppc build. I doubt this will be worth the effort, however: Apple stopped selling PowerPC Macs around 7 years ago, so the market for PowerPC apps must be very small by now.
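The #ifdef approach described in this answer might look roughly like the sketch below. It is an illustration under assumptions, not tested porting code: the `vec4f` type and the `vec4f_*` helper names are hypothetical, the `__SSE__` and `__ALTIVEC__` macros are the ones GCC and Clang define (other compilers may need different checks), and the AltiVec branch has not been verified on PowerPC hardware.

```c
#include <string.h>

#if defined(__SSE__)
  /* Intel build: thin wrappers over the SSE intrinsics. */
  #include <xmmintrin.h>
  typedef __m128 vec4f;
  static inline vec4f vec4f_add(vec4f a, vec4f b) { return _mm_add_ps(a, b); }
  static inline vec4f vec4f_sub(vec4f a, vec4f b) { return _mm_sub_ps(a, b); }
  static inline vec4f vec4f_mul(vec4f a, vec4f b) { return _mm_mul_ps(a, b); }
#elif defined(__ALTIVEC__)
  /* PowerPC build: equivalent AltiVec operations. */
  #include <altivec.h>
  typedef vector float vec4f;
  static inline vec4f vec4f_add(vec4f a, vec4f b) { return vec_add(a, b); }
  static inline vec4f vec4f_sub(vec4f a, vec4f b) { return vec_sub(a, b); }
  /* Classic AltiVec has no plain float multiply, so use multiply-add with zero. */
  static inline vec4f vec4f_mul(vec4f a, vec4f b) {
      return vec_madd(a, b, (vector float)vec_splat_u32(0));
  }
#else
  /* Scalar fallback so the code still builds on other targets. */
  typedef struct { float f[4]; } vec4f;
  static inline vec4f vec4f_add(vec4f a, vec4f b) {
      for (int i = 0; i < 4; ++i) a.f[i] += b.f[i];
      return a;
  }
  static inline vec4f vec4f_sub(vec4f a, vec4f b) {
      for (int i = 0; i < 4; ++i) a.f[i] -= b.f[i];
      return a;
  }
  static inline vec4f vec4f_mul(vec4f a, vec4f b) {
      for (int i = 0; i < 4; ++i) a.f[i] *= b.f[i];
      return a;
  }
#endif

/* Load and store four floats via memcpy, which avoids alignment requirements. */
static inline vec4f vec4f_load(const float *p) {
    vec4f v;
    memcpy(&v, p, sizeof v);
    return v;
}

static inline void vec4f_store(float *p, vec4f v) {
    memcpy(p, &v, sizeof v);
}
```

Calling code would then use `vec4f_add` and friends everywhere, so only this one header differs between the Intel and ppc builds.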

score -1 · Accepted Answer · answered Apr 22 '12 at 04:13

You don't need an API for basic arithmetic:

__m128 x, y;
__m128 z = x + y;
__m128 w = x - y;
__m128 t = x * y;

An API would be totally unnecessary for these operations, so Accelerate doesn't have one.

That said, if you have existing code which uses the SSE intrinsics (_mm_add_ps, etc), and you're really trying to make "minimum code changes", why are you changing anything at all? The SSE intrinsics work just fine on OS X as well.
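The operator forms shown in this answer rely on the vector extensions in GCC and Clang, which accept `+`, `-` and `*` on `__m128`. As far as I know, MSVC does not accept them, so code shared with a Windows build would keep the intrinsics. The sketch below compares both forms lane by lane; the helper names `all_lanes_equal` and `operators_match_intrinsics` are hypothetical, and it assumes an x86 target with SSE enabled.

```c
#include <xmmintrin.h>

/* Returns 1 when every lane of u equals the matching lane of v. */
static int all_lanes_equal(__m128 u, __m128 v) {
    /* _mm_cmpeq_ps sets a lane to all ones where the inputs compare equal;
       _mm_movemask_ps gathers the four lane sign bits into the low 4 bits. */
    return _mm_movemask_ps(_mm_cmpeq_ps(u, v)) == 0xF;
}

/* Returns 1 if +, - and * on __m128 give the same lanes as
   _mm_add_ps, _mm_sub_ps and _mm_mul_ps for the given inputs. */
static int operators_match_intrinsics(const float *a, const float *b) {
    __m128 x = _mm_loadu_ps(a);  /* unaligned load of four floats */
    __m128 y = _mm_loadu_ps(b);
    return all_lanes_equal(x + y, _mm_add_ps(x, y))
        && all_lanes_equal(x - y, _mm_sub_ps(x, y))
        && all_lanes_equal(x * y, _mm_mul_ps(x, y));
}
```

Both inputs are arrays of four floats. Lanes holding NaN never compare equal, so the check is only meaningful for ordinary values.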

Stephen Canon

"You don't need an API for basic arithmetic" What do you mean? That these operations will automatically use SSE instructions? – Lexandr Apr 23 '12 at 08:15
@Lexandr: yes, they will. However, as I noted, if you already have working code that uses the intrinsics, *don't change it*. – Stephen Canon Apr 23 '12 at 13:55
