I need to add, multiply, and subtract two `__m128` (float) variables using the Accelerate framework, but I can't find functions that do this. All the Accelerate framework functions I have found take an integer vector type instead of a float vector type. I found `vdivf` for division, but I need addition, multiplication, and subtraction too.

Can anyone tell me how to add/mul/sub two `__m128` (float) variables using the Accelerate framework? I am looking for something like `_mm_add_ps`, `_mm_sub_ps`, and `_mm_mul_ps`, but through the Accelerate framework API.

Lexandr
  • Why do you feel you need to use the Accelerate framework for this? Why not just use the intrinsics directly? – Paul R Apr 20 '12 at 19:47
  • I think that if Apple provides an API for using MMX, SSE, etc., it is better to use that API. I need to support both the PPC and Intel processor families, and the Accelerate framework will handle CPU instruction support automatically. In addition, if something changes in the future, I think I will need to make fewer changes if I use the Accelerate framework. – Lexandr Apr 21 '12 at 06:07
  • OK, but you don't want to call Accelerate functions just for single vectors - that would be hopelessly inefficient and pointless - you need to process reasonably sized arrays otherwise the function call overhead will wipe out any gains from using SIMD. – Paul R Apr 21 '12 at 06:48
  • Thank you, Paul, but the problem is this: I have an existing C++ library for the Windows platform that uses SSE2, and I need to port it with minimal code changes. The code uses `_mm_add_ps` and similar functions, and I need to replace those calls with equivalents from the Accelerate framework, for the reasons mentioned above. Maybe you can tell me what to use instead? – Lexandr Apr 21 '12 at 07:11
  • Thank you, Paul. I was hoping that I had missed something and just had not found those functions. – Lexandr Apr 21 '12 at 07:56
  • No problem - I have converted that last comment to an answer now (see below) and expanded on it a little further. – Paul R Apr 21 '12 at 07:58
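
To illustrate the granularity point raised in the comments above: Accelerate's vDSP routines operate on whole arrays rather than on a single `__m128`. The following is a minimal sketch, assuming macOS with Accelerate available. `add_arrays` is a hypothetical helper name, and the non-Apple branch is only a plain loop added so that the sketch compiles on other systems.

```c
#include <stddef.h>

#if defined(__APPLE__)
#include <Accelerate/Accelerate.h>
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n). */
static void add_arrays(const float *a, const float *b, float *out, size_t n)
{
#if defined(__APPLE__)
    /* vDSP_vadd processes the whole array in one call; the 1s are strides. */
    vDSP_vadd(a, 1, b, 1, out, 1, (vDSP_Length)n);
#else
    /* Plain fallback (an assumption for illustration, not part of Accelerate). */
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

On macOS this needs to be linked with `-framework Accelerate`. `vDSP_vmul` takes its parameters in the same layout, so an element-wise multiply would follow the same pattern.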

2 Answers

The problem is that Accelerate is a higher-level API than SSE2 intrinsics. SSE intrinsics map to single instructions that operate on one vector at a time. Accelerate provides functions that operate at a much larger granularity, typically on arrays of a reasonable size. To port your existing code you should just stick with SSE intrinsics, and if you really do need PowerPC support then you'll need to `#ifdef` the SSE code and write an equivalent AltiVec implementation for the ppc build. I doubt this will be worth the effort, however: Apple stopped selling PowerPC Macs around 7 years ago, so the market for PowerPC apps must be very small by now.
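
One way the `#ifdef` approach might look is sketched below. This is an illustration under stated assumptions, not the code from the library in question: `float4` and the `f4_*` names are hypothetical, and the `#else` branch is a scalar stand-in. A real ppc build would add an `#elif defined(__ALTIVEC__)` branch built on `vec_add`, `vec_sub`, and `vec_madd` (AltiVec has no standalone float multiply).

```c
#if defined(__SSE__)
#include <xmmintrin.h>

/* SSE build: thin wrappers over the intrinsics the code already uses. */
typedef __m128 float4;
static inline float4 f4_load(const float *p)      { return _mm_loadu_ps(p); }
static inline void   f4_store(float *p, float4 v) { _mm_storeu_ps(p, v); }
static inline float4 f4_add(float4 a, float4 b)   { return _mm_add_ps(a, b); }
static inline float4 f4_sub(float4 a, float4 b)   { return _mm_sub_ps(a, b); }
static inline float4 f4_mul(float4 a, float4 b)   { return _mm_mul_ps(a, b); }

#else

/* Scalar stand-in for builds without SSE (hypothetical fallback). */
typedef struct { float v[4]; } float4;
static inline float4 f4_load(const float *p)
{
    float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = p[i];
    return r;
}
static inline void f4_store(float *p, float4 x)
{
    for (int i = 0; i < 4; ++i) p[i] = x.v[i];
}
static inline float4 f4_add(float4 a, float4 b)
{
    for (int i = 0; i < 4; ++i) a.v[i] += b.v[i];
    return a;
}
static inline float4 f4_sub(float4 a, float4 b)
{
    for (int i = 0; i < 4; ++i) a.v[i] -= b.v[i];
    return a;
}
static inline float4 f4_mul(float4 a, float4 b)
{
    for (int i = 0; i < 4; ++i) a.v[i] *= b.v[i];
    return a;
}
#endif
```

Call sites would then use `f4_add(x, y)` in place of `_mm_add_ps(x, y)`, so the platform-specific code stays in one header.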

Paul R

You don't need an API for basic arithmetic:

__m128 x, y;
__m128 z = x + y;
__m128 w = x - y;
__m128 t = x * y;

An API would be totally unnecessary for these operations, so Accelerate doesn't have one.
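
A small check of that claim, as a sketch that assumes GCC or Clang on an x86 target, where `__m128` is a compiler vector type that accepts arithmetic operators (MSVC does not accept these operators on `__m128`). `operators_match_intrinsics` is a hypothetical name used only for this illustration.

```c
#include <string.h>
#include <xmmintrin.h>

/* Returns 1 if the operator forms and the SSE intrinsics produce
   bit-identical results for the given 4-float inputs, 0 otherwise. */
static int operators_match_intrinsics(const float *xs, const float *ys)
{
    __m128 x = _mm_loadu_ps(xs);
    __m128 y = _mm_loadu_ps(ys);

    /* Operator forms (GCC/Clang vector extension). */
    __m128 sum_op  = x + y;
    __m128 diff_op = x - y;
    __m128 prod_op = x * y;

    /* Intrinsic forms. */
    __m128 sum_in  = _mm_add_ps(x, y);
    __m128 diff_in = _mm_sub_ps(x, y);
    __m128 prod_in = _mm_mul_ps(x, y);

    return memcmp(&sum_op,  &sum_in,  sizeof sum_op)  == 0
        && memcmp(&diff_op, &diff_in, sizeof diff_op) == 0
        && memcmp(&prod_op, &prod_in, sizeof prod_op) == 0;
}
```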

That said, if you have existing code which uses the SSE intrinsics (_mm_add_ps, etc), and you're really trying to make "minimum code changes", why are you changing anything at all? The SSE intrinsics work just fine on OS X as well.

Stephen Canon
  • "You don't need an API for basic arithmetic": what do you mean? That these operations will automatically use SSE instructions? – Lexandr Apr 23 '12 at 08:15
  • @Lexandr: yes, they will. However, as I noted, if you already have working code that uses the intrinsics, *don't change it*. – Stephen Canon Apr 23 '12 at 13:55