
What is simple equivalent C code that could replace __ functions like _mm_store_ps, _mm_add_ps, etc.? Please illustrate any one function through an example with the equivalent C code.

Why are these functions used?

Peter Mortensen
kamakshi
  • You need equivalent C code for functions whose purpose you don't even know? – Simone Dec 29 '10 at 07:48
  • I guess these are used for 16-byte memory alignment, but I don't know how the data gets aligned. – kamakshi Dec 29 '10 at 07:50
  • According to MSDN: `_mm_add_ps`: *Adds four single-precision, floating-point values* and `_mm_store_ps`: *Stores four single-precision, floating-point values*. Your guess is not right. – Simone Dec 29 '10 at 07:52
  • Thanks for correcting me, but how can I write equivalent C code if I want to replace the function? – kamakshi Dec 29 '10 at 07:59
  • Why do you think you need to do this? – Karl Knechtel Dec 29 '10 at 08:00
  • To make the code simpler, and I guess these functions are giving me a segmentation fault when I am working in a Linux environment. – kamakshi Dec 29 '10 at 08:02
  • The reason you are getting seg faults is most likely that your data is not correctly aligned. You should fix your memory alignment rather than trying to rewrite the existing SSE code. – Paul R Dec 29 '10 at 08:12

2 Answers

5

Based on your previous similar questions it sounds like you're trying to solve the wrong problem. You have some existing SSE code for face detection which is crashing because you are passing misaligned data to SSE routines that require 16 byte aligned data. In previous questions people have told you how to fix this misalignment (use _mm_malloc on Windows, or memalign/posix_memalign on Linux) but you seem to be ignoring this advice and instead you are wrongly assuming that you need to re-write all the SSE code. Take some time to understand what SSE is, how your SSE code works, why it needs 16 byte alignment and how to achieve this. Your existing SSE code should run fine on either Windows or Linux so long as you fix your data misalignment problem, which should be a relatively simple task once you understand what you are doing.
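To make the alignment advice concrete, here is a minimal sketch for Linux, assuming an x86 target with SSE enabled. The helper names `alloc_floats16` and `add_arrays_sse` are hypothetical and are not taken from the face detection code under discussion; only `posix_memalign`, `_mm_load_ps`, `_mm_add_ps` and `_mm_store_ps` are real library calls and intrinsics.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* exposes posix_memalign in <stdlib.h> */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Hypothetical helper: allocate n floats on a 16-byte boundary.
   Returns NULL on failure; release the memory with free(). */
static float *alloc_floats16(size_t n)
{
    void *p = NULL;
    if (posix_memalign(&p, 16, n * sizeof(float)) != 0)
        return NULL;
    return (float *)p;
}

/* Hypothetical helper: dst[i] = a[i] + b[i], four floats per iteration.
   All pointers must be 16-byte aligned and n must be a multiple of 4. */
static void add_arrays_sse(float *dst, const float *a, const float *b, size_t n)
{
    /* The aligned load/store intrinsics fault on misaligned addresses,
       so check the preconditions up front. */
    assert((uintptr_t)dst % 16 == 0);
    assert((uintptr_t)a % 16 == 0);
    assert((uintptr_t)b % 16 == 0);
    assert(n % 4 == 0);

    for (size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_load_ps(a + i);             /* aligned load of 4 floats */
        __m128 vb = _mm_load_ps(b + i);
        _mm_store_ps(dst + i, _mm_add_ps(va, vb));  /* aligned store of 4 sums */
    }
}
```

Buffers obtained from plain `malloc` are not guaranteed to satisfy these assertions on every platform, which is why the allocation goes through `posix_memalign` here.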

Paul R
  • I have used posix_memalign, but how do I check that the memory is 16-byte aligned? – kamakshi Dec 29 '10 at 11:56
  • Read the man page for `posix_memalign`: http://linux.die.net/man/3/posix_memalign - you just need to pass 16 as the second parameter (alignment) to get 16 byte aligned memory. – Paul R Dec 29 '10 at 15:29
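One common way to answer the "how do I check" question is to convert the pointer to an integer and test the remainder. The sketch below assumes a POSIX system; `is_aligned` and `alloc_checked16` are hypothetical helper names introduced only for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* exposes posix_memalign in <stdlib.h> */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if p is a multiple of `alignment` bytes. */
static int is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}

/* Hypothetical helper: allocate `size` bytes on a 16-byte boundary and
   verify the result. Returns NULL on failure; release with free(). */
static void *alloc_checked16(size_t size)
{
    void *p = NULL;
    /* The second argument is the alignment, the third is the size. */
    if (posix_memalign(&p, 16, size) != 0)
        return NULL;
    assert(is_aligned(p, 16));  /* should hold whenever posix_memalign succeeds */
    return p;
}
```

Calling `is_aligned(ptr, 16)` right before a pointer is handed to an aligned SSE load or store is a cheap way to locate the call site that passes misaligned data.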
0

MSDN shows pseudocode for the first function:

void _mm_store_ps(float *p, __m128 a );

Result:

p[0] := a0
p[1] := a1
p[2] := a2
p[3] := a3

http://msdn.microsoft.com/en-us/library/s3h4ay6y(v=vs.80).aspx
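For illustration only, the pseudocode above can be written out in plain C as follows. This is a sketch that reproduces the arithmetic, not the register-level behavior or the speed of the real intrinsics, and it has no alignment requirement. The `vec4f` type and the `vec4f_*` names are hypothetical stand-ins for `__m128`, `_mm_load_ps`, `_mm_add_ps` and `_mm_store_ps`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __m128: four packed single-precision floats. */
typedef struct {
    float v[4];
} vec4f;

/* Scalar counterpart of _mm_load_ps: read four consecutive floats from p. */
static vec4f vec4f_load(const float *p)
{
    assert(p != NULL);
    vec4f r;
    for (int i = 0; i < 4; i++)
        r.v[i] = p[i];
    return r;
}

/* Scalar counterpart of _mm_add_ps: element-wise sum of a and b. */
static vec4f vec4f_add(vec4f a, vec4f b)
{
    vec4f r;
    for (int i = 0; i < 4; i++)
        r.v[i] = a.v[i] + b.v[i];
    return r;
}

/* Scalar counterpart of _mm_store_ps: p[0] := a0, ..., p[3] := a3. */
static void vec4f_store(float *p, vec4f a)
{
    assert(p != NULL);
    for (int i = 0; i < 4; i++)
        p[i] = a.v[i];
}
```

With these helpers, `vec4f_store(out, vec4f_add(vec4f_load(x), vec4f_load(y)))` writes the four element-wise sums of `x` and `y` into `out`. The SSE version performs the same work with a handful of instructions on 128-bit registers, which is why the intrinsics are used in the first place.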

Steve-o
  • I think one has to point out that there's no way to translate that function into "simple C code", since it operates on processor registers. – Simone Dec 29 '10 at 07:56
  • Are these functions compatible with Linux too? – kamakshi Dec 29 '10 at 08:00
  • @kamakshi They are compiler intrinsics; you need to look up the equivalents for whatever Linux compiler you use, if any exist. – Steve-o Dec 29 '10 at 09:17
  • @kamakshi Check the compiler docs for SSE intrinsics; however, it looks like they originally come from ICC and pretty much everything supports the same API, including GCC. – Steve-o Dec 30 '10 at 03:27
  • @kamakshi OpenCV includes a face detection example. – Steve-o Dec 31 '10 at 08:55