I do a lot of calculations on complex numbers, usually stored as an array of structs that each hold two values for the real and imaginary parts (re and im; 16-bit integers in the example below), and I want to speed them up with the NEON C intrinsics. It would be great if someone could give me an example of how to speed up a loop like this:
for (n = 0; n < 1024; n++, p++, ptemp++) { // get cir_abs, also find the biggest point (value and location).
    abs_squared = (Uns32)(((Int32)(p->re)) * ((Int32)(p->re))
                        + ((Int32)(p->im)) * ((Int32)(p->im)));
    // ...
}
p points into an array of elements of this type:
typedef struct {
    Int16 re;
    Int16 im;
} Complex;
I have already read through chapter 12 of "ARM C Language Extensions", but I still have trouble understanding how to load and store this kind of structure so that I can do the calculations on it.
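To make the question concrete, here is a rough sketch of the direction I think this goes. It is not a verified or tuned solution. It assumes that Int16, Int32 and Uns32 map to the <stdint.h> types of the same width and that Complex has no padding. The names abs_squared_max, out, count and max_val are made up for illustration. The idea is that vld2_s16 loads four interleaved re/im pairs and splits them into one vector of real parts and one of imaginary parts, after which vmull_s16 and vmlal_s16 do the widening multiply and multiply-accumulate. A scalar path handles leftover elements and builds without NEON; the search for the maximum is left scalar here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Assumption: the project's integer typedefs have these widths. */
typedef int16_t  Int16;
typedef int32_t  Int32;
typedef uint32_t Uns32;

typedef struct {
    Int16 re;
    Int16 im;
} Complex;

/* The cast to int16_t* below relies on re/im being tightly packed. */
_Static_assert(sizeof(Complex) == 2 * sizeof(Int16), "Complex must be packed");

/* Hypothetical helper: writes re*re + im*im for each element to out[],
 * stores the largest value in *max_val and returns its index
 * (first occurrence; 0 when count is 0). */
static size_t abs_squared_max(const Complex *p, Uns32 *out,
                              size_t count, Uns32 *max_val)
{
    size_t i = 0;
    size_t best = 0;

    assert(p != NULL && out != NULL && max_val != NULL);

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Four complex values per iteration. */
    for (; i + 4 <= count; i += 4) {
        /* De-interleaving load: c.val[0] = 4 x re, c.val[1] = 4 x im. */
        int16x4x2_t c = vld2_s16((const int16_t *)&p[i]);
        /* re*re widened to 32 bits, then += im*im. */
        int32x4_t acc = vmull_s16(c.val[0], c.val[0]);
        acc = vmlal_s16(acc, c.val[1], c.val[1]);
        vst1q_u32(&out[i], vreinterpretq_u32_s32(acc));
    }
#endif

    /* Scalar path: remaining elements, or everything without NEON.
     * Each product is cast to unsigned before the add so that
     * re = im = -32768 (sum 2^31) does not overflow a signed int. */
    for (; i < count; i++) {
        Int32 re = p[i].re;
        Int32 im = p[i].im;
        out[i] = (Uns32)(re * re) + (Uns32)(im * im);
    }

    /* Scalar search for the largest value and its location. */
    *max_val = 0;
    for (i = 0; i < count; i++) {
        if (out[i] > *max_val) {
            *max_val = out[i];
            best = i;
        }
    }
    return best;
}
```

With a 1024-element array this would be called as abs_squared_max(p, out, 1024, &max_val), where out is an Uns32 buffer of the same length. Whether the maximum search can also be vectorized (for example with vmaxq_u32) is part of what I would like to understand.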