My code relies heavily on computing distances between two points in 3D space. To avoid the expensive square root I use the squared distance throughout. But still it takes up a major fraction of the computing time and I would like to replace my simple function with something even faster. I now have:
double distance_squared(double *a, double *b)
{
double dx = a[0] - b[0];
double dy = a[1] - b[1];
double dz = a[2] - b[2];
return dx*dx + dy*dy + dz*dz;
}
I also tried using a macro to avoid the function call but it doesn't help much.
#define DISTANCE_SQUARED(a, b) ((a)[0]-(b)[0])*((a)[0]-(b)[0]) + ((a)[1]-(b)[1])*((a)[1]-(b)[1]) + ((a)[2]-(b)[2])*((a)[2]-(b)[2])
I thought about using SIMD instructions but could not find a good example or complete list of instructions (ideally some multiply+add on two vectors).
GPU's are not an option since only one set of points is known at each function call.
What would be the fastest way to compute the distance squared?