I want to use the MMX instruction set to optimize my Linux C program, which performs many operations on images stored in RGB format (each RGB component is stored in an unsigned char). The operations are trivial: I subtract one image from the other pixel by pixel and accumulate the sum of the absolute values of the differences. Basically, I have a small image, or pattern, and I'm trying to find out whether that pattern exists in a larger image.
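The operation described above can be sketched in plain C as follows. This is only a minimal scalar sketch, not the actual program: the function name `sad_rgb` and its parameters are made up for illustration, and both buffers are assumed to hold the same number of bytes (three per pixel).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: sum of absolute differences between two RGB
 * buffers. n_bytes is the number of bytes to compare (3 * pixel count).
 * Both buffers are assumed to be at least n_bytes long. */
unsigned long sad_rgb(const unsigned char *a, const unsigned char *b,
                      size_t n_bytes)
{
    unsigned long sum = 0;

    for (size_t i = 0; i < n_bytes; i++) {
        /* Widen to int before subtracting so the difference can be negative. */
        sum += (unsigned long)abs((int)a[i] - (int)b[i]);
    }
    return sum;
}
```

A sum of 0 means the two regions are identical; to search for the pattern, a loop like this would presumably be run at each candidate position in the larger image, keeping the position with the smallest sum.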
I know this can be coded in assembly language using the MMX instructions to do the individual byte operations in SIMD fashion. However, is there an easier way? Maybe a library, or a higher-level interface that uses the MMX instructions?