I develop image processing algorithms using GCC, targeting ARMv7 (Raspberry Pi 2B).
In particular, I use a simple algorithm that replaces one index value with another throughout a mask:
    void ChangeIndex(uint8_t * mask, size_t size, uint8_t oldIndex, uint8_t newIndex)
    {
        for(size_t i = 0; i < size; ++i)
        {
            if(mask[i] == oldIndex)
                mask[i] = newIndex;
        }
    }
Unfortunately, it performs poorly on the target platform.
Is there any way to optimize it?
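
To make the behavior concrete, here is a minimal self-contained sketch of the function above next to a branch-free formulation. The name `ChangeIndexBranchless` is hypothetical and not part of the original code. The idea is that an unconditional select is sometimes easier for GCC to auto-vectorize than a conditional store; whether it actually helps on this ARMv7 target is an assumption that has not been measured here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original form from the question: one conditional store per element. */
void ChangeIndex(uint8_t *mask, size_t size, uint8_t oldIndex, uint8_t newIndex)
{
    for (size_t i = 0; i < size; ++i)
    {
        if (mask[i] == oldIndex)
            mask[i] = newIndex;
    }
}

/* Hypothetical branch-free variant (the name is an assumption): every
   element is written back unconditionally, either as newIndex or as its
   previous value, so the loop body contains no conditional store. */
void ChangeIndexBranchless(uint8_t *mask, size_t size, uint8_t oldIndex, uint8_t newIndex)
{
    assert(mask != NULL || size == 0);
    for (size_t i = 0; i < size; ++i)
    {
        uint8_t v = mask[i];
        mask[i] = (v == oldIndex) ? newIndex : v;
    }
}
```

Called on the mask {0, 3, 3, 1, 3, 2, 0, 3} with oldIndex = 3 and newIndex = 7, both versions produce {0, 7, 7, 1, 7, 2, 0, 7}. Note that the branch-free variant writes to every element, which matters if the mask lives in read-only memory or is shared with another thread.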