
I need to estimate the background across multiple frames of a video shot with a stationary camera. I have a number of frames (usually 10 to 100) and want to calculate the median of each pixel across all of them. I managed to do this with brute force, but the performance is awful (30–120 seconds per calculation). In Python with NumPy I can achieve the same thing with a single np.median call:

medianFrames = [im1, im2, im3, im4]
medianFrame = np.median(medianFrames, axis=0).astype(dtype=np.uint8)   

My Objective-C implementation is below (it uses OpenCV's Mat for image manipulation). It works, but it is very slow because it enumerates every pixel, builds an NSMutableArray for each component (R, G, B), sorts it, and takes the middle element:

    for (int i = 0; i < result.rows; i++) {
        for (int j = 0; j < result.cols; j++) {
            NSMutableArray *elements_B = [NSMutableArray arrayWithCapacity:arr.count];
            NSMutableArray *elements_G = [NSMutableArray arrayWithCapacity:arr.count];
            NSMutableArray *elements_R = [NSMutableArray arrayWithCapacity:arr.count];
            for(int frameIndex = 0; frameIndex < arr.count; frameIndex++) {
                Mat frame = matArray[frameIndex];
                int B = frame.at<Vec3b>(i, j)[0];
                int G = frame.at<Vec3b>(i, j)[1];
                int R = frame.at<Vec3b>(i, j)[2];
                elements_B[frameIndex] = [NSNumber numberWithInt:B];
                elements_G[frameIndex] = [NSNumber numberWithInt:G];
                elements_R[frameIndex] = [NSNumber numberWithInt:R];
            }
            
            NSArray *sortedB = [elements_B sortedArrayUsingSelector:@selector(compare:)];
            NSUInteger middleB = [sortedB count] / 2;
            NSNumber *medianB = [sortedB objectAtIndex:middleB];
            
            result.at<Vec3b>(i,j)[0] = medianB.intValue;
            
            NSArray *sortedG = [elements_G sortedArrayUsingSelector:@selector(compare:)];
            NSUInteger middleG = [sortedG count] / 2;
            NSNumber *medianG = [sortedG objectAtIndex:middleG];
            
            result.at<Vec3b>(i,j)[1] = medianG.intValue;
            
            NSArray *sortedR = [elements_R sortedArrayUsingSelector:@selector(compare:)];
            NSUInteger middleR = [sortedR count] / 2;
            NSNumber *medianR = [sortedR objectAtIndex:middleR];
            
            result.at<Vec3b>(i,j)[2] = medianR.intValue;
        }
    }
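For comparison, here is a minimal sketch of the same inner loop in plain C++ with no Objective-C boxing. The identifiers and the raw interleaved-BGR byte-buffer layout are my own assumptions (not cv::Mat); the point is that the per-pixel NSArray sort can be replaced with std::nth_element on a small reused buffer, which avoids both allocation and a full O(n log n) sort per channel:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (names are mine, not from the question): per-pixel,
// per-channel median over a stack of tightly packed, interleaved BGR frames.
std::vector<uint8_t> medianFrame(const std::vector<std::vector<uint8_t>>& frames,
                                 int rows, int cols) {
    const size_t n = frames.size();
    std::vector<uint8_t> result(static_cast<size_t>(rows) * cols * 3);
    std::vector<uint8_t> samples(n);                 // reused, no per-pixel allocation
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            for (int c = 0; c < 3; ++c) {            // B, G, R
                const size_t idx = (static_cast<size_t>(i) * cols + j) * 3 + c;
                for (size_t f = 0; f < n; ++f)
                    samples[f] = frames[f][idx];
                // Partial selection: only the element at n/2 lands in its
                // sorted position, which is all the median needs.
                std::nth_element(samples.begin(), samples.begin() + n / 2,
                                 samples.end());
                result[idx] = samples[n / 2];
            }
        }
    }
    return result;
}
```

Like the original code, this takes the upper-middle element (index n / 2) rather than averaging the two middle values for an even frame count.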

The real bottleneck is enumerating each pixel across every image and calculating its median. What is the best way to process multiple images and perform pixel-wise math operations efficiently, the way NumPy does?

Mando
  • Do you want to calculate the mean or the median? – sbooth Apr 30 '22 at 15:52
  • I want to calculate the median; let me edit my question to avoid confusion. – Mando Apr 30 '22 at 19:35
  • I'm not familiar with OpenCV, so I'm not posting this as an answer (there could be a better method built in), but I imagine you could start by eliminating Objective-C in the loops (the use of `NSArray`, `NSNumber`, etc.) and drop down to C or C++. Also, instead of operating on each matrix element individually, you could get the whole row as a C pointer (using `ptr(i)`) and process it that way. – sbooth May 01 '22 at 12:08
  • As a related solution, if you ever need to calculate the Euclidean norm of corresponding pixels in multiple images, you could convert the image stack to a single 3D matrix and call `computeNorm` with `axes = [2]`. Note that you'd also need to convert the 8-bit pixel data to 32-bit; vImage can do that for you. BNNS function: https://developer.apple.com/documentation/accelerate/bnns/3783726-computenorm – Flex Monkey May 02 '22 at 06:16
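sbooth's row-pointer suggestion can be sketched in plain C++ as follows. This is my own illustration over raw byte buffers (with OpenCV, `cv::Mat::ptr(i)` would supply each row pointer); the idea is to fetch one pointer per frame per row and walk the rows in lockstep, with no per-element Objective-C objects:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Sketch of the row-pointer idea: rowPtrs holds one raw pointer per frame,
// all pointing at the same image row. For tightly packed BGR data,
// rowBytes = cols * 3, and the B, G, R planes need no special casing
// because the median is taken byte-by-byte down the frame stack.
void medianRow(const std::vector<const uint8_t*>& rowPtrs,
               uint8_t* outRow, int rowBytes) {
    const size_t n = rowPtrs.size();
    std::vector<uint8_t> samples(n);
    for (int x = 0; x < rowBytes; ++x) {
        for (size_t f = 0; f < n; ++f)
            samples[f] = rowPtrs[f][x];
        std::nth_element(samples.begin(), samples.begin() + n / 2, samples.end());
        outRow[x] = samples[n / 2];
    }
}
```

The outer loop over rows (and the per-frame `ptr(i)` calls) would wrap this function; rows are independent, so they also parallelize trivially, e.g. with `dispatch_apply` or `cv::parallel_for_`.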

0 Answers