
I have a problem with optimizing the computation of errors for disparity map estimation.

To compute the errors I created a class with a method for each error, which I then call. I need to iterate over every pixel to get the error, and the arrays are big because the images are 1937 x 1217. Do you know how to optimize this?

Here is the code of my method:

EDIT:

def mreError(self):
    s_gt = self.ref_disp_norm   # reference (ground-truth) disparity
    s_all = self.disp_bin       # binary mask of valid pixels
    s_r = self.disp_norm        # estimated disparity

    s_gt = s_gt.astype(np.float32)
    s_r = s_r.astype(np.float32)
    n, m = s_gt.shape
    all_arr = []

    for i in range(0, n):
        for j in range(0, m):

            # only pixels marked 255 in the mask contribute to the error
            if s_all[i,j] == 255:
                if s_gt[i,j] == 0:
                    sub_mre = 0
                else:
                    sub_mre = np.abs(s_gt[i,j] - s_r[i,j]) / s_gt[i,j]
                all_arr.append(sub_mre)

    mre_all = np.mean(all_arr)
    return mre_all
Sagocz
  • You could do the subtraction and division directly on the complete arrays instead of doing it within a loop. Replace all instances where one array is 0 using fancy indexing. – Dschoni Mar 08 '21 at 13:07
  • Please read the [FAQ](https://stackoverflow.com/help) on how to use the site. This isn't a forum with a cascade of posts. If you have a comment to make to someone's post, comment under there, don't post an answer. Any information that will be relevant to all should be edited into your question. You've claimed some of the answers give the wrong result, but you haven't provided an input of your own. We generated synthetic data and it works just fine. Please read how to create a [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). – Reti43 Mar 08 '21 at 15:30

3 Answers


You could simply use array operators instead of applying them to every element inside a for loop:

import numpy as np

# Creating 2000x2000 Test-Data
s_gt = np.random.randint(0,2,(2000,2000)).astype(np.float32)
s_r = np.random.randint(0,2,(2000,2000)).astype(np.float32)
s_all = np.random.randint(0,256,(2000,2000)).astype(np.float32)


def calc(s_gt, s_r, s_all):
    n, m = s_gt.shape
    all_arr = []
    for i in range(0, n):
        for j in range(0, m):
            if s_gt[i,j] == 0:
                sub_mre = 0
            else:   
                sub_mre = np.abs(s_gt[i,j] - s_r[i,j]) / s_gt[i,j]
    
            if s_all[i,j] == 255:
                all_arr.append(sub_mre)
    
    mre_all = np.mean(all_arr)
    return mre_all

def calc_optimized(s_gt, s_r, s_all):
    # element-wise relative error on the whole array at once
    sub_mre = np.abs((s_gt - s_r) / s_gt)
    # pixels with ground truth 0 are defined as error 0, as in the loop version
    sub_mre[s_gt == 0] = 0
    # average only over the pixels where the mask is 255
    return np.mean(sub_mre[s_all == 255])

When I test the speed of the two different approaches:

%time calc(s_gt, s_r, s_all)
Wall time: 27.6 s
Out[53]: 0.24686379928315413

%time calc_optimized(s_gt, s_r, s_all)
Wall time: 63.3 ms
__main__:34: RuntimeWarning: divide by zero encountered in true_divide
__main__:34: RuntimeWarning: invalid value encountered in true_divide
Out[54]: 0.2468638
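
The warnings come from the 0/0 pixels that are overwritten afterwards. One way to silence them without changing the result (a minimal sketch, assuming the same arrays as above; the function name is only illustrative) is to wrap the division in np.errstate:

def calc_optimized_quiet(s_gt, s_r, s_all):
    # ignore the divide-by-zero / invalid warnings produced where s_gt == 0;
    # those entries are overwritten with 0 right afterwards, exactly as before
    with np.errstate(divide='ignore', invalid='ignore'):
        sub_mre = np.abs((s_gt - s_r) / s_gt)
    sub_mre[s_gt == 0] = 0
    return np.mean(sub_mre[s_all == 255])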
Dschoni
  • Yes, your optimization is much faster, but on my data the optimized result is wrong. I don't know what I'm doing wrong. – Sagocz Mar 08 '21 at 13:57

A straight-up vectorisation of your method would be:

def method_1(self):
    # get s_gt, s_all, s_r
    sub_mre = np.zeros((s_gt.shape), dtype=np.float32)
    idx = s_gt != 0
    sub_mre[idx] = np.abs((s_gt[idx] - s_r[idx]) / s_gt[idx])
    return np.mean(sub_mre[s_all == 255])

But since you're doing your averaging only for pixels where s_all is 255, you could also filter for those first and then do the rest

def method_2(self):
    # get s_gt, s_all, s_r
    idx = s_all == 255
    s_gt = s_gt[idx].astype(np.float32)
    s_r = s_r[idx].astype(np.float32)
    sub_mre = np.zeros_like(s_gt)
    idx = s_gt != 0
    sub_mre[idx] = np.abs((s_gt[idx] - s_r[idx]) / s_gt[idx])
    return np.mean(sub_mre)

Personally, I would favour the first method unless the second one results in a much faster result. Calling the function only once and spending, for example, 40 ms vs 5 ms is not noticeable and the readability of the function matters more.
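
If you do want to measure that difference on your own machine, a quick timing sketch could look like the following (the two methods are pulled out here as standalone functions taking the arrays as arguments, and the data is synthetic, roughly the question's 1937 x 1217 size; names and values are only illustrative):

import timeit
import numpy as np

# standalone versions of the two methods above, taking the arrays as arguments
def method_1(s_gt, s_r, s_all):
    sub_mre = np.zeros(s_gt.shape, dtype=np.float32)
    idx = s_gt != 0
    sub_mre[idx] = np.abs((s_gt[idx] - s_r[idx]) / s_gt[idx])
    return np.mean(sub_mre[s_all == 255])

def method_2(s_gt, s_r, s_all):
    idx = s_all == 255
    s_gt = s_gt[idx].astype(np.float32)
    s_r = s_r[idx].astype(np.float32)
    sub_mre = np.zeros_like(s_gt)
    nz = s_gt != 0
    sub_mre[nz] = np.abs((s_gt[nz] - s_r[nz]) / s_gt[nz])
    return np.mean(sub_mre)

# synthetic data of roughly the question's size (1937 x 1217 pixels)
s_gt = np.random.randint(0, 5, (1217, 1937)).astype(np.float32)
s_r = np.random.randint(0, 5, (1217, 1937)).astype(np.float32)
s_all = np.random.choice([0, 255], (1217, 1937))

t1 = timeit.timeit(lambda: method_1(s_gt, s_r, s_all), number=10) / 10
t2 = timeit.timeit(lambda: method_2(s_gt, s_r, s_all), number=10) / 10
print(f"method_1: {t1 * 1e3:.1f} ms/call, method_2: {t2 * 1e3:.1f} ms/call")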

Reti43
  • Nice, your method_2 is correct, I get the expected output value! method_1: Reti43 1.0188183 | Sagocz 0.11468831; method_2: Reti43 0.11468831 | Sagocz 0.11468831 – Sagocz Mar 08 '21 at 14:07
  • @Sagocz Read my comment under your question about providing a relevant input and expected output. I see no reason why either of my two methods should differ from each other. I've tested both along with yours on random data and they all agree. Unless you provide an input where their answers diverge, I can't investigate why this happens. – Reti43 Mar 08 '21 at 15:46
  • Sorry for that. I wrote the code again and the 1st method is also correct. I also want to apologize for being silent, but I didn't want to answer your question before testing the 1st method again. – Sagocz Mar 11 '21 at 07:49

You can just convert the image to greyscale (this will speed up the calculations substantially). Go check this link for how you can do it.
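
For what it's worth, a minimal sketch of such a conversion, assuming a 3-channel image handled with OpenCV (the variable name is only a placeholder):

import cv2
import numpy as np

# placeholder 3-channel image; in practice this would be the image loaded with cv2.imread
img_bgr = np.zeros((1217, 1937, 3), dtype=np.uint8)
grey = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)  # single channel, shape (1217, 1937)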

Trew laz
  • There is nothing in the question that implies the calculations are done over 3 colour channels. The issue is in using Python-level loops instead of numpy vectorisation. – Reti43 Mar 08 '21 at 15:36