
I am trying to process a batch of images for face recognition.

I have several sets of images to process. Most of them process fine, but in one particular set there are some images that fail with this error: could not broadcast input array from shape

I am using MTCNN, which is implemented in mxnet and Python; here is the link to the original repo.

This error comes in the second stage of the mtcnn detector, here is the code:

#############################################
# second stage
#############################################
        num_box = total_boxes.shape[0]

        # pad the bbox
        [dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = self.pad(total_boxes, width, height)
        # (3, 24, 24) is the input shape for RNet
        input_buf = np.zeros((num_box, 3, 24, 24), dtype=np.float32)

        for i in range(num_box):
            tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
            tmp[dy[i]:edy[i]+1, dx[i]:edx[i]+1, :] = img[y[i]:ey[i]+1, x[i]:ex[i]+1, :]
            input_buf[i, :, :, :] = adjust_input(cv2.resize(tmp, (24, 24)))

        output = self.RNet.predict(input_buf)

        # filter the total_boxes with threshold
        passed = np.where(output[1][:, 1] > self.threshold[1])
        total_boxes = total_boxes[passed]

        if total_boxes.size == 0:
            return None

The error is thrown inside the for loop, on the line that copies the image crop into tmp.
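The failing assignment can be reproduced in isolation (dummy image and box sizes, not from the actual dataset): when a padded box reaches past the image border, the source slice ends up with zero rows while the destination still expects the full box height, and the copy fails.

```python
import numpy as np

# Dummy 100x100 image standing in for the real input.
img = np.zeros((100, 100, 3), dtype=np.uint8)

# Destination buffer sized for a 58x63 box.
tmp = np.zeros((58, 63, 3), dtype=np.uint8)

try:
    # The box's y-range (120:178) lies entirely outside the 100-row
    # image, so the source slice has shape (0, 63, 3).
    tmp[0:58, 0:63, :] = img[120:178, 0:63, :]
except ValueError as e:
    print(e)  # could not broadcast input array from shape (0,63,3) into shape (58,63,3)
```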

If anyone needs to view more code or want the exact pic do let me know.

sidd
  • Usually, a broadcast exception happens when an operation uses two tensors and the shape of the second cannot be broadcast (meaning "extended to match") to the shape of the first. Try printing the shapes of the tensors in the loop to see what they are, and then you can figure out why broadcasting is not possible. Here is more about what broadcasting is and its limitations: https://gluon.mxnet.io/chapter01_crashcourse/ndarray.html?highlight=broadcast#Broadcasting – Sergei Feb 14 '19 at 20:24

1 Answer


The following answer was given by juliojj at https://www.ctolib.com/article/comments/20418:

Thanks a lot for sharing your code. I am sharing how I managed to fix two errors when running it; maybe it will help anyone who faces the same problems. For some reason, just two images in my dataset generated these errors in mtcnn_detector.py:

error 1) could not broadcast input array from shape (0,63,3) into shape (58,63,3), where "63" can be any other value...

error 2) tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8) ValueError: negative dimensions are not allowed.

The two new conditions below fixed the problem (temporarily, as I didn't track down its source):

(line 280 in mtcnn_detector.py)

# pad the bbox 
[dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = self.pad(total_boxes, width, height)
# (3, 24, 24) is the input shape for RNet
input_buf = np.zeros((num_box, 3, 24, 24), dtype=np.float32)

for i in range(num_box):
    if(tmph[i]>0):         # << WARNING (bug fixed)
        tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
        if(edy[i]>=0):     # << WARNING (bug fixed)
            tmp[dy[i]:edy[i]+1, dx[i]:edx[i]+1, :] = img[y[i]:ey[i]+1, x[i]:ex[i]+1, :]
            input_buf[i, :, :, :] = adjust_input(cv2.resize(tmp, (24, 24)))

output = self.RNet.predict(input_buf)

This solved my problem and I believe it will solve yours as well.
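Note that with the guards above, any skipped box leaves an all-zero entry in input_buf that still goes through RNet. An alternative (a hypothetical pre-filter, not from the original repo) is to drop degenerate boxes before the loop, so only valid crops are processed:

```python
import numpy as np

def filter_valid_boxes(total_boxes, width, height):
    """Hypothetical pre-filter: drop boxes with non-positive size or
    that lie entirely outside a width x height image.

    total_boxes is assumed to be an (N, 5) array of
    [x1, y1, x2, y2, score] rows, as in the MTCNN pipeline."""
    x1, y1, x2, y2 = (total_boxes[:, 0], total_boxes[:, 1],
                      total_boxes[:, 2], total_boxes[:, 3])
    keep = ((x2 > x1) & (y2 > y1) &          # positive width and height
            (x2 > 0) & (y2 > 0) &            # not entirely left/above
            (x1 < width) & (y1 < height))    # not entirely right/below
    return total_boxes[keep]

boxes = np.array([[10.0, 10, 50, 50, 0.9],    # valid
                  [30.0, 120, 60, 110, 0.8],  # y2 < y1: invalid
                  [-40.0, -40, -10, -10, 0.7]])  # entirely outside
print(filter_valid_boxes(boxes, 100, 100))  # keeps only the first box
```

This trades the answer's per-box guards for one vectorized filter; either way, the real cause (why pad produces such boxes) remains untracked, as the answer notes.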

answered by Wenbin Xu (edited by Eric Aya)