Skipping irrelevant contours for digit recognition using OpenCV

Question

Extra contours are being detected:

I am using the following code to find contours in a given image:

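import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

# `neigh` is the trained classifier, defined elsewhere (not shown in the question)
image = cv.imread('/content/drive/My Drive/Colab Notebooks/digit-recognition/test-2.jfif')
grey = cv.cvtColor(image, cv.COLOR_BGR2GRAY)
grey = cv.GaussianBlur(grey, (5, 5), 0)
# Inverted adaptive threshold: digits become white on a black background
thresh = cv.adaptiveThreshold(grey, 255, cv.ADAPTIVE_THRESH_MEAN_C, cv.THRESH_BINARY_INV, 11, 2)
contours, hierarchy = cv.findContours(thresh, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)

preprocessed_digits = []
for c in contours:
    x, y, w, h = cv.boundingRect(c)

    # Draw the bounding box on the original image
    cv.rectangle(image, (x, y), (x + w, y + h), color=(0, 255, 0), thickness=2)
    # Crop the digit, resize to 18x18 and pad to 28x28 (784 values)
    digit = thresh[y:y + h, x:x + w]
    resized_digit = cv.resize(digit, (18, 18))
    padded_digit = np.pad(resized_digit, ((5, 5), (5, 5)), "constant", constant_values=0)
    plt.imshow(padded_digit, cmap="gray")
    plt.show()
    xdigit = padded_digit.reshape(1, 784)
    prediction = neigh.predict(xdigit)
    print("prediction = ", prediction[0])
print("\n\n\n----------------Contoured Image--------------------")
# OpenCV loads images as BGR; convert to RGB so matplotlib shows the right colours
plt.imshow(cv.cvtColor(image, cv.COLOR_BGR2RGB))
plt.show()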

This is the image I am using

How can I skip the unnecessary contours?
If I don't use adaptive thresholding, the contours are not detected properly because of the lighting in this image. The current contouring is good in that it detects the letters properly; the only problem is that it also detects the noisy areas.

Experiments:
I changed the blockSize in adaptive thresholding to 3 and the contouring appeared perfect:

When I then tried a different image with the same settings, it produced the following contours:

It looks like it's making contours inside contours, which is a little confusing because I thought RETR_EXTERNAL would prevent that.

Another example:

The contouring for this appears fine, but the extracted digit images come out like this:

I am not sure whether the wrong predictions are caused by the distortion of the image.

score 0 · Answer 1 · answered Aug 22 '21 at 12:18

The easiest way would be to filter the detected bounding boxes by size, as all the noisy detections seem to be smaller than the ones you are looking for:

for c in contours:
    x, y, w, h = cv.boundingRect(c)
    # Keep only boxes of at least 200 px²; smaller ones are treated as noise
    if w * h >= 200:
        cv.rectangle(image, (x, y), (x + w, y + h), color=(0, 255, 0), thickness=2)
        digit = thresh[y:y + h, x:x + w]
        resized_digit = cv.resize(digit, (18, 18))
        padded_digit = np.pad(resized_digit, ((5, 5), (5, 5)), "constant", constant_values=0)
        plt.imshow(padded_digit, cmap="gray")
        plt.show()
        xdigit = padded_digit.reshape(1, 784)
        prediction = neigh.predict(xdigit)
        print("prediction = ", prediction[0])

Alternatively, you can use other filtering criteria, e.g. thresholds on the number of dark pixels that should appear in each bounding box (histogram), on contrast, etc.
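
As a rough illustration of combining the size filter with a pixel-count criterion, here is a minimal sketch. The helper name `keep_box` and the thresholds `min_area`, `min_fill` and `max_fill` are assumptions made up for this example and would need tuning on real images. It assumes `thresh` is the inverted binary image from the question (digit pixels non-zero) and that the box comes from `cv.boundingRect`.

```python
import numpy as np

def keep_box(thresh, box, min_area=200, min_fill=0.1, max_fill=0.9):
    """Return True if a bounding box looks like a digit rather than noise.

    thresh: 2-D uint8 array in which foreground (digit) pixels are non-zero.
    box:    (x, y, w, h), e.g. the result of cv.boundingRect(contour).
    The default thresholds are illustrative assumptions, not tuned values.
    """
    x, y, w, h = box
    area = w * h
    # Criterion 1: reject boxes that are too small (same idea as w*h >= 200)
    if area < min_area:
        return False
    # Criterion 2: fraction of foreground pixels inside the box
    roi = thresh[y:y + h, x:x + w]
    fill = np.count_nonzero(roi) / area
    # Nearly empty or nearly solid boxes are unlikely to be pen strokes
    return bool(min_fill <= fill <= max_fill)
```

Inside the loop this could be used as `if not keep_box(thresh, cv.boundingRect(c)): continue` before cropping and predicting.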

answered Aug 22 '21 at 12:18

NMme

461
3
12

Thanks. Also, is there any rule for deciding whether to apply a Gaussian blur, a medianBlur, or nothing at all? I found that some images were better with a Gaussian blur, some with medianBlur, and some without any blur. – sunny Aug 22 '21 at 12:25
I changed the adaptive thresholding call to `thresh = cv.adaptiveThreshold(grey,255,cv.ADAPTIVE_THRESH_GAUSSIAN_C,cv.THRESH_BINARY_INV,3,2)` (that is, I changed the blockSize to 3) and the contouring was perfect. But when I tried it with a different image there were some issues. – sunny Aug 22 '21 at 12:31
1) Which blur works best depends on the noise in your image. Applying a blur is basically applying a frequency filter that removes high-frequency components. This means that Gaussian blur usually works best for removing the kind of Gaussian noise you see in camera images. See [here](https://docs.opencv.org/4.5.2/d4/d13/tutorial_py_filtering.html) for a quick overview of OpenCV's blur filters. 2) The blockSize controls how large the pixel neighbourhood used to determine the local threshold is; I think a somewhat larger value would be better in your case. – NMme Aug 22 '21 at 14:14
I have added a handwritten image containing 1 2 3 4 5 6 7 8 9. The contouring seems to be working fine, but the individual digits I extract in the for loop appear a little blurred, and the 3 got predicted as 1. I am not sure if that's a problem with my model or something I could fix in the image processing. – sunny Aug 22 '21 at 15:31
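
One possible contributor to the distorted crops is that `cv.resize(digit, (18,18))` stretches every crop to a square regardless of its original aspect ratio, so a narrow digit is widened. A hedged sketch of one way to avoid this is to pad the crop to a square with background pixels before resizing. The helper name `pad_to_square` is hypothetical, and whether this improves the predictions depends on how the model's training data was prepared.

```python
import numpy as np

def pad_to_square(digit, background=0):
    """Pad a 2-D crop with background pixels so that height == width.

    The digit stays centred; any odd leftover pixel goes to the bottom/right.
    """
    h, w = digit.shape
    size = max(h, w)
    top = (size - h) // 2
    left = (size - w) // 2
    return np.pad(
        digit,
        ((top, size - h - top), (left, size - w - left)),
        mode="constant",
        constant_values=background,
    )
```

In the loop from the question this would replace the direct resize with `resized_digit = cv.resize(pad_to_square(digit), (18, 18))`, leaving the rest of the pipeline unchanged.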
