
I have several scanned images I would like to process with Python/OpenCV. Each of these images (see an example below) contains n rows of coloured squares, all of the same size. The goal is to crop each of these squares and to extract the data from it.

Image with squares to extract

I have found here some code which is able to extract squares from an image.

Here is my code, where I have used it:

import numpy as np
import cv2
from matplotlib import pyplot as plt

def angle_cos(p0, p1, p2):
    d1, d2 = (p0 - p1).astype('float'), (p2 - p1).astype('float')
    return abs(np.dot(d1, d2) / np.sqrt(np.dot(d1, d1) * np.dot(d2, d2)))

def find_squares(img):
    img = cv2.GaussianBlur(img, (5, 5), 0)
    squares = []
    for gray in cv2.split(img):
        for thrs in range(0, 255, 26):
            if thrs == 0:
                bin_img = cv2.Canny(gray, 0, 50, apertureSize=5)
                bin_img = cv2.dilate(bin_img, None)
            else:
                _retval, bin_img = cv2.threshold(gray, thrs, 255, cv2.THRESH_BINARY)
            contours, _hierarchy = cv2.findContours(bin_img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
            for cnt in contours:
                cnt_len = cv2.arcLength(cnt, True)
                cnt = cv2.approxPolyDP(cnt, 0.02 * cnt_len, True)
                if len(cnt) == 4 and cv2.contourArea(cnt) > 1000 and cv2.isContourConvex(cnt):
                    cnt = cnt.reshape(-1, 2)
                    max_cos = np.max([angle_cos(cnt[i], cnt[(i + 1) % 4], cnt[(i + 2) % 4]) for i in range(4)])
                    if max_cos < 0.1:
                        squares.append(cnt)
    print(len(squares))
    return squares

img = cv2.imread("test_squares.jpg",1)

plt.axis("off")
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()

squares = find_squares(img)
cv2.drawContours( img, squares, -1, (0, 255, 0), 1 )
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()

However, it finds too many squares (100 instead of 15!). Looking at the image, it seems that OpenCV finds a lot of contours for each square.

I'm pretty sure this can be optimized, since the squares are more or less the same size and far from each other. As a very beginner in OpenCV, I haven't yet found a way to add more criteria to the function `find_squares` in order to get only 15 squares at the end of the routine. Maybe the contour area can be constrained?
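Since the squares are far apart, one cheap extra criterion is to collapse detections whose centres almost coincide, keeping only one contour per square. A minimal sketch (the 20-pixel tolerance is an assumption you would tune to your scans):

```python
import numpy as np

def dedupe_squares(squares, min_center_dist=20):
    """Keep one square per cluster of near-identical detections.

    `squares` is a list of 4x2 corner arrays, as returned by
    find_squares(); `min_center_dist` (pixels) is a hypothetical
    tolerance: two detections closer than this are treated as the
    same physical square.
    """
    kept = []
    for sq in squares:
        center = np.mean(sq, axis=0)
        # Keep this square only if its centre is far from every
        # centre we have already accepted.
        if all(np.linalg.norm(center - np.mean(k, axis=0)) > min_center_dist
               for k in kept):
            kept.append(sq)
    return kept
```

Calling `dedupe_squares(find_squares(img))` on the 100 raw detections should then leave roughly one contour per physical square.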

I have also found here a more detailed code (very close to the previous one), but it seems to have been developed for an old version of OpenCV. I haven't managed to make it work (and so to modify it).

Mark Setchell
Julien M.

2 Answers


This is another more robust method.

I used this code to find the contours in the image (the full code can be found in this gist):

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Define square size
min_square_size = 987
# Read Image
img = cv2.imread('/home/stephen/Desktop/3eY0k.jpg')
# Threshold and find edges
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Threshold the image - segment white background from post it notes
_, thresh = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY_INV)
# Find the contours
_, contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

I iterated through the contours. I only looked at the contours that were a reasonable size. I found the four corners of each contour.

corners

# Create a list for post-it images
images = []
# Iterate through the contours in the image
for contour in contours:
    area = cv2.contourArea(contour)
    # If the contour is not really small, or really big
    h,w = img.shape[0], img.shape[1]
    if area > min_square_size and area < h*w-(2*(h+w)):
        # Get the four corners of the contour
        epsilon = .1 * cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, epsilon, True)
        # Draw the point
        for point in approx: cv2.circle(img, tuple(point[0]), 2, (255,0,0), 2)
        # Warp it to a square
        pts1 = np.float32(approx)
        pts2 = np.float32([[0,0],[300,0],[300,300],[0,300]])
        M = cv2.getPerspectiveTransform(pts1,pts2)
        dst = cv2.warpPerspective(img,M,(300,300))
        # Add the square to the list of images
        images.append(dst.copy())

The post-it notes are squares, but because the camera warps the objects in the image they do not appear as squares. I used warpPerspective to make the post-it notes square shapes. Only a few of them are shown in this plot (there are more that didn't fit): plot
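One caveat with the warping step: `approxPolyDP` does not guarantee any particular corner order, so `pts1` may not line up with the `pts2` destination corners, and the warped square can come out rotated or mirrored. A common fix (a sketch, not part of the answer's code) is to sort the corners first with the usual sum/difference trick:

```python
import numpy as np

def order_corners(pts):
    """Order 4 points as top-left, top-right, bottom-right, bottom-left,
    so they match a destination like [[0,0],[300,0],[300,300],[0,300]]
    after swapping to that convention."""
    pts = np.asarray(pts, dtype='float32').reshape(4, 2)
    ordered = np.zeros((4, 2), dtype='float32')
    s = pts.sum(axis=1)               # x + y
    d = np.diff(pts, axis=1).ravel()  # y - x
    ordered[0] = pts[np.argmin(s)]    # top-left: smallest x + y
    ordered[2] = pts[np.argmax(s)]    # bottom-right: largest x + y
    ordered[1] = pts[np.argmin(d)]    # top-right: smallest y - x
    ordered[3] = pts[np.argmax(d)]    # bottom-left: largest y - x
    return ordered
```

With this, `pts1 = order_corners(approx)` and `pts2 = np.float32([[0,0],[300,0],[300,300],[0,300]])` are guaranteed to correspond corner for corner.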

Stephen Meschke
  • Thank you very much for this answer. I have to test it, but it seems to answer my problem well. Would it be possible for you to edit your answer to include the complete code (with the image-displaying part)? As a very beginner in OpenCV, that would help me and probably other users too! Thanks! – Julien M. Apr 25 '19 at 08:39
  • I have now read your code carefully. If you don't mind, I have two questions. 1) Why have you added `-9000` in the test `area < img.shape[0]*img.shape[1]-9000`? (`9000` is very small compared to `img.shape[0]*img.shape[1]`, right?) 2) How did you choose the value `300` in the image warping part? It seems that `150*150` is much closer to the image size in pixels. I have found [here](https://www.pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/) a detailed description of warping images where the height and width of the image are calculated... – Julien M. Apr 25 '19 at 14:53
  • Hi @J.Martin. I have added a link to the full code at the top of the answer. Answer 1) That part of the code gets rid of the contour that borders the entire image. I changed the '9000' to '2*(h+w)'. Answer 2) I chose 300 because I thought that resolution would be easy to see. If the post-its are only 150px, they are too small. – Stephen Meschke Apr 25 '19 at 16:07
  • The reader should note that the new version of OpenCV now returns two elements (instead of three). The line `_, contours, _ = cv2.findContours(...)` should be replaced by `contours, _ = cv2.findContours(...)` – Julien M. Apr 26 '19 at 09:56
  • Oh! I have just realized that when I tested your program yesterday, I skipped a large part of it! I didn't notice because you compute the `contours` in two different ways in your program, and in my copy-paste I had only selected the first method, with the Canny algorithm. Today I checked that the post-it selection loop doesn't work with this first method but works with the second one. I don't understand why... And if I'm right, why have you kept the two lines `canny = ...` `_, contours, _ = cv2.findCont...` since they don't give the good `contours`? – Julien M. Apr 26 '19 at 11:49
  • @J.Martin I copied and pasted that code from another S.O. answer (I have included the link at the top of this answer). That code was more robust, but not as easy to understand. I simplified the code in this post. It is now less robust, but it will work well if the background is extremely white. – Stephen Meschke Apr 26 '19 at 16:07

If your problem is that too many contours (edges) are found in the image, my suggestion is to modify the edge-finding part first. It'll be by far the easiest modification to make.

In particular, you'll need to change this call:

bin = cv.Canny(gray, 0, 50, apertureSize=5)

The cv.Canny() function takes as arguments two hysteresis threshold values, the aperture size of the Sobel kernel, and a boolean (`L2gradient`) indicating whether the more precise gradient norm is used. Play with those parameters, and my guess is you'll get much better results.

PlinyTheElder
  • Thanks for this answer. I will try to do this. However, I guess I can't be sure of getting the 15 squares every time I analyse an image – Julien M. Apr 25 '19 at 08:33