9

I have many scanned images, each containing handwritten digits written inside small rectangles.

enter image description here

Please help me crop each rectangle that contains a digit and save the crops so that all crops from the same row share the same name.

import cv2

img = cv2.imread(r'Data\Scan_20170612_4.jpg')

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)
edged = cv2.Canny(gray, 30, 200)

_, contours, hierarchy = cv2.findContours(edged, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

i = 0
for c in contours:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.09 * peri, True)

    if len(approx) == 4:
        screenCnt = approx
        cv2.drawContours(img, [screenCnt], -1, (0, 255, 0), 3)
        cv2.imwrite('cropped\\' + str(i) + '_img.jpg', img)

        i += 1
utkarsh

3 Answers

7

Here is my version:

import cv2
import numpy as np

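# Row labels: the rectangles are sorted bottom-to-top below, so fileName[0] ('9') names the bottom row
fileName = ['9','8','7','6','5','4','3','2','1','0']

# Raw string so the backslash in the Windows path is not treated as an escape
img = cv2.imread(r'Data\Scan_20170612_17.jpg')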

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)

kernel = np.ones((5,5),np.uint8)
erosion = cv2.erode(gray,kernel,iterations = 2)
kernel = np.ones((4,4),np.uint8)
dilation = cv2.dilate(erosion,kernel,iterations = 2)

edged = cv2.Canny(dilation, 30, 200)

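# [-2:] keeps this working whether findContours returns 3 values (OpenCV 3) or 2 (OpenCV 4)
contours, hierarchy = cv2.findContours(edged, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]

# Bounding boxes sorted by y, bottom of the page first
rects = [cv2.boundingRect(cnt) for cnt in contours]
rects = sorted(rects,key=lambda  x:x[1],reverse=True)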


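i = -1          # row index into fileName
j = 1           # running crop counter (not reset per row)
y_old = 5000    # y of the current row (assumed larger than any y in the image)
x_old = 5000    # x of the last saved crop
for rect in rects:
    x,y,w,h = rect
    area = w * h

    # Keep only boxes about the size of one digit cell
    if area > 47000 and area < 70000:

        # A jump of more than 200 px upward starts a new row
        if (y_old - y) > 200:
            i += 1
            y_old = y

        # Skip near-duplicate boxes of the same cell (x within 300 px of the last saved one)
        if abs(x_old - x) > 300:
            x_old = x

            # Trim 10 px on each side to drop the rectangle border
            out = img[y+10:y+h-10,x+10:x+w-10]
            cv2.imwrite('cropped\\' + fileName[i] + '_' + str(j) + '.jpg', out)

            j+=1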
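
The row and column bookkeeping above can also be separated from the image code. The following is only a sketch, not part of the original answer: `group_into_rows` and `crop_names` are hypothetical helper names, and the rectangles are assumed to be `(x, y, w, h)` tuples as returned by `cv2.boundingRect`. Unlike the loop above, it orders rows top to bottom and restarts the column counter in every row.

```python
def group_into_rows(rects, min_area, max_area, row_gap, min_dx):
    """Return a list of rows (top to bottom), each a list of rects (left to right)."""
    # Keep only boxes whose bounding-box area is in the expected range.
    boxes = [r for r in rects if min_area < r[2] * r[3] < max_area]
    boxes.sort(key=lambda r: r[1])  # top to bottom

    rows = []
    for r in boxes:
        # Start a new row when the box lies more than row_gap below
        # the first box of the current row.
        if not rows or r[1] - rows[-1][0][1] > row_gap:
            rows.append([r])
        else:
            rows[-1].append(r)

    result = []
    for row in rows:
        row.sort(key=lambda r: r[0])  # left to right
        kept = []
        for r in row:
            # Drop near-duplicate boxes, e.g. the inner and outer edge
            # of the same printed rectangle.
            if not kept or r[0] - kept[-1][0] > min_dx:
                kept.append(r)
        result.append(kept)
    return result


def crop_names(rows, labels):
    """Yield (filename, rect) pairs, numbering crops within each row from 1."""
    for label, row in zip(labels, rows):
        for col, rect in enumerate(row, start=1):
            yield f"{label}_{col}.jpg", rect
```

Each returned rectangle can then be cropped with `img[y:y+h, x:x+w]` and written with `cv2.imwrite`, as in the loop above. The thresholds (`47000`, `70000`, `200`, `300`) would be passed in as arguments and still need tuning per scan.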
utkarsh
4

This is easy once you try it. Here is my output (the full image and one cropped piece):

enter image description here

What I did:

  1. Resized the image first, because it was too big for my screen.
  2. Eroded and dilated to remove small dots and thicken the lines.
  3. Thresholded the image.
  4. Flood-filled, starting from a suitable seed point.
  5. Inverted the flood fill.
  6. Found the contours and drew, one at a time, those whose area is roughly that of a rectangle. For my resized (500x500) image I used a contour area range of 500 to 2500, found by trial and error.
  7. Found the bounding rectangle of each contour and cropped that region from the main image.
  8. Saved each piece with a proper name (which I didn't do).

    There may be a simpler way, but I liked this one. I did not post the code at first because it was clumsy; it is included below.

    Here is how the mask looks when you draw the contours one at a time:

enter image description here

code:

import cv2
import numpy as np

# Run the code with the image name, keep pressing space bar (Esc quits)

# Change the kernel, iterations, contour area and seed position accordingly
# These values work for your present image

img = cv2.imread("your_image.jpg", 0)  # 0 = load as grayscale
h, w = img.shape[:2]
kernel = np.ones((15,15),np.uint8)

e = cv2.erode(img,kernel,iterations = 2)
d = cv2.dilate(e,kernel,iterations = 1)
ret, th = cv2.threshold(d, 150, 255, cv2.THRESH_BINARY_INV)

# floodFill needs a mask 2 pixels larger than the image in each dimension
mask = np.zeros((h+2, w+2), np.uint8)
cv2.floodFill(th, mask, (200,200), 255)  # seed position = (200,200)
out = cv2.bitwise_not(th)
out = cv2.dilate(out,kernel,iterations = 3)
# [-2:] works whether findContours returns 2 values or 3 (OpenCV 3)
cnt, hier = cv2.findContours(out,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]
for i in range(len(cnt)):
    area = cv2.contourArea(cnt[i])
    if area > 10000 and area < 100000:
        # Filled mask of the current contour
        mask = np.zeros_like(img)
        cv2.drawContours(mask, cnt, i, 255, -1)
        x,y,w,h = cv2.boundingRect(cnt[i])
        crop = img[y:h+y, x:w+x]
        cv2.imshow("snip", crop)
        if cv2.waitKey(0) == 27:  # Esc
            break

cv2.destroyAllWindows()
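
The area limits in the code (10000 to 100000) are much larger than the 500 to 2500 quoted for the 500x500 resized image, presumably because the code runs on the full-size scan. Contour area scales with the product of the horizontal and vertical resize factors, so limits tuned at one size can be converted to another. A small sketch, with `scale_area_range` as a hypothetical helper name and the sizes in the usage note made up for illustration:

```python
def scale_area_range(area_range, old_size, new_size):
    """Rescale (min_area, max_area) tuned on an image of old_size to new_size.

    Sizes are (width, height) in pixels.
    """
    old_w, old_h = old_size
    new_w, new_h = new_size
    # Area grows with the product of the two resize factors.
    factor = (new_w / old_w) * (new_h / old_h)
    lo, hi = area_range
    return lo * factor, hi * factor
```

For example, `scale_area_range((500, 2500), (500, 500), (1000, 2000))` gives `(4000.0, 20000.0)`. The result is only a starting point; the limits still need trial and error on the actual image.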
I.Newton
0
_, contours, hierarchy = cv2.findContours(edged, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

You are using cv2.RETR_LIST to find contours in the image. For your image, cv2.RETR_EXTERNAL should give better output. Before using it, first remove the black border line from the image; otherwise the border would be the outermost contour and hide the rectangles inside it.

cv2.RETR_LIST returns a list of all contours in the image.

cv2.RETR_EXTERNAL returns only the external (outermost) contours, not the contours nested inside them.

Change the line to:

 _, contours, hierarchy = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

Contours Hierarchy
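
To make the difference between the modes concrete: each row of the hierarchy returned by `cv2.findContours` has the form `[next, previous, first_child, parent]`, and a contour with `parent == -1` is not nested inside any other. With `cv2.RETR_TREE`, those top-level contours should correspond to what `cv2.RETR_EXTERNAL` returns. A minimal sketch on a hand-written hierarchy (the values are illustrative, not taken from the scanned image, and `top_level_indices` is a hypothetical helper name):

```python
def top_level_indices(hierarchy_rows):
    """Indices of contours that have no parent.

    Each row is [next, previous, first_child, parent], as in OpenCV.
    """
    return [i for i, row in enumerate(hierarchy_rows) if row[3] == -1]


# Two outer rectangles (0 and 2), each containing one inner contour (1 and 3).
example_hierarchy = [
    [2, -1, 1, -1],    # contour 0: outer box A, child is 1
    [-1, -1, -1, 0],   # contour 1: inside A
    [-1, 0, 3, -1],    # contour 2: outer box B, child is 3
    [-1, -1, -1, 2],   # contour 3: inside B
]

outer = top_level_indices(example_hierarchy)
```

With real output, pass `hierarchy[0]`, since OpenCV returns the hierarchy with shape `(1, N, 4)`.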

Kallz