OpenCV: Contour detection of shadowed image before OCR

Question

I am trying to OCR pictures of documents, and my current approach is:

1. Read the image as grayscale
2. Binarize it by thresholding
3. Warp the perspective along the contours obtained from cv2.findContours()

This works well when the image has no shadow. Now I want to get the contours from pictures with shadows. My first attempt was to use cv2.adaptiveThreshold for step 2. The adaptive threshold weakened the shadow, but the resulting image lost the contrast between the paper and the background, which made it impossible for cv2 to find the contour of the paper. So I need another method to remove the shadow.

Is there any way to remove the shadow while maintaining the background colour?

For reference, here is the sample picture I am processing with various approaches. From left to right, the results are:

1. grayscale
2. thresholding
3. adaptive thresholding
4. normalization

My goal is to obtain the second picture without the shadow.

Please note that I already have a temporary solution specific to this picture, which is to process the shadowed part of the picture separately. However, it is not a general solution for shadowed pictures, because its performance depends on the size, shape and position of the shadow, so please suggest other methods.

This is the original picture.

How about segmenting the receipt using Hough (to find the lines defining the boundary), then intersections to find the corners? — Gulzar, Sep 14 '20 at 12:42
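The intersection step of this suggestion could look something like the sketch below. It assumes the boundary lines are available in the `(rho, theta)` form that `cv2.HoughLines` returns, where a line satisfies `x*cos(theta) + y*sin(theta) = rho`. The helper name `intersect_polar_lines` is hypothetical, and detecting and selecting the four boundary lines is not shown.

```python
import numpy as np


def intersect_polar_lines(line_a, line_b):
    """Intersect two lines given as (rho, theta); return (x, y) or None if parallel."""
    rho_a, theta_a = line_a
    rho_b, theta_b = line_b
    # Each line contributes one equation: x*cos(theta) + y*sin(theta) = rho
    coeffs = np.array([[np.cos(theta_a), np.sin(theta_a)],
                       [np.cos(theta_b), np.sin(theta_b)]])
    if abs(np.linalg.det(coeffs)) < 1e-9:
        return None  # (nearly) parallel lines have no usable intersection
    x, y = np.linalg.solve(coeffs, np.array([rho_a, rho_b]))
    return float(x), float(y)


# A vertical line x = 50 (theta = 0) and a horizontal line y = 40 (theta = pi/2)
corner = intersect_polar_lines((50.0, 0.0), (40.0, np.pi / 2))
```

Intersecting each roughly horizontal boundary line with each roughly vertical one would give four corner candidates for the perspective warp.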

score 5 · Answer 1 · answered Sep 14 '20 at 18:49

Here is one way in Python/OpenCV using division normalization, optionally followed by sharpening and/or thresholding.

Input:

import cv2
import numpy as np
import skimage.filters as filters

# read the image
img = cv2.imread('receipt.jpg')

# convert to gray
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

# blur
smooth = cv2.GaussianBlur(gray, (95,95), 0)

# divide gray by the blurred image
division = cv2.divide(gray, smooth, scale=255)


# sharpen using unsharp masking
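# (the multichannel argument was dropped: recent scikit-image versions no longer accept it,
# and a 2-D grayscale input is treated as single-channel by default)
sharp = filters.unsharp_mask(division, radius=1.5, amount=1.5, preserve_range=False)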
sharp = (255*sharp).clip(0,255).astype(np.uint8)

# threshold
thresh = cv2.threshold(sharp, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

# save results
cv2.imwrite('receipt_division.png',division)
cv2.imwrite('receipt_division_sharp.png',sharp)
cv2.imwrite('receipt_division_thresh.png',thresh)


# show results
cv2.imshow('smooth', smooth)  
cv2.imshow('division', division)  
cv2.imshow('sharp', sharp)  
cv2.imshow('thresh', thresh)  
cv2.waitKey(0)
cv2.destroyAllWindows()

Division:

Sharpened:

Thresholded:
