I am working with images of textbook pages, such as questions and handwritten notes, and I need binary versions of them for a few tasks, mainly OCR. The problem is that when an image has a slight shadow or uneven brightness, thresholding produces large black areas that cover my text.

I ran try_all_threshold from skimage.filters on my images and found that some methods work well on certain kinds of images and poorly on others. I cannot use local thresholding if it means changing parameters for each image, because I want to automate the OCR process.

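import matplotlib.pyplot as plt
from skimage.filters import try_all_threshold
from skimage.io import imread

# DIR is the folder that holds the page images (defined elsewhere)
img_path = DIR + str(11) + '.png'
sk_image = imread(img_path, as_gray=True)

fig, ax = try_all_threshold(sk_image, figsize=(20, 15))
plt.savefig('threshold.jpeg', dpi=350)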

enter image description here

Why does this black area form in the image, and how can I remove it?

Would a denoising filter such as a bilateral or Gaussian filter help? If not, please suggest another technique.

Deshwal
  • Please always post your input image separately so that others can use it for testing. We do not want to have to crop your input out of all the other images. I suggest using adaptive thresholding or division normalization before thresholding. – fmw42 Aug 27 '20 at 16:01
  • Check my answer here: https://stackoverflow.com/questions/22122309/opencv-adaptive-threshold-ocr/22127181#22127181 – Andrey Smorodov Aug 27 '20 at 16:50
  • Sauvola adaptive binarization is now the best option for most cases; the fastest solution is to use it. – Demetry Pascal Mar 04 '23 at 21:04
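For anyone who wants to try the Sauvola suggestion from the comments, here is a minimal sketch using skimage's threshold_sauvola. The synthetic page, the helper names (make_shaded_page, binarize_sauvola), and the window_size and k values are assumptions for illustration and are not taken from the question. It compares a single global Otsu threshold with the per-pixel Sauvola threshold on a page whose brightness falls off from left to right.

```python
import numpy as np
from skimage.filters import threshold_otsu, threshold_sauvola


def make_shaded_page(height=200, width=300):
    """Synthetic page: dark strokes on paper that darkens from left to right."""
    # Illumination falls from 1.0 (left) to 0.3 (right), imitating a shadow.
    illumination = np.tile(np.linspace(1.0, 0.3, width), (height, 1))
    page = illumination.copy()
    # "Text": thin horizontal strokes at 40% of the local paper brightness.
    for row in range(20, height - 20, 20):
        page[row:row + 3, 20:width - 20] *= 0.4
    return page


def binarize_sauvola(gray, window_size=25, k=0.2):
    """Return True for paper and False for ink, using a per-pixel threshold."""
    return gray > threshold_sauvola(gray, window_size=window_size, k=k)


page = make_shaded_page()
global_binary = page > threshold_otsu(page)  # one threshold for the whole page
local_binary = binarize_sauvola(page)        # threshold follows local brightness

# Fraction of pixels each method marks as ink (black).
print("Otsu ink fraction:   ", 1.0 - global_binary.mean())
print("Sauvola ink fraction:", 1.0 - local_binary.mean())
```

On this synthetic page the global threshold turns the shaded side black, which is the same effect as the black area in the question, while the local threshold keeps the paper white there. Whether one window_size works across a whole batch of real scans would still need checking on your own images.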

1 Answer

Here is one way to do that in Python/OpenCV using division normalization.

  • Read the input
  • Convert to gray
  • Smooth with Gaussian blur
  • Divide gray image by smoothed image
  • Apply unsharp masking to sharpen
  • Apply Otsu threshold
  • Save results

Input:

enter image description here

import cv2
import numpy as np
import skimage.filters as filters

# read the image
img = cv2.imread('math.png')

# convert to gray
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

# blur
smooth = cv2.GaussianBlur(gray, (95,95), 0)

# divide gray by smoothed image
division = cv2.divide(gray, smooth, scale=255)

# sharpen using unsharp masking
sharp = filters.unsharp_mask(division, radius=1.5, amount=1.5, preserve_range=False)
sharp = (255*sharp).clip(0,255).astype(np.uint8)

# threshold
thresh = cv2.threshold(sharp, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

# save results
cv2.imwrite('math_division.jpg',division)
cv2.imwrite('math_division_sharp.jpg',sharp)
cv2.imwrite('math_division_thresh.jpg', thresh)

# show results
cv2.imshow('smooth', smooth)  
cv2.imshow('division', division)  
cv2.imshow('sharp', sharp)  
cv2.imshow('thresh', thresh)  
cv2.waitKey(0)
cv2.destroyAllWindows()

Division image:

enter image description here

Sharpened image:

enter image description here

Thresholded image:

enter image description here

fmw42
  • But this involves a lot of manual tuning, such as the Gaussian filter's parameters and others. If I want to run OCR on thousands of images, it won't fit all use cases, will it? – Deshwal Aug 28 '20 at 04:34
  • Yes, it could work, since the Gaussian kernel is large. Try it on a variety of images and see. – fmw42 Aug 28 '20 at 05:12
  • Okay. Will try to do that. Thanks. Any other suggestions? I'm basically doing this for OCR. – Deshwal Aug 28 '20 at 05:38