
I want to use OCR on this block of text:

[image of the text block]

It works well on some lines, but on others it either detects nothing or returns gibberish. I'm fairly sure the cause is the skew of the text: if I change the angle of the block even slightly, the results for individual lines get better or worse.

Normally I would use contours to deskew the whole block, but here each line has a different skew. So I thought it would be best to separate the lines, then deskew and OCR each one independently. I wanted to use the Hough transform to detect the horizontal gaps separating the text lines, but it only seems to detect vertical lines. Do you have any idea how to fix this, or an entirely different idea for deskewing the image?

Here's the code for the Hough transform:

import cv2
import imutils
import numpy as np

def hough_lines2(cvImage):
    img = cvImage.copy()
    # The input image is already pre-processed, so no binarization is needed here
    edges = cv2.Canny(img, 50, 150, apertureSize=3)
    # Invert the edges: I want to detect lines where there is no text,
    # i.e. the space between the text lines
    inv = np.invert(edges)
    # maxLineGap=1 because I only want lines with no text in the way
    linesP = cv2.HoughLinesP(inv, 1, np.pi / 180, threshold=200,
                             minLineLength=150, maxLineGap=1)
    # Draw the detected lines in green
    img2 = cv2.cvtColor(inv, cv2.COLOR_GRAY2BGR)
    if linesP is not None:
        for x1, y1, x2, y2 in linesP[:, 0]:
            cv2.line(img2, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
    # Display the lines in the image
    cv2.namedWindow('Resized', cv2.WINDOW_NORMAL)
    cv2.resizeWindow('Resized', 600, 900)
    cv2.imshow('Resized', imutils.resize(img2, width=500))
    cv2.waitKey(0)
    return 0

And these are the detected lines:

[image of the detected lines]
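For reference, the "separate each line" idea can also be sketched without Hough lines, by counting ink pixels per row (a horizontal projection profile) and treating empty rows as gaps. This is only a minimal sketch, not a tested fix for this image: the function name `find_text_rows` and its parameters `min_ink` and `min_height` are made up for illustration, it assumes text pixels are non-zero on a zero background, and it only works when the skew is small enough that some empty rows remain between lines.

```python
import numpy as np

def find_text_rows(binary, min_ink=1, min_height=2):
    """Return (top, bottom) row ranges that contain text; bottom is exclusive.

    Assumes `binary` is a 2-D array where text pixels are non-zero and the
    background is zero (invert the image first if it is the other way round).
    """
    # Number of text pixels in every row
    ink_per_row = np.count_nonzero(binary, axis=1)
    has_ink = ink_per_row >= min_ink
    # Pad with False so every run of text rows has a start and an end transition
    padded = np.concatenate(([False], has_ink, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[::2], changes[1::2]
    # Drop runs that are too thin to be a text line (e.g. specks of noise)
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_height]

# Synthetic example: two "text lines" on a 12-row image
demo = np.zeros((12, 20), dtype=np.uint8)
demo[2:5, 3:15] = 255
demo[7:10, 1:18] = 255
print(find_text_rows(demo))
```

Each returned band could then be cropped with `binary[top:bottom]` and deskewed and passed to OCR on its own.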

Christoph Rackwitz
anon
  • Almost all the text is detected in my environment – Son of Man Jul 12 '23 at 10:40
  • Some of the text is not clear, and that is why Tesseract OCR doesn't recognize those characters. – Son of Man Jul 12 '23 at 10:40
  • this line separation is supposed to be done by OCR, not by you or opencv. – Christoph Rackwitz Jul 12 '23 at 10:47
  • Try converting the image to a binary version where the white pixels become black and the black pixels become white, and then feed that image to the OCR – Son of Man Jul 12 '23 at 10:49
  • deskewing that receipt is not for beginners. you'd best skip that. – Christoph Rackwitz Jul 12 '23 at 10:49
  • @VibrantWaves I've tried inverting the image and then using pytesseract.image_to_string(cv2.cvtColor(product_img, cv2.COLOR_BGR2RGB), config="--psm 4"), but it only detects about half of the lines. – anon Jul 12 '23 at 12:48
  • Are you constrained to use only Python? You can try Emgu CV to process the image with C# and Tesseract OCR, which will result in at least 70 percent of the text on the image being read – Son of Man Jul 12 '23 at 13:41
  • I only know how to call the OpenCV functions from C#. I'm working from the image and will try to detect the text on it using Visual Studio, then I'm going to share an Imgur image link – Son of Man Jul 12 '23 at 13:42
  • @VibrantWaves I think I fixed it, the problem was the configuration I was using. I've switched from psm 4 to psm 6 and it works just fine, thanks though! – anon Jul 12 '23 at 15:46
  • You're welcome. Glad you found a workaround – Son of Man Jul 12 '23 at 15:54
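
The change that resolved it in the comments (page segmentation mode 4 to 6) can be sketched as follows. In Tesseract, `--psm 4` assumes a single column of text of variable sizes, while `--psm 6` assumes a single uniform block of text. The helper names `psm_config` and `ocr_with_psm` are hypothetical and only for illustration; `pytesseract.image_to_string` and its `config` argument are the same call used in the comment above.

```python
def psm_config(psm):
    """Build the Tesseract config string for a page segmentation mode (0-13)."""
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be in 0..13, got {psm}")
    return f"--psm {psm}"

def ocr_with_psm(image, psm=6):
    """Run Tesseract on `image` (a NumPy array or PIL image) with the given mode."""
    # Imported lazily: pytesseract also needs the Tesseract binary installed
    import pytesseract
    return pytesseract.image_to_string(image, config=psm_config(psm))
```

To compare the two modes on the same image, one could call `ocr_with_psm(product_img, 4)` and `ocr_with_psm(product_img, 6)` and look at which lines are missing from each result.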
