How to enhance Tesseract automatic text rotation capabilities for OCR?

Question

I have a set of PIL images, where some pages are correctly rotated, while others are rotated by close to 180°. This means that automatic orientation detection may fail, reporting a 2° orientation instead of 178°.

Unfortunately, Tesseract sometimes cannot understand the difference between 2° orientation and 178°, so in the latter case, the output is completely wrong.

A simple im.rotate(180) fixes this, but the step is manual, and I would like Tesseract to automatically understand whether the text is upside-down or not. The approaches I have looked at rely on the Hough transform to find the prevalent orientation in the document. In this case, however, they may fail because of the peculiar orientation of these scanned documents.

What options for automatic rotation are available, without relying on third-party scripts, but staying within Python libraries?

score 5 · Answer 1 · answered Jul 08 '20 at 18:22

I am new to StackOverflow, so please forgive me for any misleading or incorrect answers. In case anyone is still looking for an answer, pytesseract's image_to_osd function gives information about the orientation. It only reports the orientation as 0°, 90°, 180° or 270°, i.e. it determines the orientation accurately when the text is aligned with the axes. For any other orientation it still outputs one of those four angles.
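
As a minimal sketch of reading that output: assuming the text returned by image_to_osd has the layout shown in SAMPLE_OSD below (the sample values are invented for illustration, and parse_osd is a hypothetical helper name, not part of pytesseract), the rotation and its confidence could be extracted like this. A low confidence value might be a hint that the reported angle should not be trusted blindly.

```python
import re

# Assumed sample of the text returned by pytesseract.image_to_osd(image);
# the numbers are made up for illustration.
SAMPLE_OSD = """Page number: 0
Orientation in degrees: 180
Rotate: 180
Orientation confidence: 2.52
Script: Latin
Script confidence: 1.11
"""


def parse_osd(osd_text):
    """Return (rotate_angle, orientation_confidence) parsed from OSD text."""
    rotate_match = re.search(r"Rotate: (\d+)", osd_text)
    conf_match = re.search(r"Orientation confidence: ([\d.]+)", osd_text)
    if rotate_match is None or conf_match is None:
        raise ValueError("unexpected OSD output")
    return int(rotate_match.group(1)), float(conf_match.group(1))


angle, confidence = parse_osd(SAMPLE_OSD)
```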

So if you are working with small angle differences like 2° or so, this should solve the issue: first align (deskew) the text, then use the function.

Here is the code in Python:

import re
import pytesseract

# `image` is the already-aligned page; rotate(image, angle, background_color)
# is the helper taken from the deskew library's example code.
while True:
    osd_rotated_image = pytesseract.image_to_osd(image)

    # use a regex to extract the angle (as a string) from the "Rotate:" line
    angle_rotated_image = re.search(r'(?<=Rotate: )\d+', osd_rotated_image).group(0)

    # the loop re-checks the orientation after every rotation
    if angle_rotated_image == '0':
        # break the loop once the image is reported as correctly oriented
        break
    elif angle_rotated_image == '90':
        image = rotate(image, 90, (255, 255, 255))
    elif angle_rotated_image == '180':
        image = rotate(image, 180, (255, 255, 255))
    elif angle_rotated_image == '270':
        image = rotate(image, 90, (255, 255, 255))

To align the text, the deskew Python library is the best in my opinion.
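
The loop above runs until Tesseract reports 0, so it could in principle never stop on a page where detection keeps failing. A rough sketch of the same idea with a cap on the number of passes follows. The names correct_orientation, detect_rotation and rotate_image are hypothetical, and whether the helper should be given the reported angle or its complement depends on the rotation direction of the helper you use, which is assumed here. The demo uses stand-in functions instead of real images so it runs without Tesseract.

```python
def correct_orientation(image, detect_rotation, rotate_image, max_passes=4):
    """Rotate `image` until `detect_rotation` reports 0, or give up.

    detect_rotation(image) -> int: the "Rotate" angle (0, 90, 180 or 270),
        e.g. parsed from pytesseract.image_to_osd (hypothetical wrapper).
    rotate_image(image, angle) -> image: any rotation helper (hypothetical).
    """
    for _ in range(max_passes):
        angle = detect_rotation(image)
        if angle == 0:
            return image
        image = rotate_image(image, angle)
    raise RuntimeError("orientation did not settle after %d passes" % max_passes)


# Stand-in demo: an "image" is just the angle it still needs to be rotated by.
def fake_detect(img):
    return img


def fake_rotate(img, angle):
    return (img - angle) % 360


fixed = correct_orientation(180, fake_detect, fake_rotate)
```

With real data, detect_rotation would wrap the OSD call and rotate_image would wrap whichever rotate function you imported.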

Thank you.

Which rotate function did you import for this code? I tried from skimage.transform import rotate, and it gives this error when using your code: TypeError: Cannot handle this data type: (1, 1, 3) — Mobassir Hossen, Oct 11 '21 at 12:43
[rotate function](https://github.com/TarunChakitha/OCR/blob/master/OCR.py#L166) which I have taken from an [example code to use deskew lib](https://github.com/sbrunner/deskew) — Tarun Chakitha, Oct 12 '21 at 06:39
