I can't run OSD mode in pytesseract inside a Docker image based on Ubuntu. On Windows, this command works like a charm:

pytesseract.image_to_osd(image)

Inside the Docker image, however, it raises the following error. What I want to achieve is reading the rotation info using OSD.

File "/usr/local/lib/python3.9/site-packages/pytesseract/pytesseract.py", line 263, in run_tesseract
raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v5.0.0-alpha-20210401 with Leptonica UZN file /tmp/tess__cujlspf loaded. Estimating resolution as 169 UZN file /tmp/tess__cujlspf loaded. Warning. Invalid resolution 0 dpi. Using 70 instead. Too few characters. Skipping this page Error during processing.')

Tesseract is installed correctly, because all other methods such as image_to_string work properly. The surprising thing is that when I call OSD directly from the terminal, it works:

tesseract /images/1.jpg output --psm 0
# cat output.osd
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 5.69
Script: Cyrillic
Script confidence: 0.10

Is there a bug in pytesseract, or is there a workaround? The rotation info is not included in any other Tesseract method, only in OSD. Many thanks.

troger19

1 Answer

I found a solution for this: pass the config arguments explicitly in the method call:

pytesseract.image_to_osd(file_name, config='--psm 0 -c min_characters_to_try=5')

This resolved the error, and I was able to get the angle data.
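
For anyone who wants the rotation as a number rather than raw text, here is a minimal sketch of how the OSD output could be parsed. The helper names parse_osd and read_rotation are my own (hypothetical, not part of pytesseract), and I am assuming the text returned by image_to_osd has the same "Key: value" layout as the output.osd file shown in the question. The value 5 for min_characters_to_try is simply the one that worked for me; judging by the "Too few characters" message it probably lowers the amount of text OSD needs, but I have not verified that.

```python
def parse_osd(osd_text):
    """Turn OSD text made of "Key: value" lines into a dict.

    Numeric values are converted to int or float; anything else stays a string.
    """
    result = {}
    for line in osd_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        key, value = key.strip(), value.strip()
        try:
            result[key] = int(value)
        except ValueError:
            try:
                result[key] = float(value)
            except ValueError:
                result[key] = value
    return result


def read_rotation(image, min_chars=5):
    """Hypothetical helper: run OSD with the config above and return "Rotate"."""
    import pytesseract  # imported here so parse_osd works without it installed

    config = f"--psm 0 -c min_characters_to_try={min_chars}"
    return parse_osd(pytesseract.image_to_osd(image, config=config))["Rotate"]


# Sample taken from the output.osd content in the question.
sample = """Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 5.69
Script: Cyrillic
Script confidence: 0.10"""

info = parse_osd(sample)
print(info["Rotate"], info["Script"])
```

With that in place, read_rotation(file_name) should give the angle directly, provided Tesseract finds enough characters on the page.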

NguyenHai