I have installed the latest Peppermint OS, based on Debian 12. This version ships Tesseract 5, and its OCR output is mixed with gibberish text: the empty spaces between paragraphs and lines are filled with gibberish.

I also have Peppermint OS based on Debian 11 on another laptop. It has Tesseract 4.1.1, and for the same image I get a near-perfect result. I use gImageReader as the OCR application on both machines, with Tesseract as the engine.

I don't know how to solve this problem. You can see the image and its Tesseract 5 output below: [original image]

[ocr output of my image] I tried to downgrade Tesseract, but it didn't work because of dependencies.
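To rule out gImageReader and compare the two engines directly, a small script along these lines could be run on both machines. It is only a rough sketch: the function names and the confidence threshold of 50 are my own assumptions, and it relies on the `tesseract` command-line tool being installed with its standard `tsv` output config. It lists the recognized words that Tesseract itself reports with low confidence, which is where the gibberish would be expected to show up.

```python
import csv
import io
import subprocess


def run_tesseract_tsv(image_path, lang="eng", psm=3):
    """Run the tesseract CLI on an image and return its TSV output as text."""
    cmd = ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm), "tsv"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def low_confidence_words(tsv_text, threshold=50.0):
    """Return (confidence, text) pairs for words scored below the threshold.

    The threshold is an arbitrary starting point, not a recommended value.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    flagged = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if not text:
            continue  # page, block, paragraph and line rows carry no text
        conf = float(row["conf"])  # integer in 4.x, may be fractional in 5.x
        if 0 <= conf < threshold:
            flagged.append((conf, text))
    return flagged
```

Calling `low_confidence_words(run_tesseract_tsv("page.png"))` on each machine (`page.png` is a placeholder for the test image) and comparing the two lists should show whether the extra text already comes from the Tesseract 5 engine itself, and trying other `psm` values may show whether page segmentation is involved.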

Real Bezo
  • This site is for questions related to programming, not for questions about using some software. Looks like a problem resulting from bad quality of the input image. – Bodo Aug 28 '23 at 09:58
  • I have seen people ask questions about Tesseract here. You might be right that this is just a software problem, but I don't think it comes from bad quality of the input image, because Tesseract 4.1.1 gives near-perfect output for the same image. – Real Bezo Aug 28 '23 at 11:20
  • If this is working with one version of Tesseract, but not the other, please file a bug report in their issue tracker – Nico Haase Aug 28 '23 at 11:27
