I am using the pytesseract wrapper with the legacy Tesseract engine (`--oem 0`). This is the code I use to extract text from an image:

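import pytesseract

try:
    # Run OCR with the legacy engine (--oem 0) and get word-level data as a DataFrame.
    # img is the input image, loaded earlier.
    ocr_data = pytesseract.image_to_data(
        img, lang="eng", output_type=pytesseract.Output.DATAFRAME,
        config="--oem 0"
    )
except Exception as e:
    print("Trace:", e)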

Error trace:

Trace: Tesseract Open Source OCR Engine v4.0.1 with Leptonica
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 389
tesseract: intmatcher.cpp:1160: void ScratchEvidence::UpdateSumOfProtoEvidences(INT_CLASS, BIT_VECTOR): Assertion `ClassTemplate->ProtoLengths[ActualProtoNum] < MAX_PROTO_INDEX' failed.
Aborted (core dumped)

I also tried the tesseract command-line tool directly and got exactly the same error. The command I used:

tesseract img.png out --oem 0 -l eng
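
For anyone trying to reproduce this, here is a minimal sketch that runs the same command from Python and captures the exit code and stderr, so the assertion message can be inspected without going through pytesseract. It assumes the `tesseract` binary is on `PATH`; the helper names `build_tesseract_cmd` and `run_tesseract` are hypothetical and only for illustration.

```python
import subprocess


def build_tesseract_cmd(image_path, out_base, oem=0, lang="eng"):
    """Build the same argument list as: tesseract img.png out --oem 0 -l eng"""
    return ["tesseract", image_path, out_base, "--oem", str(oem), "-l", lang]


def run_tesseract(image_path, out_base, oem=0, lang="eng"):
    """Run the tesseract CLI and return (return code, stderr text).

    Assumes the tesseract binary is installed and on PATH.
    check is left at its default (False) so a crash does not raise here.
    """
    cmd = build_tesseract_cmd(image_path, out_base, oem=oem, lang=lang)
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stderr


# Example usage (not executed here, since it needs tesseract and an image):
#   code, err = run_tesseract("img.png", "out")
#   print(code)
#   print(err)
```

On POSIX systems, `subprocess` reports a negative return code when the child process is terminated by a signal, which is what I would expect to see for the abort in the trace above.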

I am using the tessdata files from this repository: https://github.com/tesseract-ocr/tessdata

I searched on Google but couldn't find anything that helps.

M Asad Ali