Questions tagged [tesseract-5.x]
9 questions
1 vote, 0 answers
Tesseract training error: Bad box coordinates in boxfile string
I have prepared the following ground truth files:
../tesstrain/data/Chechen-ground-truth
|-- 1.box
|-- 1.gt.txt
|-- 1.png
|-- 10.box
|-- 10.gt.txt
|-- 10.png
|-- 11.box
|-- 11.gt.txt
|-- 11.png
|-- 12.box
|-- 12.gt.txt
|-- 12.png
The box files are…

khashashin
0 votes, 0 answers
Tesseract 5 on Debian 11 shows text plus gibberish
I have installed the latest Peppermint OS, based on Debian 12, but this version has Tesseract 5, and it outputs the text together with some gibberish. Specifically, the empty spaces between paragraphs and lines are filled with gibberish text.
I have also…

Real Bezo
0 votes, 0 answers
Tesseract can't parse my image with text and numbers
I want to extract the text and numbers from this football scoreboard image with Tesseract, but it didn't work. I have tried these configurations without getting the expected result:
Use a grayscale image
tesseract -psm modes 1 through 13
One of the results, though not very relevant:…

Thong Nguyen
0 votes, 1 answer
Tesseract Training - Error reading radical code table data/langdata/radical-stroke.txt
I've tried to train Tesseract OCR on a specific font, based on the Polish language model (pol) and my own "ground truth" text. It may be important that the text I generated does not contain all characters from the Polish charset, because in my application of…

JP JP
0 votes, 2 answers
How to detect digits from image by using Tesseract 5?
I installed Tesseract 5 on WSL (Ubuntu 22.04.1 LTS) and tried to detect numbers in images as follows, but Tesseract returned wrong answers. How can I get the right answers?
My environment:
Windows 11 22H2
WSL2 Ubuntu 22.04.1 LTS
tesseract…

Static
0 votes, 0 answers
Tesseract5-OCR Train - Segmentation fault error
I am trying to train Tesseract 5 on a new font. I am running Tesseract on WSL Ubuntu, and I followed the tutorial by Gabriel Garcia and the official Tesseract compilation docs. I am trying to train Tesseract on top of the eng.traineddata file from…

Algocoder
0 votes, 0 answers
Training Tesseract 5.3.0: "Couldn't find a matching blob" for every zero in the training data; other characters are okay
Tesseract 5.3.0.20221222
When using the command
tesseract.exe 1.png 1 box.train
I get the output
row xheight=25, but median xheight = 16
row xheight=25.5, but median xheight = 16
row xheight=25.5, but median xheight = 16
row xheight=25, but median…

Maximus
0 votes, 0 answers
Tesseract Command Line Stdin Multiple Buffer
Is it possible to pass an array of Buffers through stdin to the tesseract command line to process them all at once? This works with existing files, but not with buffers.

Pure
0 votes, 0 answers
Tesseract adds unnecessary space in words, and interprets I as 1
I use Tesseract
tesseract 5.3.0-rc1-2-gf2519
leptonica-1.82.0
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.4) : libpng 1.6.39 : libtiff 4.4.0 : zlib 1.2.11 : libwebp 1.2.4 : libopenjp2 2.5.0
and tessdata_best
I am trying out some OCR using…

ken4ward