Questions tagged [tesseract-5.x]

9 questions
1
vote
0 answers

Tesseract training error: Bad box coordinates in boxfile string

I have prepared the following ground truth files: ../tesstrain/data/Chechen-ground-truth |-- 1.box |-- 1.gt.txt |-- 1.png |-- 10.box |-- 10.gt.txt |-- 10.png |-- 11.box |-- 11.gt.txt |-- 11.png |-- 12.box |-- 12.gt.txt |-- 12.png The box files are…
khashashin
  • 1,058
  • 14
  • 40
0
votes
0 answers

Tesseract 5 on Debian shows text plus gibberish on Debian 11

I have installed the latest Peppermint OS, based on Debian 12, but this version has Tesseract 5, and it outputs the text together with some gibberish. Specifically, the empty spaces between paragraphs and lines are filled with gibberish text. I have also…
Real Bezo
  • 1
  • 1
0
votes
0 answers

Tesseract can't parse my image with text and numbers

I want to extract the text and numbers from this football scoreboard image with Tesseract, but it didn't work. I have tried these configurations without getting the expected result: using a grayscale image; tesseract -psm modes 1 through 13. One of the results, though not very relevant:…
Thong Nguyen
  • 143
  • 3
  • 10
0
votes
1 answer

Tesseract Training - Error reading radical code table data/langdata/radical-stroke.txt

I've tried to train Tesseract OCR on a specific font, based on the Polish language model (pol) and my own "ground truth" text. It may be important that the text I generated does not contain all characters from the Polish charset, because in my application of…
JP JP
  • 1
  • 2
0
votes
2 answers

How to detect digits from image by using Tesseract 5?

I installed Tesseract 5 on WSL (Ubuntu 22.04.1 LTS) and tried to detect numbers from images as follows, but Tesseract returned wrong answers. How can I get the right answers? My environment: Windows 11 22H2, WSL2, Ubuntu 22.04.1 LTS, tesseract…
Static
  • 15
  • 7
0
votes
0 answers

Tesseract5-OCR Train - Segmentation fault error

I am trying to train Tesseract 5 on a new font. I am running Tesseract on WSL Ubuntu, and I followed the tutorial by Gabriel Garcia and the official Tesseract compilation docs. I am trying to train Tesseract on top of the eng.traineddata file from…
0
votes
0 answers

Trying training tesseract 5.3.0, "Couldnt find a matching blob" for every zero in training data. Other characters okay

Tesseract 5.3.0.20221222. When using the command tesseract.exe 1.png 1 box.train, I get the output: row xheight=25, but median xheight = 16; row xheight=25.5, but median xheight = 16; row xheight=25.5, but median xheight = 16; row xheight=25, but median…
Maximus
  • 21
  • 4
0
votes
0 answers

Tesseract Command Line Stdin Multiple Buffer

Is it possible to pass an array of Buffers through stdin to the tesseract command line to process them all at once? This works with existing files, but not with buffers.
Pure
  • 51
  • 6
0
votes
0 answers

Tesseract adds unnecessary space in words, and interprets I as 1

I use Tesseract (tesseract 5.3.0-rc1-2-gf2519, leptonica-1.82.0, libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.4) : libpng 1.6.39 : libtiff 4.4.0 : zlib 1.2.11 : libwebp 1.2.4 : libopenjp2 2.5.0) and tessdata_best. I am trying out some OCR using…
ken4ward
  • 2,246
  • 5
  • 49
  • 89