Disable output of "Page x" in tesseract when using stdout as output

Question

I am trying to use tesseract for OCR of pictures, and I would like to suppress the "Page x" line that tesseract prints for each page it scans:

:~$ tesseract stdin stdout -l eng txt
Page 1
<ocr output>

Is it possible to remove the "Page 1" from the output?

:~$ tesseract --version
tesseract 4.0.0-146-gc39a

@nguyenq that was quiet right (pun intended). Can you answer the question so I can mark it as answered and you earn some points? — Sven Lauterbach, Jan 16 '19 at 20:17

score 4 · Accepted Answer · answered Jan 17 '19 at 16:53

Try adding the `quiet` option at the end of the command.

nguyenq

Uh... how? `-quiet`? If I put it at the very end of the command it still prints the page number. If I put it before the filenames, it vomits errors at me, saying it can't open stdout, can't find my file, etc. I don't see "quiet" mentioned in the man page either. – Michael Aug 20 '20 at 21:03
Another way: add `-c debug_file=/dev/null` somewhere before the input/output params. – Speedstone Mar 25 '21 at 20:51
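
A minimal sketch of the two suggestions above, assuming `quiet` is meant as a bare config name (no leading dash) placed after all other options, and that `-c debug_file=/dev/null` goes before the input and output arguments as the comment describes. The file name `image.png` is hypothetical, and whether a `quiet` config is available may depend on the installation.

```shell
# Hypothetical input file; substitute your own image.
image="image.png"

# Only run when tesseract and the image are actually present.
if command -v tesseract >/dev/null 2>&1 && [ -f "$image" ]; then
    # Variant 1: bare "quiet" config name at the very end (no leading dash).
    tesseract "$image" stdout -l eng quiet

    # Variant 2: set debug_file via -c, placed before the input/output arguments.
    tesseract -c debug_file=/dev/null "$image" stdout -l eng
fi
```

If neither variant silences the line on a given build, redirecting stderr is the fallback that does not depend on tesseract's own options.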

lister_of_smeg · Answer 2 · 2020-04-02T18:42:25.383

0

If you only want to see the OCR'd text, just redirect stderr to /dev/null.

foo | tesseract - - 2>/dev/null

Or of course, to a log file if you so desire.
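
To see why the redirect leaves the OCR text untouched, here is a small sketch that does not need tesseract at all. `fake_ocr` is a hypothetical stand-in that, like tesseract according to this answer, is assumed to write the "Page 1" line to stderr and the recognized text to stdout.

```shell
# Hypothetical stand-in for tesseract: status line on stderr, text on stdout.
fake_ocr() {
    echo "Page 1" >&2
    echo "recognized text"
}

# Discard stderr: only the text on stdout is captured.
text=$(fake_ocr 2>/dev/null)
echo "$text"

# Alternatively, send stderr to a log file instead of discarding it.
log=$(mktemp)
text_logged=$(fake_ocr 2>"$log")
status=$(cat "$log")   # the log now holds the "Page 1" line
rm -f "$log"
```

The same `2>/dev/null` or `2>some.log` suffix applies unchanged to the real `tesseract - -` pipeline above.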

edited Apr 02 '20 at 18:42

answered Apr 02 '20 at 18:37
