I have a large batch of PDFs that I can't OCR directly, because each one already contains a small field of renderable text.
I'm trying to convert them all to TIFF so I can convert them back and run OCR, but I'm running into problems invoking the programs I'd expect to do the job. The installs went through without issue, but I keep getting errors saying the associated commands don't exist:
c:\Program Files\Python37\Lib\site-packages>pip install tesseract
Requirement already satisfied: tesseract in c:\program files\python37\lib\site-packages (0.1.3)
c:\Program Files\Python37\Lib\site-packages>tesseract --version
'tesseract' is not recognized as an internal or external command,
operable program or batch file.
c:\Program Files\Python37\Lib\site-packages>pip install ghostscript
Requirement already satisfied: ghostscript in c:\program files\python37\lib\site-packages (0.6)
Requirement already satisfied: setuptools in c:\program files\python37\lib\site-packages (from ghostscript) (40.8.0)
c:\Program Files\Python37\Lib\site-packages>gs --version
'gs' is not recognized as an internal or external command,
operable program or batch file.
c:\Program Files\Python37\Lib\site-packages>gswin32c --version
'gswin32c' is not recognized as an internal or external command,
operable program or batch file.
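For reference, the same check can be made from Python to see whether any of these executables is visible on PATH at all. This is only a minimal sketch using the standard library's `shutil.which`; the helper name `find_missing` is made up for illustration and is not part of either package.

```python
import shutil


def find_missing(commands):
    """Return the commands that shutil.which cannot resolve on PATH."""
    # find_missing is a hypothetical helper name, used here for illustration only.
    return [cmd for cmd in commands if shutil.which(cmd) is None]


if __name__ == "__main__":
    # The three command names tried in the console session above.
    for cmd in ("tesseract", "gs", "gswin32c"):
        location = shutil.which(cmd)
        print(cmd, "->", location if location else "not found on PATH")
```

If a command shows up as not found here as well, the shell simply cannot see an executable by that name, whatever pip reports about the package.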
Any ideas what I'm doing wrong?
Bonus points if you've got a better way to perform the overall task.