I have thousands of multi-page PDFs, and each PDF has a different resolution (depending on the scanner used to scan it). I want to convert each page of each PDF to PNG so I can pass it to Tesseract for OCR. I used ImageMagick for the conversion, but I have to pass one fixed DPI for all images to get good, readable output. Is there a way to convert each PDF while preserving that PDF's own resolution?
For example, if 1.PDF has a resolution of 622 × 788 and 2.pdf has a resolution of 792 × 612, I want each one converted at exactly the same resolution, just in a different format (PNG).
The command I am using right now is:
convert -monochrome -density 1200 input.pdf -resize 25% -monochrome -white-threshold 50% -black-threshold -50% output.png
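To make the relationship between page size and DPI concrete, here is a minimal sketch. It assumes the sizes quoted above are page dimensions in PostScript points (1/72 inch), in which case the rendered width in pixels is roughly points × density / 72, so `-density 72` would give a PNG with the same numeric dimensions as the page. The helper name `density_for` is hypothetical and only does the arithmetic; it does not call ImageMagick.

```shell
#!/bin/sh
# density_for PAGE_WIDTH_PT TARGET_WIDTH_PX
# Hypothetical helper: prints the DPI (rounded to the nearest integer) at
# which a page PAGE_WIDTH_PT points wide renders TARGET_WIDTH_PX pixels wide.
# Assumes 1 point = 1/72 inch, so pixels = points * density / 72.
density_for() {
    awk -v pt="$1" -v px="$2" 'BEGIN { printf "%d\n", (px * 72 / pt) + 0.5 }'
}

# Possible use with the command above (not run here):
#   convert -density "$(density_for 612 2550)" input.pdf output.png
```

If the per-file page width can be read from each PDF, the same calculation could be run once per file instead of hard-coding a single `-density` for the whole batch.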
Thanks, pashah