I am using this command to convert pages from a pdf to jpeg images:

magick convert -density 300 sample.pdf output.jpeg

I see a white background and the content of the PDF appears as a smaller image stuck to the bottom left corner of the white "canvas". Can anyone help with why this might be happening and how to prevent this "shrinking"?

My PDF has 14 pages. Here is the metadata for a few of those pages:

>magick identify sample.pdf
sample.pdf[0] PDF 2286x3600 2286x3600+0+0 16-bit sRGB 6458B 0.016u 0:00.017
sample.pdf[1] PDF 2286x3600 2286x3600+0+0 16-bit sRGB 6018B 0.016u 0:00.020
sample.pdf[2] PDF 2286x3600 2286x3600+0+0 16-bit sRGB 5732B 0.016u 0:00.023

And here are the actual and expected outputs for one of the pages:

actual output: [image: actual output]

expected output: [image: expected output]

edit: here is a sample PDF:

https://www.dropbox.com/s/0bzu5brfzbedd7i/sample.pdf?dl=0

  • Post your actual PDF file. You can post to some free hosting service (Zip it if needed) and then put the link in your question. – fmw42 Jan 22 '22 at 20:07
  • @fmw42, added sample pdf – applecodervinegar Jan 22 '22 at 20:46
  • This will not help, but in general it is best to use just `magick`, not `magick convert`, with ImageMagick 7; otherwise you will not get the newer ImageMagick 7 features. With `magick convert`, the syntax follows ImageMagick 6. – fmw42 Jan 22 '22 at 22:37

2 Answers

I am not sure why you have that behavior. There is something in the PDF, perhaps a crop box, that ImageMagick/Ghostscript is not picking up. But you can get rid of the excess white using -trim:

magick sample.pdf -trim sample_%d.jpg
fmw42
  • as a noob to imagemagick I was wondering what is the default sizing used when converting to image? – applecodervinegar Jan 22 '22 at 21:18
  • The default output dimensions for converting PDF to raster is defined by the default density of 72 dpi and the dimensions of the pdf in inches. `dimension in pixels = density * dimension in inches`. Use magick identify -verbose to see the dimensions in inches and density (i.e. resolution) of any file – fmw42 Jan 22 '22 at 22:22
  • The issue is likely, as mentioned earlier, that the enclosed raster image itself has a different density than the enclosing PDF. You might be better off using pdf2image to extract the raster images from the PDF shell. pdf2image is a Python wrapper around the Poppler utilities. See https://pdf2image.readthedocs.io/en/latest/installation.html – fmw42 Jan 23 '22 at 00:13
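
The rule quoted in the comments above (`dimension in pixels = density * dimension in inches`) can be checked with a few lines of Python. This is only an illustrative sketch: the function name `raster_size` is hypothetical, the US Letter page size is an example that does not come from the question, and rounding to the nearest pixel is an assumption about how fractional sizes are handled.

```python
def raster_size(width_in, height_in, density=72):
    """Return (width_px, height_px) for a page rendered at `density` dpi.

    Rounding to the nearest pixel is an assumption made for this sketch.
    """
    return round(width_in * density), round(height_in * density)


# Example page: US Letter, 8.5 x 11 inches (not the page from the question).
print(raster_size(8.5, 11))       # default density of 72 dpi
print(raster_size(8.5, 11, 300))  # same page with -density 300
```

Raising the density scales both pixel dimensions by the same factor, so the page proportions stay the same and only the pixel count changes.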

Thanks for the example

> magick identify sample.pdf
> sample.pdf[0] PDF 2286x3600 

This appears to be wrong, as it matches none of the dimensions found in the PDF contents:

/Width 1531
/Im0
/Height 2454
/MediaBox [0 0 1531 2454]

on a page of size:
/CropBox [0 0 919 1473]
919 pt x 1473 pt
32.42 x 51.96 cm
12.76 x 20.45 inches

So there was no problem when the images were inserted at 120 dpi (1531 px across 12.76 inches ≈ 120 dpi).
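
The 120 dpi figure follows from the numbers above: PDF user space is 72 points per inch by default, so dividing the image's pixel size by the page size in inches gives the effective resolution. A small sketch of that arithmetic (the helper name `effective_dpi` is only for illustration):

```python
POINTS_PER_INCH = 72  # default PDF user-space unit


def effective_dpi(pixels, points):
    """Resolution of an image `pixels` long placed across `points` of page."""
    inches = points / POINTS_PER_INCH
    return pixels / inches


# Image is 1531 x 2454 px; CropBox is 919 x 1473 pt.
print(round(effective_dpi(1531, 919)))   # horizontal
print(round(effective_dpi(2454, 1473)))  # vertical
```

Both directions come out at roughly the same value, which is consistent with the image having been placed without distortion.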

We can check this by zooming to 100% in a viewer, copying the image, and pasting it into, say, Paint, which confirms that the image is 1531 x 2454 pixels.

Following comments from @fmw42, I checked whether Ghostscript (which ImageMagick depends on for PDF handling) was having an effect. Processing that PDF with GS v9.55 without any special switches gave warnings and produced the output below left. So the issue seems to be caused by the way recent Ghostscript versions call/scale the page, since simple Ghostscript-based image apps (IrfanView using the GS plugin, on the left) behave the same, whilst other viewers have less of a problem, even the sister product MuPDF, as previewed on the right. The file's MediaBox, as seen and probably used for scaling by Ghostscript, therefore seems to be the culprit, although the file was processed by two other PDF handlers during generation.

[image: Ghostscript-based output (left), MuPDF preview (right)]
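
To see why a MediaBox/CropBox mismatch could produce exactly the reported symptom, here is a rough sketch. It assumes (a guess, not confirmed Ghostscript behaviour) that the renderer sizes the canvas from the MediaBox while the page content only fills the CropBox area, anchored at the PDF origin, which is the bottom-left corner. The function name `covered_fraction` is hypothetical.

```python
def covered_fraction(media_box, crop_box):
    """Fraction of the MediaBox width and height covered by the CropBox.

    Boxes are [x0, y0, x1, y1] in points, as written in the PDF.
    """
    media_w = media_box[2] - media_box[0]
    media_h = media_box[3] - media_box[1]
    crop_w = crop_box[2] - crop_box[0]
    crop_h = crop_box[3] - crop_box[1]
    return crop_w / media_w, crop_h / media_h


# Values taken from the PDF contents listed earlier in this answer.
fx, fy = covered_fraction([0, 0, 1531, 2454], [0, 0, 919, 1473])
print(f"{fx:.0%} of the width, {fy:.0%} of the height")
```

Under that assumption the content would fill only about 60% of the canvas in each direction, leaving white space above and to the right of it.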

One solution would be to use a simpler method of extracting the images as PNG: look at the Xpdf command-line tool "pdftopng", which gives a good result, although you need to work out that the optimum resolution in this case is 120 (or 240). A typical Windows command line does not need the .exe suffix, but it is best to include it when prefixing the command with a path for use from another location.

pdftopng.exe -r 120 -f 1 -l 1 sample.pdf sample
K J