
I'm using a little script to convert PDF to JPG. It works, but the quality is very poor.

The script:

$im = new imagick( 'document.pdf[ 0]' ); 
$im->setImageColorspace(255); 
$im->setResolution(300, 300);
$im->setCompressionQuality(95); 
$im->setImageFormat('jpeg'); 
$im->writeImage('thumb.jpg'); 
$im->clear(); 
$im->destroy();

One more thing: I want to keep the original size of the PDF, but the conversion crops the JPG.

Kurt Pfeifle
Leon van der Veen

6 Answers

It can be done using setResolution, but you need to do it before loading an image. Try something like this:

// instantiate Imagick
$im = new Imagick();

// set the resolution before reading, so the PDF is rasterized at 300 DPI
$im->setResolution(300, 300);
$im->readImage('document.pdf[0]');
$im->setImageFormat('jpeg');
$im->writeImage('thumb.jpg');
$im->clear();
$im->destroy();
darronz
wojtek
  • It seems that in some cases ImageMagick requires Ghostscript to be installed, because otherwise it will throw a Postscript delegate failed error – Zsolti Nov 06 '13 at 12:13
  • Why in the world do setResolution and setImageResolution do different things and have the same description in the docs?! Thank you, you totally saved me. – Hissvard Aug 31 '17 at 14:01
  • If you experience transparency problems when converting PDF to JPEG (black background), try flattening your file: $imagick = $imagick->flattenImages(); – Mario Kurzweil Dec 13 '22 at 13:10
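
The flattening tip in the last comment can also be written with mergeImageLayers(), since flattenImages() is marked as deprecated in newer Imagick releases. This is a minimal sketch, assuming the Imagick extension and Ghostscript are installed; the function name pdfPageToFlatJpeg() and its parameters are made up for illustration.

```php
<?php
// Sketch only: flatten the first page of a PDF onto white before saving as JPEG.
// pdfPageToFlatJpeg() is a hypothetical helper, not part of Imagick.
function pdfPageToFlatJpeg(string $pdfPath, string $jpgPath, int $dpi = 300): void
{
    $im = new Imagick();
    $im->setResolution($dpi, $dpi);         // must come before readImage()
    $im->readImage($pdfPath . '[0]');       // first page only

    // Transparent areas take this colour once the layers are merged.
    $im->setImageBackgroundColor('white');
    $flat = $im->mergeImageLayers(Imagick::LAYERMETHOD_FLATTEN);

    $flat->setImageFormat('jpeg');
    $flat->writeImage($jpgPath);

    $flat->clear();
    $im->clear();
}
```

A call such as pdfPageToFlatJpeg('document.pdf', 'thumb.jpg') would then replace the script from the question.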

The quality of the image produced from the PDF can be changed by setting the density (the DPI) before reading in the PDF. This value is passed to Ghostscript (gs) underneath, which rasterizes the PDF. To get a good result, supersample at double the density you require, then use resample to get back to the desired DPI. Remember to change the colorspace to RGB if you want an RGB JPEG.

A typical command line version for convert might be:

convert -density 600 'document.pdf[0]' -colorspace RGB -resample 300 output.jpg
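
To make the density figures concrete: PDF page sizes are measured in points (72 per inch), so the pixel size of the rasterized page scales linearly with the density. The sketch below works this out for an A4 page of roughly 595 x 842 points; pixelsAtDensity() is a hypothetical helper, not an ImageMagick function, and the exact pixel counts ImageMagick produces may differ by a pixel because of rounding.

```php
<?php
// PDF page sizes are measured in points; 72 points make one inch.
// pixelsAtDensity() is a hypothetical helper, not part of ImageMagick.
function pixelsAtDensity(float $points, int $dpi): int
{
    return (int) round($points / 72 * $dpi);
}

// An A4 page is roughly 595 x 842 points.
printf("%d x %d\n", pixelsAtDensity(595, 72), pixelsAtDensity(842, 72));   // 595 x 842
printf("%d x %d\n", pixelsAtDensity(595, 300), pixelsAtDensity(842, 300)); // 2479 x 3508
printf("%d x %d\n", pixelsAtDensity(595, 600), pixelsAtDensity(842, 600)); // 4958 x 7017
```

At the default density of 72 the page is only about 595 pixels wide, which is why the output looks poor when no density is set before reading.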

If you need to crop it, a -shave option following the resample is usually sensible, provided the image is centred within the page.

As for the PHP Imagick extension, I have never used it personally, so I am unsure how you specify file reading hints to it, but I would hope it is possible.
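
For what it is worth, the same supersample-then-resample idea might look like this with the PHP Imagick extension. This is a rough sketch, assuming Imagick and Ghostscript are available; supersamplePdfPage() and its defaults are invented for illustration, and it converts to sRGB on the assumption that an ordinary screen JPEG is wanted.

```php
<?php
// Sketch of: convert -density 600 'document.pdf[0]' -resample 300 output.jpg
// supersamplePdfPage() and its default values are hypothetical.
function supersamplePdfPage(string $pdfPath, string $jpgPath, int $targetDpi = 300): void
{
    $im = new Imagick();

    // The reading hint: rasterize at double the target density.
    // It only takes effect when set before the file is loaded.
    $im->setResolution($targetDpi * 2, $targetDpi * 2);
    $im->readImage($pdfPath . '[0]');

    // Scale back down to the density actually wanted.
    $im->resampleImage($targetDpi, $targetDpi, Imagick::FILTER_LANCZOS, 1);

    // Assumption: sRGB output is what a typical JPEG viewer expects.
    $im->transformImageColorspace(Imagick::COLORSPACE_SRGB);

    $im->setImageFormat('jpeg');
    $im->setImageCompressionQuality(95);
    $im->writeImage($jpgPath);
    $im->clear();
}
```

The Lanczos filter and the quality of 95 are arbitrary choices here; any other resampling filter constant or quality value can be substituted.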

Orbling
$img = new Imagick();

// this must be called before reading the image, otherwise it has no effect
$img->setResolution(200, 200);

// read the first page of the PDF
$img->readImage("{$pdf_file}[0]");
whoan
user4341845

To convert a multi-page PDF to JPG files, you can:

  1. First check the number of pages (through getNumberImages())
  2. Use a loop to generate a JPG file for each page of the PDF

Make sure that setResolution() is called before the PDF file is loaded with readImage().

<?php

$file = "./git.pdf";

// load the document once to count its pages
$im = new Imagick($file);
$noOfPagesInPDF = $im->getNumberImages();
$im->clear();

if ($noOfPagesInPDF) {
    for ($i = 0; $i < $noOfPagesInPDF; $i++) {
        $url = $file . '[' . $i . ']';
        $image = new Imagick();
        $image->setResolution(300, 300);   // before readImage()
        $image->readImage($url);
        $image->setImageFormat("jpg");
        $image->writeImage("./" . ($i + 1) . '-' . rand() . '.jpg');
        $image->clear();
    }
    echo "All pages of PDF converted.";
}
?>

Note: I wrote this answer because the accepted one does not iterate over the pages of a multi-page PDF.
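
If predictable file names are preferred over the rand() suffix used in the loop, a small helper along these lines could build them. pageOutputPath() is a hypothetical function, and the zero-padded naming scheme is only one possible convention.

```php
<?php
// pageOutputPath() is a hypothetical helper for predictable output names.
// $pageIndex is zero-based, matching the [n] page selector used when reading.
function pageOutputPath(string $pdfPath, int $pageIndex, string $outDir = '.'): string
{
    $base = pathinfo($pdfPath, PATHINFO_FILENAME);   // "git" for "./git.pdf"
    return sprintf('%s/%s-%03d.jpg', rtrim($outDir, '/'), $base, $pageIndex + 1);
}

echo pageOutputPath('./git.pdf', 0), "\n";                // ./git-001.jpg
echo pageOutputPath('./git.pdf', 11, '/tmp/out/'), "\n";  // /tmp/out/git-012.jpg
```

Inside the loop, writeImage(pageOutputPath($file, $i)) would then take the place of the rand()-based name, so repeated runs overwrite the same files instead of piling up new ones.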

Ken Lee

Try this:

HTML

<html>
  <body>
    <form action="ConvertPdfToImg.php" enctype="multipart/form-data" method="post" name="f1">
      <input id="templateDoc" name="templateDoc" type="file" />
      <input type="submit" />
    </form>
  </body>
</html>

PHP

$pdfAbsolutePath = __DIR__ . "/test.pdf";

if (move_uploaded_file($_FILES['templateDoc']["tmp_name"], $pdfAbsolutePath)) {
    $im = new Imagick($pdfAbsolutePath);
    $noOfPagesInPDF = $im->getNumberImages();

    if ($noOfPagesInPDF) {
        for ($i = 0; $i < $noOfPagesInPDF; $i++) {
            $url = $pdfAbsolutePath . '[' . $i . ']';
            $image = new Imagick($url);
            $image->setImageFormat("jpg");
            $image->writeImage(__DIR__ . "/" . ($i + 1) . '-' . rand() . '.jpg');
        }
        echo "All pages of the PDF were converted to images";
    } else {
        // only reached when the PDF has no pages
        echo "PDF doesn't have any pages";
    }
}
LobsterBaz
Sanjay Kumar N S
  • You totally missed the point and all this request handling is absolutely unrelated to the question. And you should always say clearly that you are pointing to your own blog. – Victor Schröder Apr 26 '17 at 10:02
  • Though this upload file and check pages answer was not the one that fits the question, it did help me with the question of “how to load a pdf from s3 into memory and then look for page 1” – tristanbailey Jan 01 '21 at 19:04

Ensure that the PDF is created with the correct colour profiles. I once had a JPG come out very washed out after resizing because the source file had been created with the wrong colour profile. See also: ImageMagick PDF to JPEG conversion results in green square where image should be

Community
HoleInVoid