I've been searching Stack Overflow and Google for two days and still haven't found a solution. I am trying to create a PHP script that:

  • Takes a PDF uploaded to my website
  • Converts each page of the document into a separate image
  • Displays converted images

Most users who asked similar questions were pointed to ImageMagick, but my ideal solution would be a PHP library. Do you know of any?

Viktor
  • Find the number of pages in the PDF. Write a "for" or "while" loop over each page in the PDF. Append [#] to the end of the image.pdf for each # from 0 to N-1, where N is the number of pages, e.g. image.pdf[0] for the first page and read that page. Convert page to raster and save the files with whatever suffix you want. – fmw42 May 06 '20 at 21:23
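
The loop described in this comment could be sketched in PHP roughly as follows. This is a minimal sketch, assuming the imagick extension is installed with Ghostscript available for PDF input; the helper names `pageSpecs` and `convertWithImagick`, the 150 DPI default, and the output naming scheme are illustrative choices, not part of the comment.

```php
<?php
// Sketch only: assumes the imagick extension is loaded and can read PDFs.

/**
 * Build ImageMagick page specifiers such as "image.pdf[0]" for pages 0..N-1.
 */
function pageSpecs(string $pdfPath, int $pageCount): array
{
    $specs = [];
    for ($n = 0; $n < $pageCount; $n++) {
        $specs[] = $pdfPath . '[' . $n . ']';
    }
    return $specs;
}

/**
 * Rasterise every page of $pdfPath to a PNG file and return the paths written.
 * $dpi and the "_page_N.png" suffix are hypothetical defaults.
 */
function convertWithImagick(string $pdfPath, int $dpi = 150): array
{
    // Pinging reads only metadata, which is enough to count the pages.
    $probe = new Imagick();
    $probe->pingImage($pdfPath);
    $pageCount = $probe->getNumberImages();
    $probe->clear();

    $written = [];
    foreach (pageSpecs($pdfPath, $pageCount) as $n => $spec) {
        $page = new Imagick();
        // The resolution has to be set before the PDF page is read.
        $page->setResolution($dpi, $dpi);
        $page->readImage($spec);
        $page->setImageFormat('png');
        $out = $pdfPath . '_page_' . $n . '.png';
        $page->writeImage($out);
        $page->clear();
        $written[] = $out;
    }
    return $written;
}
```

Reading one page at a time keeps only a single rasterised page in memory, at the cost of reopening the PDF for each page.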

2 Answers

For the first and third points, there is plenty of information and many tutorials on the web.

For the second point, you can use this Composer package: https://github.com/spatie/pdf-to-image

  • Yes sure, my concern was about point #2 but I wanted to explain what I was doing. Will check FPDI immediately! – Viktor May 06 '20 at 21:07

php-vips can do this quickly and needs only a small amount of memory.

For example:

#!/usr/bin/env php
<?php

require __DIR__ . '/vendor/autoload.php';

use Jcupitt\Vips;

for ($i = 1; $i < count($argv); $i++) {
  $image = Vips\Image::newFromFile($argv[$i]);
  $n_pages = $image->get("n-pages");
  echo($argv[$i] . " has " . $n_pages . " pages\n");

  for ($n = 0; $n < $n_pages; $n++) {
    echo("  rendering page " . $n . " ...\n");
    $page = Vips\Image::newFromFile($argv[$i], [
      "dpi" => 30,
      "page" => $n,
      # this enables image streaming
      "access" => "sequential"
    ]);
    $page->writeToFile($argv[$i] . "_page_" . $n . ".png");
  }
}

At 30 DPI, an A4 page is about 250 pixels across, which is OK for a preview. On this modest 2015 laptop I see:

$ time ./convert-vips.php ~/pics/nipguide.pdf
/home/john/pics/nipguide.pdf has 58 pages
  rendering page 0 ...
  rendering page 1 ...
...
  rendering page 56 ...
  rendering page 57 ...

real    0m1.765s
user    0m1.645s
sys     0m0.230s

Less than two seconds to render 58 preview pages.
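
The 30 DPI figure quoted above can be checked with a little arithmetic: A4 is 210 x 297 mm and an inch is 25.4 mm. The helper below is a hypothetical illustration (`pixelsAtDpi` is not part of php-vips), and the renderer may round slightly differently.

```php
<?php
// Sketch: approximate pixel size of a page dimension rendered at a given DPI.

function pixelsAtDpi(float $millimetres, float $dpi): int
{
    // Convert millimetres to inches, then multiply by dots per inch.
    return (int) round($millimetres / 25.4 * $dpi);
}

// A4 preview size at 30 DPI, and a print-quality size at 300 DPI.
printf("A4 at 30 DPI:  %d x %d pixels\n", pixelsAtDpi(210, 30), pixelsAtDpi(297, 30));
printf("A4 at 300 DPI: %d x %d pixels\n", pixelsAtDpi(210, 300), pixelsAtDpi(297, 300));
```

This gives 248 x 351 pixels at 30 DPI, which matches the "about 250 pixels across" estimate, and 2480 x 3508 pixels at 300 DPI.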

php-vips has the following nice features:

  1. It uses poppler for PDF rendering, not Ghostscript, so it can make direct calls to the library. By contrast, packages like imagick use Ghostscript and have to process documents via hidden temporary files. This gives a useful speed increase.

  2. Poppler will generate high-quality, anti-aliased images. With Ghostscript, you need to render at a higher resolution and then scale down, making it even slower.

  3. It does progressive rendering (internally, pages are rendered as a series of chunks), so you can produce very high resolution output files if you wish.

Poppler is GPL, so you do need to be a little careful if you are distributing a program built using it.
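
For the upload-and-display flow in the question, the page loop from the example could be wrapped along these lines. This is only a sketch, assuming php-vips is installed through Composer and libvips was built with PDF support; the function names, the `page_N.png` naming, and the URL prefix are hypothetical. A real upload handler should also call `is_uploaded_file()` and move the file with `move_uploaded_file()`.

```php
<?php
// Sketch only: renderPages() needs php-vips (Composer autoloader loaded by the caller).

/** Accept only successful uploads whose content starts with the PDF magic bytes. */
function isPdfUpload(array $file): bool
{
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return false;
    }
    // Read just the first five bytes of the temporary file.
    $head = file_get_contents($file['tmp_name'], false, null, 0, 5);
    return $head === '%PDF-';
}

/** Render each page of $pdfPath into $outDir and return the PNG file names. */
function renderPages(string $pdfPath, string $outDir, int $dpi = 30): array
{
    $nPages = \Jcupitt\Vips\Image::newFromFile($pdfPath)->get('n-pages');
    $names = [];
    for ($n = 0; $n < $nPages; $n++) {
        $page = \Jcupitt\Vips\Image::newFromFile($pdfPath, [
            'dpi' => $dpi,
            'page' => $n,
            'access' => 'sequential', // stream the page rather than load it whole
        ]);
        $name = 'page_' . $n . '.png';
        $page->writeToFile($outDir . '/' . $name);
        $names[] = $name;
    }
    return $names;
}

/** Build escaped <img> tags, one per rendered page. */
function pageImageTags(array $names, string $urlPrefix): string
{
    $tags = [];
    foreach ($names as $i => $name) {
        $src = htmlspecialchars($urlPrefix . $name, ENT_QUOTES);
        $tags[] = '<img src="' . $src . '" alt="Page ' . ($i + 1) . '">';
    }
    return implode("\n", $tags);
}
```

A handler would then check `isPdfUpload($_FILES['pdf'])`, call `renderPages()` on the stored file, and echo `pageImageTags($names, 'pages/')`, where the `pdf` field name and `pages/` prefix are placeholders.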

jcupitt