9

What is the best way to fix the black background that appears when converting a multi-page PDF to JPG with the Imagick PHP extension?

The following code is used in my application:

    $imagick = new Imagick($file);
    $imagick->setResolution(150,150);
    $imagick->setImageFormat("jpg");
    $imagick->setImageCompression(imagick::COMPRESSION_JPEG);
    $imagick->setImageCompressionQuality(70);
    foreach ($imagick as $c => $_page) {
        $_page->setImageBackgroundColor('white');
        $_page->adaptiveResizeImage($maxsize,$maxsize,true);
        $_page->writeImage("$file-$c.jpg");
    }

I'm aware that the flattenImages method can be used to remove the black background, as in:

    $imagick = $imagick->flattenImages();

But when the file has more than one page, flattenImages merges all the pages into a single image, so every generated JPG ends up being a copy of the last page.

I would appreciate any help.

fcaserio
  • Are you able to post an example PDF? – Danack Nov 07 '14 at 03:30
  • Sure, here is an example: http://www.faceo.com.br/temp/Manual%20Split%20hiwall%20YORK-1.pdf , and one of generated jpgs: http://www.faceo.com.br/temp/Manual%20Split%20hiwall%20YORK-1.pdf-10.jpg – fcaserio Nov 07 '14 at 14:31
  • 1
    Er, no pressure...but if it's solved your problem, how 'bout clicking that accept button? ;-) – Danack Nov 07 '14 at 23:56

2 Answers

11

Working code first - explanation to follow:

This code works, but is incredibly slow:

    $file = "./YORK.pdf";

    $maxsize = 500;

    $imagick = new Imagick($file);
    $imagick->setResolution(150,150);
    $imagick->setImageFormat("jpg");
    $imagick->setImageCompression(imagick::COMPRESSION_JPEG);
    $imagick->setImageCompressionQuality(70);

    foreach ($imagick as $c => $_page) {
        $_page->setImageBackgroundColor('white');
        $_page->adaptiveResizeImage($maxsize,$maxsize,true);
        $_page->setImageCompose(\Imagick::COMPOSITE_ATOP);
        $_page->flattenImages();
        $_page->writeImage("$file-$c-compose.jpg");
    }

This code works and is fast:

    foreach ($imagick as $c => $_page) {
        $_page->setImageBackgroundColor('white');
        $_page->adaptiveResizeImage($maxsize,$maxsize,true);
        $blankPage = new \Imagick();
        $blankPage->newPseudoImage($_page->getImageWidth(), $_page->getImageHeight(), "canvas:white");
        $blankPage->compositeImage($_page, \Imagick::COMPOSITE_ATOP, 0, 0);
        $blankPage->writeImage("$file-$c.jpg");
    }

What I think is happening is that, when it comes to writing the image, ImageMagick does the following:

  • Convert the individual layers to JPG
  • Merge them on top of each other.

JPG doesn't support transparency, so each layer that has transparency gets its transparent areas rendered as black before the merge. The code above makes the compositing onto a white background happen first, before anything is converted to JPG.

An alternative way to fix the problem is to write the output as PNG. As PNG supports transparency, the individual layers with transparency are merged correctly, and you can then convert the final image to JPG if you really want to.

Using PNG as the intermediate format may also produce a slightly higher quality output, as it may skip a 'save to JPG and decode' step. I do recommend using PNG in your workflow wherever possible, and then converting to JPG only when you serve a file to an end-user if you really need that extra bit of compression.
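
A minimal sketch of the PNG route, assuming the same kind of `$file` and `$maxsize` values as in the code above. The helper names `pageOutputPath` and `writePagesAsPng` are made up for illustration and are not part of Imagick. The sketch also sets the resolution before reading the file, which is an adjustment on my part and not something the original code does:

```php
<?php
// Hypothetical helper: builds the per-page output name used in this answer.
function pageOutputPath(string $file, int $page, string $ext): string
{
    return "$file-$page.$ext";
}

// Hypothetical helper: writes every page of $file as a PNG and returns the paths.
function writePagesAsPng(string $file, int $maxsize): array
{
    $imagick = new \Imagick();
    // Assumption: the resolution is set before reading so that it applies
    // when the PDF is rasterised.
    $imagick->setResolution(150, 150);
    $imagick->readImage($file);

    $written = [];
    foreach ($imagick as $c => $_page) {
        $_page->adaptiveResizeImage($maxsize, $maxsize, true);
        // PNG keeps the alpha channel, so transparent areas are not turned black.
        $_page->setImageFormat('png');
        $path = pageOutputPath($file, $c, 'png');
        $_page->writeImage($path);
        $written[] = $path;
    }
    return $written;
}
```

Usage would look something like `$paths = writePagesAsPng('./YORK.pdf', 500);`, after which each PNG can be converted to JPG at the point where a JPG is actually needed.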

Danack
  • Tks man, good solution to put image on top of a new white canvas! – fcaserio Nov 07 '14 at 15:56
  • I still had a problem with a black background appearing on a page (for me it was the last page, with text+image). Replacing `Imagick::COMPOSITE_ATOP` with `Imagick::COMPOSITE_OVER` seems to fix it – MarcinWolny Jan 13 '15 at 15:19
0

Working code for a multi-page PDF using Laravel:

    if ($request->hasFile('pdf_file')) {
        $getPdfFile = $request->file('pdf_file');
        $originalname = $getPdfFile->getClientOriginalName();
        $path = $getPdfFile->storeAs('PdfToJpg', $originalname);

        // file name without extension
        $filename_without_ext = pathinfo($originalname, PATHINFO_FILENAME);

        // full path of the uploaded file
        $storagePath = storage_path('app/PdfToJpg/' . $originalname);

        // read the whole PDF once, only to count its pages
        $imagick = new Imagick();
        $imagick->setResolution(300, 300);
        $imagick->readImage($storagePath);
        $pages = (int)$imagick->getNumberImages();
        $imagick->clear();

        for ($i = 0; $i < $pages; $i++) {
            // load a single page into a fresh object so pages do not pile up
            $page = new Imagick();
            $page->setResolution(300, 300);
            $page->readImage($storagePath . '[' . $i . ']');
            $page->setImageBackgroundColor('white');
            $page->setImageAlphaChannel(Imagick::ALPHACHANNEL_REMOVE);
            // mergeImageLayers returns a new object, so keep the result
            $page = $page->mergeImageLayers(Imagick::LAYERMETHOD_FLATTEN);
            $page->setImageCompressionQuality(100);
            $page->writeImage(storage_path('app/PdfToJpg/') . $i . '.jpg');
        }
    }
Y. Joy Ch. Singha