Convert PDF images to PNG using Python on macOS

Question

I have read a lot of articles that describe how to convert PDFs to PNG images, but I simply cannot get it working. I tried to import PythonMagick at the top of my script, but it fails with ImportError: No module named PythonMagick.

Is it possible to install PythonMagick as easily as shell tools via Homebrew? For background, my Python script is much shorter than the equivalent Bash script. The only thing that is not working is the PDF-to-PNG conversion and the scaling of the final image. In Bash I use ImageMagick for this, but I want to do it in Python too, since it is a one-liner.

Any ideas?

EDIT

The code can be found on Github: https://github.com/Blackjacx/Scripts/blob/master/iconizer.py

SOLUTION FOUND

Using MagickWand works better so I am using this. To install it I did:

$ brew install imagemagick@6
$ export MAGICK_HOME=/usr/local/opt/imagemagick@6
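
As a rough sketch of how the conversion and scaling might then look with Wand (the Python binding for MagickWand, installable with pip install Wand): this assumes ImageMagick can read PDFs on the machine, which usually requires Ghostscript. The helper names png_path_for and convert_first_page and the 1024-pixel default are hypothetical, chosen only for illustration.

```python
import os


def png_path_for(pdf_path):
    """Return pdf_path with its extension swapped for .png."""
    return os.path.splitext(pdf_path)[0] + ".png"


def convert_first_page(pdf_path, size=1024, resolution=300):
    """Rasterize the first page of pdf_path and save it as a size x size PNG."""
    # Imported lazily so png_path_for works even when Wand is not installed.
    from wand.image import Image

    out_path = png_path_for(pdf_path)
    # "[0]" selects the first page; resolution is the rasterization DPI.
    with Image(filename=pdf_path + "[0]", resolution=resolution) as img:
        img.format = "png"
        # Forces the exact pixel size and ignores the aspect ratio,
        # which is fine for square icons.
        img.resize(size, size)
        img.save(filename=out_path)
    return out_path


# Example call (needs Wand, ImageMagick and a real PDF file):
# convert_first_page("icon.pdf", size=512)
```

Rasterizing at a high resolution first and scaling down afterwards tends to give a cleaner result than rendering at 72 dpi and scaling up.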

How do I use that? I can't really get it to work since I'm new to Python. – blackjacx, Oct 22 '17 at 22:42
Sorry, fpdf is probably not the right choice, but have a look at this link using PythonMagick: http://www.xavierdupre.fr/blog/2014-03-12_nojs.html. Can you also post your code? – Chetan_Vasudevan, Oct 22 '17 at 22:45
Oh sorry, yes, my code can be found at: https://github.com/Blackjacx/Scripts/blob/master/iconizer.py – blackjacx, Oct 22 '17 at 22:46
You shouldn't amend the question to add an answer, but add a proper answer instead (self-answering is [fine](https://stackoverflow.blog/2011/07/01/its-ok-to-ask-and-answer-your-own-questions/)). – Benjamin W., Oct 22 '17 at 23:31

Chetan_Vasudevan · Answer 1 · 2017-10-22T22:54:44.857

1

Try this for the error ImportError: No module named PythonMagick

Also have a look at this link

Try changing from . import _PythonMagick to import _PythonMagick in the __init__.py of PythonMagick.
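
To locate that __init__.py without hunting through site-packages by hand, something like the following sketch may help. It uses only the standard library; find_init_file is a hypothetical helper name, and the lookup returns None when Python cannot find the package at all, which would itself explain the ImportError.

```python
import importlib.util


def find_init_file(package_name):
    """Return the file that `import package_name` would load, or None."""
    # find_spec looks a top-level name up on sys.path without importing it.
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    # For a regular package, origin is the path of its __init__.py.
    return spec.origin


print(find_init_file("PythonMagick"))
```

Run it with the same interpreter that runs your script, since each Python installation has its own site-packages directory.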

edited Oct 22 '17 at 22:54

answered Oct 22 '17 at 22:48

Where can I find this file? – blackjacx Oct 22 '17 at 22:53
site-packages directory – Chetan_Vasudevan Oct 22 '17 at 22:54

score 1 · Answer 2 · answered Oct 23 '17 at 01:12

I don't recommend ImageMagick, because it renders the output from pixels rather than from the vectors inside the PDF file. So if the resolution at which the PDF is rasterized is much lower than the resolution of your output PNG file, there will be a loss of quality.

Try MuPDF instead. The exact mudraw command varies by version, but most of the time it should be:

mudraw [-h 1080] [-w 1080] [-o <output_path>] <input_path>

This tool works with the vectors directly, so there won't be any quality loss no matter how far you zoom into your original file.
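
Since the question asks for a Python solution, here is a minimal sketch of calling the command above from a script. It assumes a mudraw executable is on the PATH and uses only the -h, -w and -o flags shown above; build_mudraw_command and render_pdf are hypothetical helper names.

```python
import subprocess


def build_mudraw_command(input_path, output_path, width=None, height=None):
    """Build the mudraw argument list; width and height are in pixels."""
    cmd = ["mudraw"]
    if height is not None:
        cmd += ["-h", str(height)]
    if width is not None:
        cmd += ["-w", str(width)]
    cmd += ["-o", output_path, input_path]
    return cmd


def render_pdf(input_path, output_path, width=None, height=None):
    """Run mudraw and raise CalledProcessError if it exits with an error."""
    cmd = build_mudraw_command(input_path, output_path, width, height)
    subprocess.run(cmd, check=True)


# Example call (needs MuPDF installed and a real PDF file):
# render_pdf("icon.pdf", "icon.png", width=1080, height=1080)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.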

answered Oct 23 '17 at 01:12

Sraw

I am going to try mupdf. Meanwhile, can you please confirm what I understood from your answer? Currently, I zoom a PDF file to 1600%, take a snapshot, and save it as a PNG file, so the resolution is good to a very large extent, but beyond that I start seeing pixelation. Of course, a PDF is a vector image whereas a PNG is a raster one. Will using mupdf preserve the original resolution after conversion to PNG? – SKR Oct 10 '18 at 03:21
@SKR As long as your PDF is vector-based. I mean, a PDF can still contain raw images, right? Those are not vector-based. But vector graphics can be zoomed in without any loss. – Sraw Oct 10 '18 at 03:25
Yes, you're right, an image scanned at a predefined DPI and converted to PDF still has limitations. What I'm using is entirely vector-based; even at 6400% it is crystal clear. So I zoom to 1600%, take a snapshot, and convert it to PNG because I have to perform OCR. Doing this manually for each PDF image is cumbersome, so I am looking to automate the process or find a utility that can convert a pure-vector PDF to PNG at a good enough resolution. Is mupdf free software? – SKR Oct 10 '18 at 03:39
Sure it is free. – Sraw Oct 10 '18 at 03:41

benwiggy · Answer 3 · 2022-09-08T15:03:04.900

Apple has removed the long-dead Python 2, but if you have installed Python 3 (and the pyobjc library), you can use Apple's own CoreGraphics APIs. The following Python 3 script converts PDF files, supplied as arguments, to PNG.

It can also be used in Automator's "Run Shell Script" action.

#!/usr/bin/env python3

"""
PDF2PNG v.3.0: Creates a bitmap image from each page of each PDF supplied to it.
by Ben Byram-Wigfield
Now written for python3. You may need to install pyobjc with pip3.

"""
import os, sys
import Quartz
# from LaunchServices import (kUTTypeJPEG, kUTTypeTIFF, kUTTypePNG, kCFAllocatorDefault) 

kUTTypeJPEG = 'public.jpeg'
kUTTypeTIFF = 'public.tiff'
kUTTypePNG = 'public.png'
kCFAllocatorDefault = None

resolution = 300.0 #dpi
scale = resolution/72.0

cs = Quartz.CGColorSpaceCreateWithName(Quartz.kCGColorSpaceSRGB)
whiteColor = Quartz.CGColorCreate(cs, (1, 1, 1, 1))
# Options: Quartz.kCGImageAlphaNoneSkipLast (no trans), Quartz.kCGImageAlphaPremultipliedLast 
transparency = Quartz.kCGImageAlphaNoneSkipLast

#Save image to file
def writeImage (image, url, type, options):
    destination = Quartz.CGImageDestinationCreateWithURL(url, type, 1, None)
    Quartz.CGImageDestinationAddImage(destination, image, options)
    Quartz.CGImageDestinationFinalize(destination)
    return

def getFilename(filepath):
    i=0
    newName = filepath
    while os.path.exists(newName):
        i += 1
        newName = filepath + " %02d"%i
    return newName

if __name__ == '__main__':

    for filename in sys.argv[1:]:
        filenameNonU = filename.encode('utf8')
        pdf = Quartz.CGPDFDocumentCreateWithProvider(Quartz.CGDataProviderCreateWithFilename(filenameNonU))
        print(pdf, filenameNonU)
        numPages = Quartz.CGPDFDocumentGetNumberOfPages(pdf)
        shortName = os.path.splitext(filename)[0]
        prefix = os.path.splitext(os.path.basename(filename))[0]
        folderName = getFilename(shortName)
        try:
            os.mkdir(folderName)
        except OSError:
            # Catch only filesystem errors and exit with a non-zero status
            print("Can't create directory '%s'"%(folderName))
            sys.exit(1)
        # For each page, create a file
        for i in range(1, numPages + 1):
            page = Quartz.CGPDFDocumentGetPage(pdf, i)
            if page:
                # Get the media box (in points) and scale it to pixels
                mediaBox = Quartz.CGPDFPageGetBoxRect(page, Quartz.kCGPDFMediaBox)
                x = Quartz.CGRectGetWidth(mediaBox)
                y = Quartz.CGRectGetHeight(mediaBox)
                x *= scale
                y *= scale
                r = Quartz.CGRectMake(0, 0, x, y)
                # Create a bitmap context, draw a white background and add the PDF
                writeContext = Quartz.CGBitmapContextCreate(None, int(x), int(y), 8, 0, cs, transparency)
                Quartz.CGContextSaveGState(writeContext)
                Quartz.CGContextScaleCTM(writeContext, scale, scale)
                Quartz.CGContextSetFillColorWithColor(writeContext, whiteColor)
                Quartz.CGContextFillRect(writeContext, r)
                Quartz.CGContextDrawPDFPage(writeContext, page)
                Quartz.CGContextRestoreGState(writeContext)
                # Convert to an "Image"
                image = Quartz.CGBitmapContextCreateImage(writeContext)
                # Create unique filename per page
                outFile = folderName + "/" + prefix + " %03d.png" % i
                outFile_nonU = outFile.encode('utf8')
                url = Quartz.CFURLCreateFromFileSystemRepresentation(kCFAllocatorDefault, outFile_nonU, len(outFile_nonU), False)
                # kUTTypeJPEG, kUTTypeTIFF, kUTTypePNG
                type = kUTTypePNG
                # See the full range of image properties on Apple's developer pages.
                options = {
                    Quartz.kCGImagePropertyDPIHeight: resolution,
                    Quartz.kCGImagePropertyDPIWidth: resolution
                    }
                writeImage(image, url, type, options)
                del page
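
To get a feel for the output size before running the script, the scaling arithmetic it uses (PDF units are points, 72 per inch, multiplied by resolution/72 and truncated to whole pixels) can be sketched on its own. The function name pixel_size and the 595 x 842 point page (roughly A4) are illustrative assumptions, not part of the script above.

```python
POINTS_PER_INCH = 72.0


def pixel_size(width_pt, height_pt, resolution=300.0):
    """Convert a page size in points to whole pixels at the given DPI."""
    scale = resolution / POINTS_PER_INCH
    # int() truncates, matching the int(x), int(y) passed to the bitmap context.
    return int(width_pt * scale), int(height_pt * scale)


print(pixel_size(595, 842))        # (2479, 3508) at 300 dpi
print(pixel_size(595, 842, 72.0))  # (595, 842): one pixel per point
```

Raising the resolution increases the pixel dimensions linearly, so the memory needed for each page bitmap grows with the square of the DPI.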

Yes, to the type: I think compression is a bit more complex and may involve creating a CGImage object with particular properties. — benwiggy, Oct 13 '20 at 14:04
