0

I have read a lot of articles that try to describe how to convert PDFs to PNG images, but I simply cannot get it working. I tried to import PythonMagick at the top of my script, but it fails with the error ImportError: No module named PythonMagick.

Is it possible to install PythonMagick as easily as shell tools via Homebrew? The background is my Python script, which is much shorter than the equivalent Bash script. The only thing that is not working is the PDF-to-PNG conversion and the scaling of the final image. In Bash, I use ImageMagick for this, but I want to do it in Python too, since it is a one-liner.

Any ideas?

EDIT

The code can be found on Github: https://github.com/Blackjacx/Scripts/blob/master/iconizer.py

SOLUTION FOUND

Using MagickWand works better, so I am using that. To install it I did:

$ brew install imagemagick@6
$ export MAGICK_HOME=/usr/local/opt/imagemagick@6
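
For anyone landing here, a minimal sketch of the conversion and scaling step with the Wand binding might look like the following. The helper names fit_within and pdf_page_to_png, the 300 dpi render resolution and the 1024 px target are assumptions for illustration and are not taken from the linked script. It assumes Wand is installed (pip install Wand) and that ImageMagick can read PDFs, which normally requires Ghostscript.

```python
def fit_within(width, height, max_side):
    """Return (w, h) scaled so the longer side equals max_side, keeping the aspect ratio."""
    factor = max_side / max(width, height)
    return max(1, round(width * factor)), max(1, round(height * factor))


def pdf_page_to_png(pdf_path, png_path, max_side=1024, dpi=300):
    """Rasterize the first page of a PDF and save it as a scaled PNG."""
    # Imported lazily so fit_within() stays usable without Wand installed.
    from wand.image import Image

    # "[0]" selects the first page; resolution sets the rasterization dpi.
    with Image(filename=pdf_path + "[0]", resolution=dpi) as img:
        img.format = "png"
        img.resize(*fit_within(img.width, img.height, max_side))
        img.save(filename=png_path)


# Hypothetical usage:
# pdf_page_to_png("icon.pdf", "icon.png", max_side=1024)
```

Rendering at a high dpi first and then scaling down tends to give cleaner edges than rendering directly at the small size.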
martineau
blackjacx

3 Answers

1

Try this for the error ImportError: No module named PythonMagick

Also have a look at this link

Try changing from . import _PythonMagick to import _PythonMagick in the __init__.py of PythonMagick.

Chetan_Vasudevan
1

I don't recommend using ImageMagick, because it renders its output from pixels rather than from the vectors inside the PDF file. So if the resolution at which the PDF is rasterized is much lower than the resolution of your output PNG, quality will be lost.

Try MuPDF instead. The exact mudraw command varies by version. Most of the time it should be:

mudraw [-h 1080] [-w 1080] [-o <output_path>] <input_path> 

This tool renders directly from the vectors, so there won't be any quality loss no matter how far you zoom into the original file.
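
Since the question is about doing this from Python, here is a rough sketch of wrapping the command above with subprocess. The function names build_mudraw_command and render_pdf are made up for illustration, and it assumes a mudraw binary is on the PATH. On newer MuPDF releases the same options are, as far as I know, exposed through mutool draw, which is why the tool is a parameter.

```python
import subprocess


def build_mudraw_command(input_path, output_path, width=None, height=None,
                         tool=("mudraw",)):
    """Assemble the argument list for a mudraw-style render."""
    cmd = list(tool)
    if width is not None:
        cmd += ["-w", str(width)]   # target width in pixels
    if height is not None:
        cmd += ["-h", str(height)]  # target height in pixels
    cmd += ["-o", output_path, input_path]
    return cmd


def render_pdf(input_path, output_path, **kwargs):
    """Run the renderer; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_mudraw_command(input_path, output_path, **kwargs),
                   check=True)


# Hypothetical usage:
# render_pdf("icon.pdf", "icon.png", width=1080, height=1080)
# render_pdf("icon.pdf", "icon.png", width=1080, tool=("mutool", "draw"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.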

Sraw
  • I am going to try mupdf. Meanwhile, can you confirm what I understood from your answer, please? Currently, I zoom a PDF file in to 1600%, take a snapshot and save it as a PNG file, so the resolution is good up to a point, but beyond that I start seeing pixelation. Of course a PDF is a vector image whereas a PNG is a raster one. Will using mupdf preserve the original resolution after conversion to PNG? – SKR Oct 10 '18 at 03:21
  • @SKR As long as your PDF is vector-based. A PDF can still contain raw images, right? Those are not vector-based. But vector graphics can be zoomed in without any loss. – Sraw Oct 10 '18 at 03:25
  • Yes, you're right, an image scanned at a pre-defined DPI and converted to PDF still has limitations. What I'm using is entirely vector-based; even at 6400% it is crystal clear. So I zoom to 1600%, take a snapshot and convert it to PNG because I have to perform OCR. Doing this manually for each PDF image is cumbersome, so I am looking to automate this process or for some utility that can convert PDF (pure vector) to PNG with good enough resolution. Is mupdf free software? – SKR Oct 10 '18 at 03:39
  • Sure it is free. – Sraw Oct 10 '18 at 03:41
1

Apple has removed the long-dead Python 2, but if you have installed Python 3 (and the pyobjc library), you can use Apple's own CoreGraphics APIs. The following Python 3 script converts the PDF files supplied as arguments to PNG.

It can also be used in Automator's "Run Shell Script" action.

#!/usr/bin/env python3

"""
PDF2PNG v.3.0: Creates a bitmap image from each page of each PDF supplied to it.
by Ben Byram-Wigfield
Now written for python3. You may need to install pyobjc with pip3.

"""
import os, sys
import Quartz
# from LaunchServices import (kUTTypeJPEG, kUTTypeTIFF, kUTTypePNG, kCFAllocatorDefault) 

kUTTypeJPEG = 'public.jpeg'
kUTTypeTIFF = 'public.tiff'
kUTTypePNG = 'public.png'
kCFAllocatorDefault = None

resolution = 300.0 #dpi
scale = resolution/72.0

cs = Quartz.CGColorSpaceCreateWithName(Quartz.kCGColorSpaceSRGB)
whiteColor = Quartz.CGColorCreate(cs, (1, 1, 1, 1))
# Options: Quartz.kCGImageAlphaNoneSkipLast (no trans), Quartz.kCGImageAlphaPremultipliedLast 
transparency = Quartz.kCGImageAlphaNoneSkipLast

# Save a CGImage to the file at url; imageType is a UTI string such as 'public.png'
def writeImage(image, url, imageType, options):
    destination = Quartz.CGImageDestinationCreateWithURL(url, imageType, 1, None)
    Quartz.CGImageDestinationAddImage(destination, image, options)
    Quartz.CGImageDestinationFinalize(destination)

# Return filepath, or filepath with " 01", " 02", ... appended until the name is unused
def getFilename(filepath):
    i = 0
    newName = filepath
    while os.path.exists(newName):
        i += 1
        newName = filepath + " %02d" % i
    return newName

if __name__ == '__main__':

    for filename in sys.argv[1:]:
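        filenameNonU = filename.encode('utf8')
        pdf = Quartz.CGPDFDocumentCreateWithProvider(Quartz.CGDataProviderCreateWithFilename(filenameNonU))
        # Skip anything that cannot be opened as a PDF
        if pdf is None:
            print("Can't open PDF '%s'" % filename)
            continue
        print(pdf, filenameNonU)
        numPages = Quartz.CGPDFDocumentGetNumberOfPages(pdf)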
        shortName = os.path.splitext(filename)[0]
        prefix = os.path.splitext(os.path.basename(filename))[0]
        folderName = getFilename(shortName)
        try:
            os.mkdir(folderName)
        except OSError:
            print("Can't create directory '%s'" % folderName)
            sys.exit(1)
        # For each page, create a file
        for i in range(1, numPages + 1):
            page = Quartz.CGPDFDocumentGetPage(pdf, i)
            if page:
                # Get the media box and scale it from points (72 per inch) to pixels
                mediaBox = Quartz.CGPDFPageGetBoxRect(page, Quartz.kCGPDFMediaBox)
                x = Quartz.CGRectGetWidth(mediaBox)
                y = Quartz.CGRectGetHeight(mediaBox)
                x *= scale
                y *= scale
                r = Quartz.CGRectMake(0, 0, x, y)
                # Create a bitmap context, draw a white background and add the PDF
                writeContext = Quartz.CGBitmapContextCreate(None, int(x), int(y), 8, 0, cs, transparency)
                Quartz.CGContextSaveGState(writeContext)
                Quartz.CGContextScaleCTM(writeContext, scale, scale)
                Quartz.CGContextSetFillColorWithColor(writeContext, whiteColor)
                Quartz.CGContextFillRect(writeContext, r)
                Quartz.CGContextDrawPDFPage(writeContext, page)
                Quartz.CGContextRestoreGState(writeContext)
                # Convert to an "Image"
                image = Quartz.CGBitmapContextCreateImage(writeContext)
                # Create unique filename per page
                outFile = folderName + "/" + prefix + " %03d.png" % i
                outFile_nonU = outFile.encode('utf8')
                url = Quartz.CFURLCreateFromFileSystemRepresentation(kCFAllocatorDefault, outFile_nonU, len(outFile_nonU), False)
                # kUTTypeJPEG, kUTTypeTIFF, kUTTypePNG
                imageType = kUTTypePNG
                # See the full range of image properties on Apple's developer pages.
                options = {
                    Quartz.kCGImagePropertyDPIHeight: resolution,
                    Quartz.kCGImagePropertyDPIWidth: resolution
                    }
                writeImage(image, url, imageType, options)
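
To get a feel for the output size before running the script, the scale arithmetic it uses (PDF units are points, 72 per inch) can be checked on its own. This is just a sketch; the helper name pixel_size is made up, and it mirrors the script's truncation with int().

```python
def pixel_size(width_pt, height_pt, dpi=300.0):
    """Convert a page size in PDF points (72 per inch) to whole pixels at the given dpi."""
    scale = dpi / 72.0
    # int() truncates, matching how the bitmap context dimensions are computed above.
    return int(width_pt * scale), int(height_pt * scale)


# A page of 595 x 842 points (roughly A4) rendered at 300 dpi:
print(pixel_size(595, 842, 300.0))
```

Lowering the resolution value in the script shrinks the bitmap by the same factor in both dimensions.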
                
benwiggy
  • Yes, to the type: I think compression is a bit more complex and may involve creating a CGImage object with particular properties. – benwiggy Oct 13 '20 at 14:04