I'm trying to convert some PDFs to high res jpegs using imagemagick . I'm working on win 10, 64 with python 3.62 - 64 bit and wand 0.4.4. At the command line I have :
$ /e/ImageMagick-6.9.9-Q16-HDRI/convert.exe -density 400 myfile.pdf -scale 2000x1000 test3.jpg.
which is working well for me.
In python:
from wand.image import Image
file_path = os.path.dirname(os.path.abspath(__file__))+os.sep+"myfile.pdf"
with Image(filename=file_path, resolution=400) as image:
image.save()
image_jpeg = image.convert('jpeg')
Which is giving me low res JPEGs . How do I translate this into my wand code to do the same thing?
edit:
I realized that the problem is that the input pdf has to be read into the Image object as a binary string, so based on http://docs.wand-py.org/en/0.4.4/guide/read.html#read-blob I tried:
with open(file_path,'rb') as f:
image_binary = f.read()
f.close()
with Image(blob=image_binary,resolution=400) as img:
img.transform('2000x1000', '100%')
img.make_blob('jpeg')
img.save(filename='out.jpg')
This reads the file in ok, but the output is split into 10 files. Why? I need to get this into 1 high res jpeg.
EDIT:
I need to send the jpeg to an OCR api, so I was wondering if I could write the output to a file like object. Looking at https://www.imagemagick.org/api/magick-image.php#MagickWriteImageFile, I tried :
emptyFile = Image(width=1500, height=2000)
with Image(filename=file_path, resolution=400) as image:
library.MagickResetIterator(image.wand)
# Call C-API Append method.
resource_pointer = library.MagickAppendImages(image.wand,
True)
library.MagickWriteImagesFile(resource_pointer,emptyFile)
This gives:
File "E:/ENVS/r3/pdfminer.six/ocr_space.py", line 113, in <module>
test_file = ocr_stream(filename='test4.jpg')
File "E:/ENVS/r3/pdfminer.six/ocr_space.py", line 96, in ocr_stream
library.MagickWriteImagesFile(resource_pointer,emptyFile)
ctypes.ArgumentError: argument 2: <class 'TypeError'>: wrong type
How can I get this working?