I'd like to ask whether anyone has run into the same type of error with different files. My friend and I wrote some code to convert PDF files to JPG images, but quite often it fails with an error like:

"wand.exceptions.CorruptImageError: unable to read image data C:/Users/ACERES~1/AppData/Local/Temp/magick-1364LojCg7PrnIwH1' @ error/pnm.c/ReadPNMImage/1344
Exception TypeError: TypeError("object of type 'NoneType' has no len()",) in <bound method Image.__del__ of <wand.image.Image: (empty)>> ignored"

The files we are working with are not corrupted, and they open properly. On top of that, quite a number of PDFs convert fine without this error ever showing up.

Any ideas on how to solve this?

Thanks in advance for any tips and effort.

We are using:

  • Windows 10, 64-bit;
  • JetBrains PyCharm Community Edition 2018.1 x64;
  • ImageMagick-6.9.9-50-Q8-x64-dll (Win64 dynamic at 8 bits-per-pixel component);
  • the following imports: `from wand.image import Image`, `from pyPdf import PdfFileReader, PdfFileWriter`, `import io`, `import glob`, `import os.path`.

The error occurs here:

from wand.image import Image
from pyPdf import PdfFileReader, PdfFileWriter
import io
import glob
import os.path

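# reader (a PdfFileReader) and dayName (the output folder) are set up earlier in the script.
for page_num in range(reader.getNumPages()):
    # Copy the current page into its own single-page PDF, held in memory.
    writer = PdfFileWriter()
    writer.addPage(reader.getPage(page_num))
    pdf_bytes = io.BytesIO()
    writer.write(pdf_bytes)
    pdf_bytes.seek(0)
    # Rasterize the single-page PDF at a resolution of 100.
    img = Image(file=pdf_bytes, resolution=100)  # (*error in this line)
    # convert() returns a new image rather than changing img in place.
    jpg = img.convert("jpg")
    jpg.save(filename=dayName + "/" + str(page_num) + ".jpg")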
Martin Thoma
  • "Not only that, there quite a number of PDFs that work fine and this error doesn't show up." This indicates there is a problem with some files. Do all the files come from the same place, and were they created with the same software? If not, do all the bad ones come from the same place, or were they created with the same software? What other software have you opened them in, and have you tried more than one program? – Bonzo Jun 06 '18 at 15:20
  • Is writing each page to a single JPG all that this code is doing? You might be able to eliminate the writer/BytesIO, and just use `img.save(filename=dayName+'/%d.jpg')` – emcconville Jun 06 '18 at 15:25
  • Perhaps your ImageMagick temp directory is getting full. The error is for a temporary image that ImageMagick creates in that directory. So perhaps there was not enough space in that temp directory and the image there was corrupted by an abnormal termination. – fmw42 Jun 06 '18 at 22:41
  • Might it be possible that the PDF files that work are RGB and the ones that fail are CMYK? That is, your code is expecting 3 channels and you are providing 4 with CMYK. – fmw42 Aug 23 '18 at 03:17
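
Two of the suggestions in the comments above can be tried with a short script. What follows is only a rough sketch, not a confirmed fix: it assumes Python 3 (the original code looks like Python 2) and a Wand installation that can already read PDFs. The function names `temp_free_mb`, `page_filename` and `pdf_to_jpgs` are made up for illustration. The first helper reports free space in the temp directory (fmw42's theory), and the last one reads the PDF directly with Wand, without the pyPdf writer and `BytesIO` step (emcconville's suggestion).

```python
import os
import shutil
import tempfile


def temp_free_mb():
    """Free space, in MiB, on the drive that holds the temp directory."""
    # ImageMagick honors MAGICK_TEMPORARY_PATH when it is set; otherwise
    # fall back to the directory Python reports as the system temp dir.
    tmp = os.environ.get("MAGICK_TEMPORARY_PATH", "")
    if not os.path.isdir(tmp):
        tmp = tempfile.gettempdir()
    return shutil.disk_usage(tmp).free / (1024.0 * 1024.0)


def page_filename(out_dir, page_num):
    """Hypothetical naming scheme: <out_dir>/<page_num>.jpg."""
    return os.path.join(out_dir, "%d.jpg" % page_num)


def pdf_to_jpgs(pdf_path, out_dir, resolution=100):
    """Rasterize every page of pdf_path and save each one as a JPEG."""
    # Imported here so the helpers above work without Wand installed.
    from wand.image import Image

    os.makedirs(out_dir, exist_ok=True)
    written = []
    # Let ImageMagick open the whole PDF; each page becomes one frame.
    with Image(filename=pdf_path, resolution=resolution) as pdf:
        for page_num, page in enumerate(pdf.sequence):
            # Copy the frame into a standalone image before saving it.
            with Image(image=page) as img:
                img.format = "jpeg"
                target = page_filename(out_dir, page_num)
                img.save(filename=target)
                written.append(target)
    return written
```

Calling `temp_free_mb()` just before a failing conversion shows whether the temp drive is nearly full, and `pdf_to_jpgs("input.pdf", "output")` (both names are placeholders) converts one file page by page. If the same PDFs still fail when read directly, the pyPdf round trip is probably not the cause, which would make the CMYK question in the last comment worth checking next.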

0 Answers