import fitz  # PyMuPDF

pdf_file = fitz.open(r"C:\Users\user\Downloads\example.pdf")
for page_index in range(len(pdf_file)):
    page = pdf_file[page_index]
    print(page.get_pixmap())
OSError: cannot write mode PA as PNG

How can I get images from a PDF file?

I am trying to extract the images from a PDF file.

Martin Thoma

1 Answer


The documentation for the PyMuPDF library you're using has an explicit section on extracting images from PDF documents, with this example code (which is a bit too long to include here, and under the GPL anyway).

It simplifies to something like

import fitz  # PyMuPDF

doc = fitz.open(filename)  # filename: path to the PDF
seen_xrefs = set()  # the same image can be referenced from several pages
for page_num in range(doc.page_count):
    for img in doc.get_page_images(page_num):
        xref = img[0]  # first item is the image's cross-reference number
        if xref in seen_xrefs:
            continue
        image = doc.extract_image(xref)  # dict with the raw bytes and metadata
        imgfile = f"img{xref:05d}.{image['ext']}"
        with open(imgfile, "wb") as fout:
            fout.write(image["image"])
        seen_xrefs.add(xref)
        print(f"Page {page_num}: {imgfile} ({image['width']} x {image['height']})")

when not taking masks and color spaces into account.
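If masks and color spaces do matter, one possible approach is to go through fitz.Pixmap instead of writing the raw bytes. The following is only a rough sketch, not the code from the PyMuPDF documentation: the function names unique_images and extract_images_as_png are made up for illustration, it assumes a reasonably recent PyMuPDF in which Pixmap(pixmap, mask) is available, and it always writes PNG files.

```python
from pathlib import Path


def unique_images(pages):
    """Yield (page_num, xref, smask) once per distinct image xref.

    `pages` is an iterable of per-page lists shaped like the tuples from
    Document.get_page_images(): item 0 is the image xref and item 1 is
    the xref of its soft mask (0 if there is none).
    """
    seen = set()
    for page_num, images in enumerate(pages):
        for img in images:
            xref, smask = img[0], img[1]
            if xref in seen:
                continue
            seen.add(xref)
            yield page_num, xref, smask


def extract_images_as_png(pdf_path, out_dir="."):
    """Hypothetical helper: save every image of a PDF as an RGB(A) PNG."""
    import fitz  # PyMuPDF; imported here so unique_images() works without it

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    doc = fitz.open(pdf_path)
    pages = (doc.get_page_images(n) for n in range(doc.page_count))
    written = []
    for page_num, xref, smask in unique_images(pages):
        pix = fitz.Pixmap(doc, xref)
        # More than 3 color components (e.g. CMYK): convert to RGB first,
        # because PNG cannot store such pixmaps.
        if pix.n - pix.alpha > 3:
            pix = fitz.Pixmap(fitz.csRGB, pix)
        if smask > 0:
            # The soft mask is stored as a separate image; attach it as
            # the alpha channel of the base image.
            mask = fitz.Pixmap(doc, smask)
            base = fitz.Pixmap(pix, 0) if pix.alpha else pix  # drop old alpha
            try:
                pix = fitz.Pixmap(base, mask)
            except Exception:
                # Assumption: this can fail, e.g. when mask and image
                # differ in size; keep the unmasked image in that case.
                pix = base
        target = out_dir / f"img{xref:05d}.png"
        pix.save(str(target))
        written.append((page_num, target))
    return written
```

Calling extract_images_as_png(r"C:\Users\user\Downloads\example.pdf", "images") would then write one PNG per distinct image into an images folder. Unusual cases, such as stencil masks without a color space, are not handled here.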

AKX