
I am extracting embedded images from PDF pages using PyMuPDF / Fitz. This works great for some PDF files, but for certain ones the extracted image is rotated 90 degrees. I don't see any condition that could be used to detect and correct this. Has anyone experienced this? Does anyone have a solution?

I always appreciate the help!

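for img in doc.getPageImageList(i):  # i is the page number
    xref = img[0]  # first item of each entry is the image's xref
    pix = doc.extractImage(xref)  # dict holding the raw bytes and the file extension
    self.imagefilename = "p%s-%s.%s" % (i, xref, pix["ext"])
    # The context manager closes the file even if the write fails.
    with open(self.imagefilename, "wb") as imgout:
        imgout.write(pix["image"])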
TChi
  • I found this issue: https://github.com/pymupdf/PyMuPDF/issues/335 It is closed, but it doesn't seem to resolve the problem. – TChi Mar 03 '20 at 21:10

2 Answers


Message from the repo maintainer:

For the most recent PyMuPDF versions (v1.17.0 and up), I have decided to use the unrotated page for everything that can be inserted or modified. Also every information about object location on a page now pertains to the unrotated page. In addition there are complementary tools which allow transformations between the respective coordinate systems.

BTW: there is a PyMuPDF attribute Page.rotation which returns the page rotation. And you can set it via Page.setRotation(90).
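
One way to act on this, sketched under assumptions: if the sideways images come from pages whose Page.rotation is non-zero, the extracted image would need the same clockwise turn to match what a viewer shows. The helper name clockwise_to_ccw is hypothetical, and the sketch only converts the angle for use with an API that rotates counter-clockwise. Images that are rotated by their own placement matrix rather than by the page are not covered.

```python
def clockwise_to_ccw(page_rotation):
    """Convert a clockwise page rotation into the equivalent
    counter-clockwise angle, both in degrees.

    Hypothetical helper. page_rotation is assumed to be the value of
    Page.rotation, i.e. a multiple of 90.
    """
    if page_rotation % 90 != 0:
        raise ValueError("expected a multiple of 90, got %r" % (page_rotation,))
    return (360 - page_rotation) % 360


# Example: a page rotated 90 degrees clockwise corresponds to a
# 270 degree counter-clockwise turn.
ccw_angle = clockwise_to_ccw(90)
```

With Pillow, which rotates counter-clockwise, the result could be passed as image.rotate(ccw_angle, expand=True); expand=True keeps the whole picture when width and height swap.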

Jorj McKie

I found the answer to my own question here:

https://stackoverflow.com/a/39324037/8222757

Using PyPDF2:

import PyPDF2

pdf = PyPDF2.PdfFileReader(open('example.pdf', 'rb'))
orientation = pdf.getPage(pagenumber).get('/Rotate')  # pagenumber is zero-based

The result can be 0, 90, 180, 270, or None (when the page dictionary has no /Rotate key).
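
Because the lookup can return None, it may help to normalize the value before comparing it. The following is a minimal sketch, not part of PyPDF2; normalize_rotation is a hypothetical helper name, and it assumes a missing /Rotate should be treated as 0, which is the default the PDF format gives that key.

```python
def normalize_rotation(rotate_value):
    """Map a raw /Rotate value to one of 0, 90, 180, 270.

    Hypothetical helper: None (key absent) is treated as 0, and
    values outside 0..359 are wrapped into that range.
    """
    if rotate_value is None:
        return 0
    angle = int(rotate_value) % 360
    if angle % 90 != 0:
        raise ValueError("/Rotate should be a multiple of 90, got %r" % (rotate_value,))
    return angle


# Example values as they might come back from the lookup.
examples = [normalize_rotation(v) for v in (None, 0, 90, -90, 450)]
```

A page could then be flagged for correction with a check such as normalize_rotation(orientation) != 0.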

TChi