4

Given a PDF document with multiple pages, how can I check whether a given page is rotated (-90, 90 or 180°)? Preferably using Python (pdfminer, pyPDF) ...

UPDATE: The pages are scanned, and most of each page consists of text.

Dayvid Oliveira

3 Answers

6

I simply used the /Rotate attribute of the page in PyPDF2:

 pdf = PyPDF2.PdfFileReader(open('example.pdf', 'rb'))
 orientation = pdf.getPage(pagenumber).get('/Rotate')

The result can be 0, 90, 180, 270, or None (when the page has no /Rotate entry).

In 2023, this should be:

 pdf = PyPDF2.PdfReader(open('example.pdf', 'rb'))
 orientation = pdf.pages[pagenumber].get('/Rotate')
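
Because the raw value may be None, and some PyPDF2 versions reportedly return angles above 270 (such as 360 or 540), it can help to normalize it before comparing. A minimal sketch, not part of PyPDF2 itself: `normalize_rotation` and `rotated_pages` are hypothetical helper names, and the code assumes each page is dict-like and stores /Rotate as a plain integer.

```python
def normalize_rotation(value):
    """Map a raw /Rotate value (None, 0, 90, 360, 540, -90, ...) to 0, 90, 180 or 270."""
    # A missing entry (None) is treated as "not rotated".
    return int(value or 0) % 360


def rotated_pages(pages):
    """Return {page_index: angle} for every page whose normalized rotation is non-zero."""
    rotated = {}
    for index, page in enumerate(pages):
        # Assumption: each page behaves like a dict with an optional '/Rotate' key.
        angle = normalize_rotation(page.get('/Rotate'))
        if angle:
            rotated[index] = angle
    return rotated
```

With PyPDF2 this could be called as rotated_pages(pdf.pages), where pdf is the PdfReader created above. Note that this only reads the rotation stored in the PDF; it says nothing about how the content of a scanned image is oriented.
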
Spherical Cowboy
  • I know this is an old post, but why does this work? It obviously does (I'm using it), but when I review the PageObject class documentation https://pythonhosted.org/PyPDF2/PageObject.html#PyPDF2.pdf.PageObject there is nothing on that page called "Rotate". How would I know that it was an available parameter, and how would I know what other parameters are available? --- Thanks – Skinner Sep 16 '16 at 16:06
  • I think PageObject (as a dict) contains all the original attributes of the page, like "/Parent", "/MediaBox", and everything else described in PDF Reference 7.7.3.3 – Vsevolod Sipakov Sep 16 '16 at 22:42
  • I couldn't make it work with a non-editable PDF, though. Has anyone else noticed this issue? – Godfather Jul 29 '19 at 20:01
  • As of PyPDF2 version `2.10.5`, the value returned can be larger than 270, such as 360 or 540. I know these are equivalent angles to 0 or 180, respectively, but just something to keep in mind if writing code that uses the return values. – Joe Oct 06 '22 at 14:50
  • From the source code located [here](https://github.com/py-pdf/PyPDF2/blob/main/PyPDF2/constants.py), under `class PageAttributes` is `ROTATE = "/Rotate" # integer, optional; page rotation in degrees` – Joe Oct 06 '22 at 15:21
0

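If you're using pdfminer, you can get the rotation by reading the .rotate attribute of a PDFPage instance.

from pdfminer.pdfpage import PDFPage

# doc is a PDFDocument and interpreter is a PDFPageInterpreter, both created beforehand
for page in PDFPage.create_pages(doc):
    interpreter.process_page(page)
    r = page.rotate  # page rotation in degrees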
0

If you're using PDFMiner and want the orientation of each page:

from io import StringIO
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage

output_string = StringIO()
resource_manager = PDFResourceManager()
device = TextConverter(resource_manager, output_string, laparams=LAParams())
interpreter = PDFPageInterpreter(resource_manager, device)

with open('sample.pdf', 'rb') as fp:
    for page in PDFPage.get_pages(fp):
        interpreter.process_page(page)

        # mediabox is [x0, y0, x1, y1]: compare the page width with its height
        if page.mediabox[2] - page.mediabox[0] > page.mediabox[3] - page.mediabox[1]:
            orientation = 'Landscape'
        else:
            orientation = 'Portrait'
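
One caveat: the media box describes the page before /Rotate is applied, so a page with a portrait media box and a rotation of 90 or 270 is displayed as landscape. A minimal sketch that combines both values, assuming the media box is a sequence of four numbers [x0, y0, x1, y1] and the rotation is a multiple of 90; `displayed_orientation` is a hypothetical helper name, not a PDFMiner function.

```python
def displayed_orientation(mediabox, rotate=0):
    """Return 'Landscape' or 'Portrait' for the page as displayed, i.e. after rotation."""
    x0, y0, x1, y1 = (float(v) for v in mediabox)
    width, height = abs(x1 - x0), abs(y1 - y0)
    # A quarter turn (90, 270, -90, ...) swaps the visible width and height.
    if (rotate or 0) % 180 == 90:
        width, height = height, width
    return 'Landscape' if width > height else 'Portrait'
```

Inside the loop above, this could be called as displayed_orientation(page.mediabox, page.rotate).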