Getting PDF Version using Python

Question

I need to extract the PDF version from a PDF document. I tried pdfminer, but it only provides the following information:

PDF Producer
Created
Modified
Application

Below is the code I tried:

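from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument

# Open in binary mode; the with-block closes the file afterwards
with open("ibs.servlets.pdf", "rb") as fp:
    parser = PDFParser(fp)
    doc = PDFDocument(parser)
    parser.set_document(doc)
    # doc.info is a list of metadata dictionaries and may be empty
    if len(doc.info) > 0:
        info = doc.info[0]
        print(info)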

Are there any other libraries apart from pdfminer that I can use?

You could [use `pypdf`](https://stackoverflow.com/a/76188966/562769) — Martin Thoma, May 06 '23 at 12:47

Frodon · Accepted Answer · 2020-11-30T14:14:17.770


The PDF version is stored as a comment on the first line of the PDF file (the header, e.g. %PDF-1.5). I couldn't find a way to get this information using pdfminer's pdfparser, but with PyPDF2 I could retrieve it manually:

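# Import from the package root rather than the PyPDF2.pdf submodule
from PyPDF2 import PdfFileReader

doc = PdfFileReader('ibs.servlets.pdf')
doc.stream.seek(0)  # Necessary since the header comment is ignored during the PDF analysis
print(doc.stream.readline().decode().strip())  # strip the trailing newline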

Output:

%PDF-1.5
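
Since the version sits in the header comment, the standard library alone may be enough to read it. The following is a minimal sketch, assuming the header is at the very start of the file; `read_pdf_version` is a hypothetical helper name, not part of any library. Note that from PDF 1.4 onward a /Version entry in the document catalog can override the header, and this sketch does not check for that.

```python
import re

# Matches the "%PDF-M.m" header comment at the start of a PDF file.
_HEADER_RE = re.compile(rb"%PDF-(\d+)\.(\d+)")


def read_pdf_version(path):
    """Return the header version as a string such as "1.5", or None.

    Hypothetical helper: it only inspects the first bytes of the file
    and ignores any /Version override in the document catalog.
    """
    with open(path, "rb") as fp:
        head = fp.read(16)  # the header fits well within 16 bytes
    match = _HEADER_RE.match(head)  # match() anchors at the start
    if match is None:
        return None
    major, minor = match.groups()
    return f"{major.decode('ascii')}.{minor.decode('ascii')}"
```

Calling `read_pdf_version("ibs.servlets.pdf")` on the file from the question should then give the bare version number rather than the whole header line.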

edited Nov 30 '20 at 14:14

answered Nov 30 '20 at 14:08

Frodon


Thanks @Frodon. To add to this, can we get the full version of the PDF like this: 1.4 (Acrobat 5.x) – Sriram Nov 30 '20 at 15:08
Glad to help. Feel free to accept my answer – Frodon Nov 30 '20 at 15:19
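
On the follow-up about a label such as "1.4 (Acrobat 5.x)": the file itself stores only the version number, and the Acrobat name is a conventional pairing with the Acrobat release that introduced that PDF version. A rough sketch using a hand-written lookup table; `ACROBAT_BY_PDF_VERSION` and `describe_pdf_version` are hypothetical names, and the table is an assumption that covers versions 1.0 through 1.7 only.

```python
# Hand-written lookup table (an assumption, not read from the file):
# each PDF version paired with the Acrobat release that introduced it.
ACROBAT_BY_PDF_VERSION = {
    "1.0": "Acrobat 1.x",
    "1.1": "Acrobat 2.x",
    "1.2": "Acrobat 3.x",
    "1.3": "Acrobat 4.x",
    "1.4": "Acrobat 5.x",
    "1.5": "Acrobat 6.x",
    "1.6": "Acrobat 7.x",
    "1.7": "Acrobat 8.x",
}


def describe_pdf_version(version):
    """Return e.g. "1.4 (Acrobat 5.x)" for a version string like "1.4".

    Versions missing from the table are returned unchanged.
    """
    acrobat = ACROBAT_BY_PDF_VERSION.get(version)
    if acrobat is None:
        return version
    return f"{version} ({acrobat})"
```

The function takes the bare number, so the "%PDF-" prefix from the header line would need to be stripped first.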
