I am trying to extract and print the text of a PDF using PyPDF2. Here is my code:
import PyPDF2
# Open in binary mode; PyPDF2 reads raw bytes
pdf_file = open('report.pdf', 'rb')
read_pdf = PyPDF2.PdfFileReader(pdf_file)
number_of_pages = read_pdf.getNumPages()
# Page indices are zero-based, so index 1 is the second page
page = read_pdf.getPage(1)
page_content = page.extractText()
print(page_content.encode('utf-8'))
As a result, I get an empty byte string along with a warning:
PdfReadWarning: Xref table not zero-indexed. ID numbers for objects will be corrected. [pdf.py:1736]
b''
I have checked that this warning by itself does not affect the results, but in my case I still get no text at all. Any suggestions? Thanks.
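
One check that might help narrow this down is to try every page rather than a single one, and see whether any of them yields text. The following is only a sketch: the helper name `pages_with_text` is made up for illustration, and it assumes the reader object exposes the same legacy PyPDF2 methods used in the code above (`getNumPages`, `getPage`, `extractText`).

```python
def pages_with_text(reader):
    """Return (index, text) pairs for pages whose extracted text is non-blank.

    `reader` is assumed to provide the legacy PyPDF2 interface from the
    question: getNumPages(), getPage(i), and page.extractText().
    The function name itself is hypothetical, not part of PyPDF2.
    """
    found = []
    for index in range(reader.getNumPages()):
        # Treat a None result the same as an empty string
        text = reader.getPage(index).extractText() or ""
        if text.strip():
            found.append((index, text))
    return found
```

With the code above it would be called as `pages_with_text(read_pdf)`. An empty list would suggest that no page has text PyPDF2 can extract (for example, if the pages are scanned images), rather than something specific to the one page being read.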