Questions tagged [pypdf]

pypdf is a pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files, and retrieve text and metadata from them.

A Pure-Python library built as a PDF toolkit. It is capable of:

extracting document information (title, author, ...),
splitting documents page by page,
merging documents page by page,
cropping pages,
merging multiple pages into a single page,
encrypting and decrypting PDF files.

Because it is pure Python, it should run on any Python platform without depending on external libraries. It can also work entirely on in-memory stream objects (io.BytesIO in Python 3) rather than file streams, allowing for PDF manipulation in memory. It is therefore a useful tool for websites that manage or manipulate PDFs.
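As a minimal sketch of the in-memory workflow described above, assuming a recent pypdf release is installed: the helper names blank_pdf and split_pages are hypothetical and exist only for this illustration, while PdfReader, PdfWriter, add_blank_page, add_page and write are pypdf calls. No file on disk is touched; every PDF lives in an io.BytesIO buffer.

```python
import io


def blank_pdf(num_pages, width=200, height=200):
    """Build a PDF of blank pages in memory and return it as bytes."""
    # Third-party dependency (pip install pypdf), imported where it is used.
    from pypdf import PdfWriter

    writer = PdfWriter()
    for _ in range(num_pages):
        writer.add_blank_page(width=width, height=height)
    buffer = io.BytesIO()
    writer.write(buffer)
    return buffer.getvalue()


def split_pages(pdf_bytes):
    """Return one single-page PDF (as bytes) for each page of the input."""
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(io.BytesIO(pdf_bytes))
    parts = []
    for page in reader.pages:
        writer = PdfWriter()
        writer.add_page(page)
        buffer = io.BytesIO()
        writer.write(buffer)
        parts.append(buffer.getvalue())
    return parts
```

For example, split_pages(blank_pdf(3)) should give a list of three byte strings, each a standalone one-page PDF that a web handler could return directly in a response.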

pypdf was inactive from 2010 to 2022. Maintenance resumed in December 2022.

Relationship to PyPDF2

PyPDF2 was a fork of pyPdf.

PyPDF2 received many updates in 2022 but was then deprecated in favor of pypdf.

pypdf==3.1.0 is essentially the same as PyPDF2==3.0.0; only the package name was changed to pypdf.
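Since only the package name changed between those two releases, code that has to run in environments with either package can try the new name first and fall back to the old one. The following is a sketch under that assumption; load_pdf_module is a hypothetical helper name, not part of either library.

```python
import importlib


def load_pdf_module(candidates=("pypdf", "PyPDF2")):
    """Return the first importable module named in candidates, or None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


pdf_lib = load_pdf_module()
if pdf_lib is None:
    print("Neither pypdf nor PyPDF2 is installed")
else:
    # Both packages are expected to expose __version__; fall back if not.
    print("Using", pdf_lib.__name__, getattr(pdf_lib, "__version__", "unknown"))
```

New projects are better served by depending on pypdf directly; the fallback is only meant to ease a gradual migration.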

See: https://pypdf.readthedocs.io/en/latest/meta/history.html

1451 questions

-1

votes

1 answer

How can I interchangeably use glob.glob("*PDF") and os.listdir("./directory")?

I am trying to merge PDF files inside a folder. I tried running the code from the same directory and it worked; however, when I copied the code to a different location and specified the directory path of the PDF files, the merging process is not happening…

python pypdf

asked Jan 14 '21 at 20:25

sebastian

-1

votes

1 answer

Challenges with PDF/A file for extraction using Python

We have some PDF/A files for extraction, and when we try to use standard PDF extraction libraries, nothing is returned by the program for the entire page. The same program works perfectly fine for standard PDFs and returns values. Can anyone help how to…

python text-extraction pypdf pdfa

asked Dec 07 '20 at 10:46

Denish

-1

votes

1 answer

Split image/pdf based on specific text with Python

I want to split a pdf (or image if needed) based on text in it. I want to split it to get each question with its options in the pdf/image, separately like a screenshot of just that question and its options. Sample PDF…

python opencv pdf ocr pypdf

asked Dec 04 '20 at 12:06

Whiskey Jay

-1

votes

1 answer

AttributeError: '_io.BufferedReader' object has no attribute 'page

I am trying to extract text from a PDF file which consists of text, tables, and images, and want to save that file on the local system. This was the code I was developing: from PyPDF2 import PdfFileReader # Load the pdf to the PdfFileReader object with…

python pdf text-extraction pypdf

asked Nov 02 '20 at 08:38

netha

-1

votes

1 answer

How do I fix this error when installing pyPDF2 in Python

I receive the following error when trying to install pyPDF2 using following text at the command prompt: python -m pip install pyPDF2 Any suggestions to resolve? Error result: Microsoft Windows [Version 10.0.19042.572] (c) 2020 Microsoft Corporation.…

python pip pypdf

asked Nov 01 '20 at 03:26

vitaminC

-1

votes

1 answer

I have converted a PDF file to CSV using Anaconda Python 3, but the converted CSV file is not in a readable form. How do I make it readable?

# importing required modules import PyPDF2 # creating a pdf file object pdfFileObj = open(path, 'rb') # creating a pdf reader object pdfReader = PyPDF2.PdfFileReader(pdfFileObj) # printing number of pages in pdf file…

python pandas csv pypdf

asked Oct 30 '20 at 15:54

Jawahar

-1

votes

1 answer

Reading from pdf file to text yields no results

So I'm trying something very simple: I just want to read text from a PDF file into a variable - that's it. This is what I'm getting: Does anyone know a reliable way to just read a PDF into a text file?

python pdf text filereader pypdf

asked Sep 01 '20 at 18:07

Rasmus Edvardsen

-1

votes

1 answer

How to flip a pdf page upside down using python?

I'm trying to flip PDF pages upside down using Python. I have tried multiple libraries like PyPDF2, PyMuPDF and pdfminer. There is documentation on how to rotate a page, but that is not what I'm looking for. The closest solution I found was on one…

python pdf pypdf pdfminer pymupdf

asked Aug 23 '20 at 10:09

Ajay Alex

-1

votes

1 answer

What is wrong with this PDF when trying to get a word count

I am trying to write a python app to give me a word count for PDFs. I've run into something odd with this PDF though. When I extract the text from the PDF, it shows up as some sort of binary/symbol garbage. I have tried PyPDF2 and PyMuPDF libs with…

python python-3.x pdf pypdf pymupdf

asked Aug 04 '20 at 01:33

tynick

-1

votes

1 answer

How to split a PDF every 4 pages using PyPDF2 in python?

Found some sample code online that splits a PDF into 2 pages but couldn't figure out how to change it to 4 pages; any tips will be appreciated. #!/usr/bin/env python3 from PyPDF2 import PdfFileWriter, PdfFileReader import glob, sys pdfs =…

python python-3.x pdf pypdf

asked Apr 19 '20 at 17:52

FAN360

-1

votes

1 answer

How to convert output into a pdf file

Say if I have some functions, in this case below a function which calculates the mode and another function to calculate the mean of a list of numbers, and then followed by printing a statement 'Hello World!' and finally followed by printing a…

python pdf reportlab pypdf pdfdocument

asked Apr 04 '20 at 23:41

Leockl

-1

votes

1 answer

Use Python to determine if PDF was generated by Google Docs

I'd like to use Python to tell if a PDF was created by Google Docs. Is there any sort of metadata I can gather with PyPDF2 to determine this?

python pdf google-docs pypdf

asked Mar 26 '20 at 22:09

Arya

-1

votes

1 answer

How can I rotate every page in a PDF with Python / PyPDF4?

I scanned a bunch of papers into a pdf but they seem to all be rotated, is there a way to rotate the pages with python? I did see the question in Python - Batch rotate pdf with PyPDF2 but am looking for a more generic solution.

python pdf pypdf

asked Mar 21 '20 at 21:41

Vivek Gani

-1

votes

1 answer

The extractText() function does not return text

pdfFileObject = open('MDD.pdf', 'rb') pdfReader = PyPDF2.PdfFileReader(pdfFileObject) count = pdfReader.numPages for i in range(count): page = pdfReader.getPage(i) print(page.extractText()) Above is my code and when I run the script it just…

python python-3.x pypdf

asked Jan 26 '20 at 15:14

danited

-1

votes

2 answers

Unable to import PyPDF2 after installing

I have installed PyPDF2 via pip3 install PyPDF2. The installation was successful. I am trying to import into Python unsuccessfully, and I do not know what is going on! I am using Python 3.7 After entering: from PyPDF2 import PdfFileReader The…

python pypdf

asked Dec 29 '19 at 09:30

chickenwings123
