Questions tagged [pypdf]

pypdf is a pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files, and it can retrieve text and metadata from PDFs.

pypdf is a pure-Python library built as a PDF toolkit. It is capable of:

  • extracting document information (title, author, ...),
  • splitting documents page by page,
  • merging documents page by page,
  • cropping pages,
  • merging multiple pages into a single page,
  • encrypting and decrypting PDF files.

Because it is pure Python, it should run on any Python platform without any dependencies on external libraries. It can also work entirely on in-memory stream objects (StringIO in Python 2, io.BytesIO in Python 3) rather than files on disk, allowing for PDF manipulation in memory. It is therefore a useful tool for websites that manage or manipulate PDFs.

pypdf was inactive from 2010 to 2022. Maintenance resumed in December 2022.

Relationship to PyPDF2

PyPDF2 was a fork of pyPdf.

PyPDF2 received many updates in 2022 but was then deprecated in favor of pypdf.

pypdf==3.1.0 is essentially the same as PyPDF2==3.0.0; only the package name was changed to pypdf.

See: https://pypdf.readthedocs.io/en/latest/meta/history.html


1451 questions
0 votes, 1 answer

Why do the with statements for reader and writer need to be nested?

with open(pdf,'rb') as fin: reader = PyPDF2.PdfFileReader(fin) new_pdf = PyPDF2.PdfFileWriter() for i in range(reader.numPages): new_pdf.addPage(reader.getPage(i)) out_file = pdf if not create_copy else self._new_copy(pdf) …
User1291
0 votes, 1 answer

Python PDF split pages to specific path

I have created a function for a PDF page splitter. I can choose a PDF file, save the path to pdfOne, and after that I can choose which pages I want to split. The problem is that the split pages go to the same path as the original PDF. I don't want that,…
TLSK
0 votes, 1 answer

Merging pages in multiple pdf documents with PyPDF2

I have been trying to mergePage with PyPDF2 using the same foreground to multiple pages in multiple documents with the following loop. for item in file_list: # loops through 16 pdf files print("Processing " + item) if item.endswith(".pdf"): …
Pugwash
0 votes, 1 answer

How to update a field with PyPDF2

I'm trying to make a pdf generator and I'm almost there but can't figure out the final step of updating the form field. I'm using PyPDF2 in a Windows environment with Python 3.6 The first step is to download the pdf (of which there are many, though…
Oceanic_Panda
0 votes, 0 answers

how to create internal links using pypdf package in python

I am trying to link the index with the content using the internal link functions add_link() and set_link() in pyfpdf, but it's not working. rtype1=pdf.add_link() pdf.set_link(rtype1,y=0.0,page=-1) pdf.cell(0,8,"Link",0,1,'',False,link=rtype1) Any help will be…
rittik
0 votes, 1 answer

Open pdf file with Django

I'm trying to merge two pdf files in Django with PyPDF2 and ReportLab. My view is as follows: @login_required def export_to_pdf(request, user_id): member = Member.objects.filter(user_id=user_id).values('user_id', …
Boky
0 votes, 1 answer

Applying a UDF into a for loop - Python

Example of PDF: "Smith#00$Consolidated_Performance.pdf" The goal is to add a bookmark to page 1 of each PDF based on the filename. (Bookmark name in example would be "Consolidated Performance") import os from openpyxl import load_workbook from…
xTHx
0 votes, 1 answer

Append a one page pdf to the end of all pdfs in a directory - python

I'm trying to add a 1 page pdf (lastpage) to the end of all invoice pdfs in a directory then rename the pdf as newname based on the filestart ('ICO_' + HH Name). Issue 1.) My code is summing the previous invoices on top of the 1 page (1 = 1 + last,…
theurlin
0 votes, 1 answer

parsing the pdf file using PyPDF 2

By asynchronously, what I mean to say is that, as you can see in the second screenshot, the address and phone details are getting mixed. I have a task to parse a pdf file using python scripting with some specific attributes. I have to fetch first name, last…
0 votes, 1 answer

Cannot merge pdf in python v3.6

I have the following code segment which has been tested to work in python ver2.7 The code merges multiple pdfs into a single pdf. from PyPDF2 import PdfFileMerger, PdfFileReader #merge individual pdfs of each page into a single pdf merger =…
user3848207
0 votes, 4 answers

PyPDF2: Stream has ended unexpectedly

I have a Python script which uses PyPDF2 to reverse the order of pages of a PDF. from PyPDF2 import PdfFileWriter, PdfFileReader output = PdfFileWriter() rpage = [] name = input("What's the file called?") filename = name.split('.', 1) input1 =…
0 votes, 0 answers

PyPdf2 extracting text with n in front of certain letters

This may just be due to PyPdf2's extract text function but when I run the code below in order to rename the files, a lot of the most common words come out like "Nthe", "Nfrom" and "Ncommunications". I'm not sure what I can do to stop this happening…
Trent
0 votes, 4 answers

Read all PDFs in a directory (image)

I have attached an image to help show what I've done. I'm trying to write a program that will add a blank page to all PDFs in the directory that have an odd number of pages. However I can't seem to read all the PDFs in a directory. The script I have…
mbf94
0 votes, 1 answer

PDF File Security Settings

Which Python modules or libraries can be used to change or set the permissions of a PDF file? I want to disable Print, Save, Save as, and Copy for a PDF file.
Rahul.Shikhare
0 votes, 2 answers

Writing pdf with pypdf2 gives error

I'm trying to write a simple script to merge two PDFs but have run into an issue when trying to save the output to disk. My code is from PyPDF2 import PdfFileWriter, PdfFileReader import tkinter as tk from tkinter import filedialog ### Prompt…
pgcudahy