
I'm trying to attach a file to a PDF file, but I'm running into some issues. I'm not sure if I'm doing something wrong or if there's a bug in PyPDF2. I'm using Python 3.10.2 for this and I downloaded the newest package for PyPDF2 through pip.

These are 3 versions of code that I tried using, but each has its own issues.

  1. This code copies the PDF properly, but the attachment fails silently. I can confirm the failure because the file size didn't grow.
from PyPDF2 import PdfReader, PdfWriter

pdfFile = open("input.pdf", "rb")
reader = PdfReader(pdfFile)
writer = PdfWriter()
writer.clone_document_from_reader(reader) # this line is different
pdfFile.close()

with open("image.png", "rb") as file:
    writer.add_attachment("image", file.read())
with open("output.pdf", "wb") as file:
    writer.write(file)
  2. This code is slightly different from the one before, but it also fails to attach the file.
from PyPDF2 import PdfReader, PdfWriter

pdfFile = open("input.pdf", "rb")
reader = PdfReader(pdfFile)
writer = PdfWriter()
writer.clone_reader_document_root(reader) # this line is different
writer.append_pages_from_reader(reader) # this line is different
pdfFile.close()

with open("image.png", "rb") as file:
    writer.add_attachment("image", file.read())
with open("output.pdf", "wb") as file:
    writer.write(file)
  3. This code actually does attach the file, but upon opening the output in Adobe Acrobat, I get the error: "There was an error opening this document. The root object is missing or invalid." I don't see any API calls for creating a root object manually in PyPDF2.
from PyPDF2 import PdfReader, PdfWriter

pdfFile = open("input.pdf", "rb")
reader = PdfReader(pdfFile)
writer = PdfWriter()
writer.append_pages_from_reader(reader) # this line is different
pdfFile.close()

with open("image.png", "rb") as file:
    writer.add_attachment("image", file.read())
with open("output.pdf", "wb") as file:
    writer.write(file)

Funnily enough, I don't get the error if I run the third version of the code without attaching the file. In that case the output opens fine, just like the output of the first two versions.
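For checking whether an attachment actually made it into the output, comparing file sizes can be combined with a search for the `/EmbeddedFiles` name in the raw bytes. This is only a rough sketch using the standard library: the helper name `attachment_report` is made up for illustration, and the byte search is a heuristic that can miss attachments when the PDF stores its catalog inside a compressed object stream.

```python
from pathlib import Path


def attachment_report(input_path, output_path):
    """Return a rough indication of whether output_path gained an attachment.

    Hypothetical helper, not part of PyPDF2/pypdf.
    """
    in_size = Path(input_path).stat().st_size
    out_bytes = Path(output_path).read_bytes()
    return {
        # Positive when the output is larger than the input.
        "size_delta": len(out_bytes) - in_size,
        # Heuristic only: the name may be hidden in a compressed object stream.
        "has_embedded_files_key": b"/EmbeddedFiles" in out_bytes,
    }
```

Calling `attachment_report("input.pdf", "output.pdf")` after each of the three versions gives a quick side-by-side comparison, though opening the file in a viewer is still the real test.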

bblizzard
    If this module fails so often, why not try another one, e.g. `PyMuPDF`? [pip doc PyMuPDF](https://pypi.org/project/PyMuPDF/) – Memristor May 03 '23 at 08:24
    As @Memristor said: it is as simple as `doc.insert_file(filename)` for a PDF document `doc` and some file that is supported as a `Document` in PyMuPDF (images among several more). – Jorj McKie May 03 '23 at 09:50

1 Answer


Upgrade to a more recent version of the PyPDF2 library. As of version 3 it's called pypdf. With pypdf 3.9.1 (the latest version as of 2023-06-09), the following produces a working PDF with the attachment, at least for me:

from pypdf import PdfReader, PdfWriter
pdfFile = open("input.pdf", "rb")
reader = PdfReader(pdfFile)
writer = PdfWriter()
writer.append_pages_from_reader(reader) 
pdfFile.close()

with open("image.png", "rb") as file:
    writer.add_attachment("image.png", file.read())
with open("output.pdf", "wb") as file:
    writer.write(file)

Hope that helps!


Edit: On further testing, I'm having some difficulty attaching files to PDFs even with the latest version of pypdf. The error you describe seems to be fixed, but I occasionally get files that don't attach properly: they appear to be attached but won't open when clicked on from within the PDF.

I have switched to attaching with pikepdf and it's working beautifully.
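For reference, a minimal sketch of what attaching with pikepdf can look like. The function name `attach_file` and the file names are placeholders chosen for illustration, and this is not necessarily the exact code I ended up with; it assumes a pikepdf version that provides `AttachedFileSpec` (3.0 or later, as far as I know).

```python
from pathlib import Path


def attach_file(pdf_path, attachment_path, output_path):
    """Embed attachment_path into pdf_path and save the result to output_path."""
    # Third-party dependency (pip install pikepdf), imported here so the
    # helper can be defined even where pikepdf is not installed.
    import pikepdf

    attachment_path = Path(attachment_path)
    with pikepdf.Pdf.open(pdf_path) as pdf:
        # Build a file specification from the file on disk.
        filespec = pikepdf.AttachedFileSpec.from_filepath(pdf, attachment_path)
        # The key is the name shown in the viewer's attachments panel.
        pdf.attachments[attachment_path.name] = filespec
        pdf.save(output_path)
```

Usage would be something like `attach_file("input.pdf", "image.png", "output.pdf")`. Keeping the file extension in the attachment name helps viewers pick the right application when the attachment is opened.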

ffalcon