PyPDF2 complete clone of file

Question

I am trying to copy a PDF in its entirety using PyPDF2. The following code copies the content but not the outline of the PDF.

Here is a sample PDF. Run the code as follows: python test.py <input pdf> <output dest>

Here is the code that I have so far.

from PyPDF2 import PdfFileWriter, PdfFileReader
import sys
import os.path

def main(argv):
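    # Only the input has to exist; the output file is created by the script.
    if len(argv) != 2 or not os.path.isfile(argv[0]):
        print("Invalid path")
        sys.exit(1)
    # Keep the input open until the output is written: pages are read lazily.
    with open(argv[0], "rb") as in_file:
        input_pdf = PdfFileReader(in_file)
        output_pdf = PdfFileWriter()
        input_pdf_pages = input_pdf.getNumPages()
        for i in range(0, input_pdf_pages):
            output_pdf.addPage(input_pdf.getPage(i))
        with open(argv[1], "wb") as out_file:
            output_pdf.write(out_file)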

if __name__ == "__main__":
    main(sys.argv[1:])

If you want to end up with a perfect copy, why not copy the entire file instead? — Jongware, Feb 04 '18 at 00:00
I am trying to modify contents of the pdf, but the first step is to make a proper copy of the pdf. — sp00kyb00g13, Feb 04 '18 at 01:25

score -1 · Answer 1 · answered Apr 21 '19 at 16:31

PdfFileWriter does have a number of methods for copying an entire file: appendPagesFromReader, cloneReaderDocumentRoot, and cloneDocumentFromReader.

However, I can't get them to work properly either. ;-) You might have better luck.
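
For anyone who wants to experiment, here is a minimal sketch of how the last of these is called. It assumes the legacy PyPDF2 1.x API used in the question (the camelCase names are deprecated in later releases), and `clone_pdf` is a hypothetical helper name, not part of the library. It makes no promise about the outline or the page content; inspect the output yourself.

```python
def clone_pdf(src_path, dst_path):
    """Copy src_path to dst_path via PdfFileWriter.cloneDocumentFromReader.

    Hypothetical helper; assumes the legacy PyPDF2 1.x camelCase API.
    """
    # Imported inside the function so the helper can be defined in a module
    # that must still load when PyPDF2 is not installed.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    with open(src_path, "rb") as src:
        reader = PdfFileReader(src)
        writer = PdfFileWriter()
        # Clones the reader's document root, then appends its pages.
        writer.cloneDocumentFromReader(reader)
        # Write while the source is still open: page data is read lazily.
        with open(dst_path, "wb") as dst:
            writer.write(dst)
```

Call it as `clone_pdf("in.pdf", "out.pdf")` (both file names are placeholders) and compare the result with the original in a viewer.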

benwiggy


score -1 · Answer 2 · answered May 31 '21 at 18:45

Probably not a 100% exact replica, but...

for i in range(input_pdf.getNumPages()): output_pdf.addPage(input_pdf.getPage(i))
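
The loop drops the outline because addPage copies pages only; the bookmarks live in the document catalog. A minimal sketch of re-adding them afterwards follows. It assumes the legacy PyPDF2 1.x calls getOutlines, getDestinationPageNumber and addBookmark, and that the writer already holds the same pages in the same order as the reader. `copy_outline` is a hypothetical helper, not part of the library, and it may not cope with every PDF (for example, destinations that do not point at a page are skipped).

```python
def copy_outline(reader, writer, outline=None, parent=None):
    """Recreate the reader's outline (bookmarks) on the writer, keeping nesting.

    Hypothetical helper; reader and writer are expected to behave like
    PyPDF2 1.x PdfFileReader and PdfFileWriter objects.
    """
    if outline is None:
        outline = reader.getOutlines()
    last = parent
    for item in outline:
        if isinstance(item, list):
            # A nested list holds the children of the entry just before it.
            copy_outline(reader, writer, item, last)
        else:
            page_index = reader.getDestinationPageNumber(item)
            if page_index < 0:
                continue  # destination does not resolve to a page; skip it
            # addBookmark returns a reference usable as parent for children.
            last = writer.addBookmark(item.title, page_index, parent)
```

With the names from the question, call `copy_outline(input_pdf, output_pdf)` after the page loop and before `output_pdf.write(...)`.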

Cathy Chin

This is already mentioned in the question: `for i in range(0, input_pdf_pages): output_pdf.addPage(input_pdf.getPage(i))`. Make sure you read the question thoroughly before answering. – Boris May 31 '21 at 22:35
