
I'm getting some weird output files when I try to merge a couple of PDF files using pandas and PyPDF2.

I have a single-page PDF (a certificate) that I need to merge with a two-page document which is common to all, then name the resulting output file after the person named in the original file. As there's a reasonable number of them, I wanted to automate it.

I'm not fluent in Python; I sort of stumble my way through, but I'm lost as to why some of the output files have >3500 pages and others just a few, and why none are correct.

Run one record at a time, it works, but not when I try to loop over all the records. I'd really welcome some help; I'm assuming it's something obvious I can't see.

My code is below:

from PyPDF2 import PdfFileReader, PdfFileMerger
import pandas as pd

def create_pdf(x):
    file2 = outs[x]
    file1 = certs[x]
    input1 = open(path + file1, "rb")
    input2 = open(path + 'insert.pdf', "rb")

    output = open(path2 + file2, "wb")
    merger.append(fileobj=input1, pages=(0, 1), import_bookmarks=False)
    merger.append(input2)
    merger.write(output)
    output.close()
    return

df = pd.read_csv('Affiliate Data.csv', encoding='latin1', na_values=['nan'], keep_default_na=False)

path = 'D:\\input_file Location\\'
path2 = 'D:\\Output_file_Location\\'
merger = PdfFileMerger()
pdf_files = []
certs = df['infile'].tolist()
outs= df['outfile'].tolist()
x=0

while x < 605 :
    create_pdf(x)

Thanks in advance. J

James

1 Answer


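Okay, it was obvious: I hadn't closed the input files, and I was reusing the same merger for every record. I now close the inputs inside the function and close and recreate the merger on each pass through the loop.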

from PyPDF2 import PdfFileReader, PdfFileMerger

import pandas as pd

def create_pdf(x):
    file2 = outs[x]
    file1 = certs[x]
    input1 = open(path + file1, "rb")
    input2 = open(path + 'insert.pdf', "rb")

    output = open(path2 + file2, "wb")
    merger.append(fileobj=input1, pages=(0, 1), import_bookmarks=False)
    merger.append(input2)
    merger.write(output)
    output.close()
    # *****Solution close the input files******
    input1.close()
    input2.close()

    return

df = pd.read_csv('Affiliate Data.csv', encoding='latin1', na_values=['nan'], keep_default_na=False)

path = 'D:\\input_file Location\\'
path2 = 'D:\\Output_file_Location\\'
merger = PdfFileMerger()
pdf_files = []
certs = df['infile'].tolist()
outs= df['outfile'].tolist()
x=0

while x < 605 :
    create_pdf(x)
    # *****Solution close and reopen the file merger******
    merger.close()
    merger = PdfFileMerger()

    x=x+1
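
For anyone who lands here later, the same fix can be written a little more defensively. This is only a sketch, assuming the same older PyPDF2 release as above (where PdfFileMerger still exists); the names merge_certificates, make_merger and run_from_csv are made up for illustration. It creates a fresh merger for every record, so no pages can carry over from the previous one, and uses with blocks so the files get closed even if something fails partway through.

```python
import os


def merge_certificates(certs, outs, in_dir, out_dir, make_merger,
                       insert_name="insert.pdf"):
    """Write one merged PDF per (certificate, output name) pair.

    make_merger is any zero-argument callable returning an object with
    append/write/close methods, e.g. PyPDF2's PdfFileMerger class.
    """
    for cert_name, out_name in zip(certs, outs):
        merger = make_merger()  # fresh merger: nothing left over from the last record
        try:
            # The with block closes all three files, even on error.
            with open(os.path.join(in_dir, cert_name), "rb") as cert, \
                 open(os.path.join(in_dir, insert_name), "rb") as insert, \
                 open(os.path.join(out_dir, out_name), "wb") as output:
                # First page of the certificate, then the common insert.
                merger.append(fileobj=cert, pages=(0, 1), import_bookmarks=False)
                merger.append(insert)
                merger.write(output)
        finally:
            merger.close()


def run_from_csv(csv_path, in_dir, out_dir):
    """Hypothetical driver mirroring the script above; not called here."""
    import pandas as pd
    from PyPDF2 import PdfFileMerger  # assumes an older PyPDF2 release

    df = pd.read_csv(csv_path, encoding="latin1",
                     na_values=["nan"], keep_default_na=False)
    merge_certificates(df["infile"].tolist(), df["outfile"].tolist(),
                       in_dir, out_dir, PdfFileMerger)
```

Iterating with zip over the two lists also removes the hard-coded 605 and the manual counter, so the loop stops by itself when the CSV runs out of rows.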

Hat tip to my father for pointing out the obvious flaw.

I wasn't looking forward to doing all these manually.

J

James