I am using PdfMerger from PyPDF2. My program reads all PDFs in a folder and merges them into a single file. I tested it with 15 PDF files of 99 KB each and it worked like a charm: the whole process finished within a second. However, with a large number of files the process took much longer than I anticipated. I tried merging 1000 files of 99 KB each. Reading and appending all of these PDFs took 3 seconds in total, but writing the output PDF took about 67 seconds. I also tried two levels of merging (500 files into one, the other 500 into another, then merging the final two), but it took about the same time. Is there any way to speed up the writing step?
My code is below.
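    from PyPDF2 import PdfMerger

    merger = PdfMerger()
    # dirs holds the file names found in the input folder
    for pdf in dirs:
        if pdf.endswith('pdf'):
            merger.append(pdf)
    # Writing the merged output is the slow step
    merger.write(filename)
    merger.close()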
My PyPDF2 version is 2.11.2. Each input file is 99 KB with 1 page. The output file for 1000 x 99 KB inputs is 20,050 KB.
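For anyone who wants to reproduce the per-phase timings, here is a minimal sketch of how the append and write phases could be measured separately. The `stopwatch` helper and the `timings` dictionary are hypothetical names introduced for illustration and are not part of PyPDF2. The commented usage assumes `dirs` and `filename` are defined as in the code above.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the block raises, so partial timings are not lost.
        results[label] = time.perf_counter() - start


# Hypothetical usage with the merge code from the question:
#
# from PyPDF2 import PdfMerger
#
# timings = {}
# merger = PdfMerger()
# with stopwatch("append", timings):
#     for pdf in dirs:
#         if pdf.endswith('pdf'):
#             merger.append(pdf)
# with stopwatch("write", timings):
#     merger.write(filename)
# merger.close()
# print(timings)
```

Timing each phase on its own makes it easy to confirm that `merger.write` dominates the run, rather than the reading and appending loop.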