I have a function that takes a PDF file path as input and splits it into separate pages, as shown below:
import os
import time
from pyPdf import PdfFileReader, PdfFileWriter

def split_pages(file_path):
    print("Splitting the PDF")
    # Create a timestamped temp directory next to this script
    # (dirname is needed: __file__ itself is a file, not a directory).
    script_dir = os.path.dirname(os.path.abspath(__file__))
    temp_path = os.path.join(script_dir, "temp_" + str(int(time.time())))
    if not os.path.exists(temp_path):
        os.makedirs(temp_path)
    inputpdf = PdfFileReader(open(file_path, "rb"))
    if inputpdf.getIsEncrypted():
        inputpdf.decrypt('')
    # Write each page to its own single-page PDF: 0.pdf, 1.pdf, ...
    for i in xrange(inputpdf.numPages):
        output = PdfFileWriter()
        output.addPage(inputpdf.getPage(i))
        with open(os.path.join(temp_path, '%s.pdf' % i), "wb") as outputStream:
            output.write(outputStream)
It works for small files, but when the PDF has more than 152 pages it splits only the first 152 pages (0-151) and then stops. It also consumes all of the system's memory before I kill it.
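A minimal sketch of how the stall could be narrowed down, assuming a Unix system where the standard-library `resource` module is available: a small generator that yields page indices and logs the process's peak memory after pages have been processed. The names `peak_rss` and `iter_with_memory_log` are hypothetical helpers, not part of pyPdf or of the function above.

```python
from __future__ import print_function
import resource
import sys


def peak_rss():
    """Peak resident set size of this process (kilobytes on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def iter_with_memory_log(num_pages, every=10, out=sys.stderr):
    """Yield 0..num_pages-1, logging peak memory after every `every` pages."""
    for i in range(num_pages):
        yield i
        # Execution resumes here once the caller has finished with page i.
        if (i + 1) % every == 0 or i == num_pages - 1:
            print("page %d done, peak RSS %d" % (i, peak_rss()), file=out)
```

It could stand in for the loop header, e.g. `for i in iter_with_memory_log(inputpdf.numPages):`, so the log would show whether memory grows steadily with each page or jumps at a particular page.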
Please let me know what I'm doing wrong, where the problem occurs, and how I can correct it.