My goal is to read a directory containing several PDF files and print the number of pages in each file using Python. I'm trying to use the pyPdf library, but it fails as soon as I loop over the files.
If I do this:
from pyPdf import PdfFileReader
testFile = "C:\\path\\file.pdf"
pdfFile = PdfFileReader(file(testFile, 'rb'))
print pdfFile.getNumPages()
This works and prints the page count.
If I do this, it fails:
import os

pdfList = []
for root, dirs, files in os.walk("C:\\path"):
    for file in files:
        pdfList.append(os.path.join(root, file))

for item in pdfList:
    targetPdf = PdfFileReader(file(item, 'rb'))
    numPages = targetPdf.getNumPages()
    print item, numPages
This always results in:
TypeError: 'str' object is not callable
If I then try to create a pyPdf object manually, I get the same error.
What am I doing wrong?
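
A likely cause, judging from the traceback: the inner loop uses file as its loop variable, which rebinds the built-in file to a string (the last filename). The later call file(item, 'rb') then tries to call that string, which would explain the TypeError and why creating a reader manually afterwards fails in the same session. Below is a minimal sketch of one way around it, not a definitive fix. The helper names collect_pdf_paths and count_pages are made up for illustration, and the .pdf extension filter is an assumption that is not in the code above.

```python
import os


def collect_pdf_paths(top):
    """Return the paths of all .pdf files found under the directory `top`."""
    pdf_paths = []
    for root, dirs, filenames in os.walk(top):
        # Loop variable is `filename`, not `file`, so no built-in is shadowed.
        for filename in filenames:
            # Assumption: only files ending in .pdf are wanted.
            if filename.lower().endswith(".pdf"):
                pdf_paths.append(os.path.join(root, filename))
    return pdf_paths


def count_pages(pdf_path):
    """Return the page count of one PDF, using the same pyPdf calls as above."""
    # Imported here so that collecting paths works even without pyPdf installed.
    from pyPdf import PdfFileReader
    # open() replaces file(); it cannot be affected by a variable named `file`.
    with open(pdf_path, "rb") as handle:
        return PdfFileReader(handle).getNumPages()


if __name__ == "__main__":
    for pdf_path in collect_pdf_paths("C:\\path"):
        print("%s %d" % (pdf_path, count_pages(pdf_path)))
```

Renaming the loop variable alone should be enough to make the original code run. Switching from file() to open() is optional, but it keeps the script from depending on a name that is easy to overwrite by accident.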