My goal is to read a directory with several PDF files and return the number of pages in each file using Python. I'm trying to use the pyPdf library but it fails.

If I do this:

from pyPdf import PdfFileReader

testFile = "C:\\path\\file.pdf"
pdfFile = PdfFileReader(file(testFile, 'rb'))
print pdfFile.getNumPages()

I get the page count, as expected.

If I do this, it fails:

import os

pdfList = []
for root, dirs, files in os.walk("C:\\path"):
   for file in files:
     pdfList.append(os.path.join(root, file))

for item in pdfList:
  targetPdf = PdfFileReader(file(item,'rb'))
  numPages = targetPdf.getNumPages()
  print item, numPages

This always results in:

TypeError: 'str' object is not callable

If I try to recreate a pyPdf object manually, I get the same thing.

What am I doing wrong?

Tensigh

1 Answer

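The problem is that you use the name file as a variable. In the first for loop, file is the loop variable, so it is rebound to a string (the last filename visited) and shadows the built-in file() function. The later statement, targetPdf = PdfFileReader(file(item,'rb')), then tries to call that string, which raises TypeError: 'str' object is not callable.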
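The shadowing can be reproduced without pyPdf. This is only a minimal sketch: it uses the built-in len as a stand-in for file (file exists only in Python 2), and the function name shadowing_demo and the filenames are made up for illustration.

```python
def shadowing_demo():
    """Rebind a built-in name in a loop, then try to call it."""
    # Inside this function, `len` becomes a local variable holding a string,
    # just as `file` became a string in the question's first loop.
    for len in ["a.pdf", "b.pdf"]:
        pass
    try:
        len("abc")  # `len` is now the string "b.pdf", not the built-in
    except TypeError as exc:
        return str(exc)


print(shadowing_demo())
```

The loop variable keeps its last value after the loop ends, which is why the later call fails even though it sits outside the loop.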

Try renaming the loop variable in the first for loop from file to fileName, so the built-in is no longer shadowed. Hope that helps.
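A sketch of the renamed loop might look like the following. Several things here are assumptions not in the original post: the helper names find_pdfs and count_pages, the filter on the .pdf extension, and the page_counter parameter, which keeps the sketch independent of any particular PDF library. It also uses open() instead of file(), since open() is available in both Python 2 and Python 3.

```python
import os


def find_pdfs(top):
    """Return the paths of all .pdf files under the directory `top`."""
    pdf_paths = []
    for root, dirs, filenames in os.walk(top):
        # The loop variable is `filename`, so no built-in name is rebound.
        for filename in filenames:
            # Assumption: only files ending in .pdf are of interest.
            if filename.lower().endswith(".pdf"):
                pdf_paths.append(os.path.join(root, filename))
    return pdf_paths


def count_pages(pdf_paths, page_counter):
    """Map each path to page_counter(file handle opened in binary mode)."""
    counts = {}
    for path in pdf_paths:
        with open(path, "rb") as handle:
            counts[path] = page_counter(handle)
    return counts
```

With pyPdf as used in the question, the counter could be passed as lambda fh: PdfFileReader(fh).getNumPages(), for example count_pages(find_pdfs("C:\\path"), lambda fh: PdfFileReader(fh).getNumPages()).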

Kevin