I am getting the error "No /Root object! - Is this really a PDF?" on my Mac with Python 2.7 and PDFMiner version 20110515. The PDF files are not damaged, because the same program works with the same files on my PC. I have tried many PDFs and the error occurs for all of them. Any ideas what I should change on my Mac to avoid this error?

martineau
Mahshid Zeinaly
  • 2
    Version 20110515 of PDFMiner is a beta release, so it may have bugs. Fortunately it's pure Python, which can make debugging easier. The problem you describe may be due to the way end-of-lines are handled in the files being parsed. Make sure they're opened in binary mode, i.e. `fp = open('mypdf.pdf', 'rb')`. It may also help to run the included `dumppdf.py` utility on the problem files. Lastly, the error may be due to the fact that the Python interpreter you're using varies from machine to machine. Universal newline support isn't built into all versions of Python. – martineau Jun 27 '13 at 06:09

1 Answer

I found the source of the problem:

I had a method that read all the files in a directory and parsed them. It turns out there was one hidden file in that directory that was not a PDF file, so PDFMiner raised the error when it tried to parse it!

Here is how I fixed the problem:

import os

for filename in os.listdir(INPUT_DIR_NAME):
    if filename.endswith('.pdf'):  # skips the hidden non-PDF file
        pass  # do stuff!
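
If you want something a bit stricter than an extension check, here is a minimal sketch that also skips dot-files and looks at the first bytes of each file before handing it to the parser. The helper names `is_pdf` and `find_pdfs` are made up for illustration and are not part of PDFMiner. It assumes the `%PDF-` signature sits at the very start of the file, which is stricter than what some PDF readers accept.

```python
import os


def is_pdf(path):
    """Return True if the file at `path` starts with the PDF signature."""
    # Binary mode, so no newline translation happens on any platform.
    with open(path, 'rb') as fp:
        return fp.read(5) == b'%PDF-'


def find_pdfs(input_dir):
    """Return sorted paths of non-hidden regular files that look like PDFs."""
    pdf_paths = []
    for filename in sorted(os.listdir(input_dir)):
        path = os.path.join(input_dir, filename)
        if (not filename.startswith('.')             # skip hidden files
                and filename.lower().endswith('.pdf')
                and os.path.isfile(path)             # skip directories
                and is_pdf(path)):                   # check the signature
            pdf_paths.append(path)
    return pdf_paths
```

You would then loop over `find_pdfs(INPUT_DIR_NAME)` and open each path with `'rb'` before passing it to the parser, as suggested in the comment on the question.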
Mahshid Zeinaly