I'm trying to open a PDF with pyPdf. I get the following error:

pyPdf.utils.PdfReadError: EOF marker not found

I thought that I should add the EOF myself. However, I don't want to write bytes. Isn't it OS specific? I want to call something like os.eof(). What do I write? This thread is not helpful.

anonymous
  • Are you sure this error message refers to an actual EOF character and not some special PDF-specific EOF construct? – Cameron Jan 30 '13 at 07:08
  • As far as I am aware there is no byte you can write to explicitly put an EOF. The EOF is where the file ends. I’m quite sure that you have a different problem. – poke Jan 30 '13 at 07:09
  • See maybe: http://code.activestate.com/lists/python-list/589529/ – poke Jan 30 '13 at 07:11

1 Answer

PDF's EOF marker is a special string (%%EOF) that needs to appear in your PDF file. If it doesn't, you have a malformed PDF. This string separates the actual PDF contents from any additional data (embedded files etc.).

It has nothing to do with the EOF event you run into when reading any file up to its end.
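
To check whether the marker is present at all, you can search the raw bytes of the file for it. The following is only a rough sketch using the standard library, not part of pyPdf; the names `find_last_eof_marker` and `trim_after_eof` are made up for illustration. It assumes the only problem is stray bytes after the last `%%EOF`, which may well not be the case for your file.

```python
from typing import Optional

EOF_MARKER = b"%%EOF"


def find_last_eof_marker(data: bytes) -> Optional[int]:
    """Return the byte offset of the last %%EOF marker, or None if absent."""
    pos = data.rfind(EOF_MARKER)
    return pos if pos != -1 else None


def trim_after_eof(data: bytes) -> bytes:
    """Drop any bytes that follow the last %%EOF marker.

    Assumption: the PDF is otherwise intact and only has trailing junk.
    Raises ValueError if no marker exists (the file is probably truncated,
    and trimming cannot help).
    """
    pos = find_last_eof_marker(data)
    if pos is None:
        raise ValueError("no %%EOF marker found")
    return data[: pos + len(EOF_MARKER)] + b"\n"
```

To try it, read the file in binary mode (`open(path, "rb")`), pass the bytes to `trim_after_eof`, and write the result to a new file rather than overwriting the original. If `find_last_eof_marker` returns `None`, the file is most likely incomplete and has to be regenerated.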

Tim Pietzcker
  • @anonymous Maybe it is not too malformed and can be repaired. As explained in Tim's link, sometimes there are merely some trash bytes appended to the end of the file (Adobe software did not require the EOF marker to be at the very end of the file, only in the final KB or so, so some people thought they could append their own information after it). Thus, please post the PDF for inspection. – mkl Jan 30 '13 at 08:43
  • So do you have an actual solution for adding '%%EOF' to the file and not having the mentioned `PdfReadError`? Is it enough adding this at the end of the payload? – lajarre Dec 05 '14 at 14:02
  • @lajarre: I don't think there can be a simple solution - if the PDF file is malformed in the first place, it's unlikely that simply slapping `%%EOF` onto it will make it valid. Worth a try, but it would make more sense to fix whatever is producing the bad PDF. – Tim Pietzcker Dec 06 '14 at 08:20