I'm using a text-based PDF, as required, and trying to read its tables with the flavor='stream' option. When I run the Python script, this error shows up:
File "/path/foo.py", line x, in <module>
File "/path/foo.py", line x, in read_pdf
File "/path/foo.py", line x, in parse
self._save_page(self.filepath, p, tempdir)
File "/path/foo.py", line x, in _save_page
infile = PdfFileReader(fileobj, strict=False)
File "/path/foo.py", line x, in __init__
self.read(stream)
File "/path/foo.py", line x, in read
raise utils.PdfReadError("EOF marker not found")
PyPDF2.utils.PdfReadError: EOF marker not found
I know EOF stands for the end-of-file marker, but I did not generate the PDFs I am trying to parse. It would be very inconvenient if the problem were in the source files, since they are all produced the same way.
The line of code I'm using to read is this:
table = cam.read_pdf(fname, flavor='stream')
table
The last line is there to display the table on the command line.
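To narrow down whether the files themselves are at fault, a rough check like the one below might help. It is only a sketch: `has_eof_marker` and the 1024-byte window are names and values I made up for illustration, and it assumes the error comes from the `%%EOF` marker being absent from the tail of the file (for example after a truncated download), which I have not confirmed for these PDFs.

```python
import os


def has_eof_marker(path, tail_bytes=1024):
    """Return True if b'%%EOF' appears in the last `tail_bytes` bytes of the file.

    Hypothetical helper for diagnosis only; the window size is an assumption.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Jump close to the end of the file; a PDF normally ends with %%EOF.
        f.seek(max(0, size - tail_bytes))
        return b"%%EOF" in f.read()


# Example use with the same file name passed to read_pdf:
# print(has_eof_marker(fname))
```

If this prints False for a file, the marker is missing from the end of that file, which would point at the source PDF (or how it was downloaded) rather than at the read_pdf call.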