I'm trying to parse PDFs into text and want to start with Slate.

However, just following the basic example posted everywhere, I get the following:

>>> import slate
>>> with open('pytest.PDF') as fp:
...     doc = slate.PDF(fp)
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/slate/slate.py", line 52, in __init__
    self.append(self.interpreter.process_page(page))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/slate/slate.py", line 36, in process_page
    self.device.outfp.buf = ''
AttributeError: 'cStringIO.StringO' object has no attribute 'buf'

Any ideas?

mtrw

1 Answer

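The output buffer here is a cStringIO object, which has no buf attribute, so the assignment on line 36 of slate.py fails. This can be fixed by changing that line to empty the buffer with truncate instead: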

self.device.outfp.truncate(0)
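
To see why this works without patching Slate first, here is a minimal sketch of the same reset on a standalone in-memory buffer. It assumes io.StringIO as a stand-in for the cStringIO object in the traceback, and reset_buffer is a hypothetical helper name, not part of Slate. The extra seek(0) is there because io buffers do not necessarily move the write position when truncated.

```python
import io


def reset_buffer(fp):
    """Empty an in-memory file object using only its public file API.

    Hypothetical helper for illustration; not part of Slate or PDFMiner.
    """
    fp.seek(0)      # move the write position back to the start
    fp.truncate(0)  # discard everything written so far
    return fp


# io.StringIO stands in for the cStringIO buffer from the traceback.
out = io.StringIO()
out.write(u"text of page one")

# Like cStringIO, this buffer exposes no 'buf' attribute to assign to.
has_buf = hasattr(out, "buf")

reset_buffer(out)
out.write(u"text of page two")
print(out.getvalue())
```

After the reset only the second write remains in the buffer, which is what process_page needs between pages.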
user650881