
I have some PDFs that I need to extract information from. I am using Python on CentOS 7 with the slate library. At first, slate worked fine, but then I had to update several modules and libraries, and slate stopped working. To fix it, I tried updating slate and tried several different versions, but none of them work. The error is:

  File "/usr/lib64/python2.7/StringIO.py", line 271, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 58: ordinal not in range(128)

When I remove slate from my code, everything works fine.

The piece of code where I use slate:

import slate

def adequacaoCut(pdf, person, pathInt, pathImg):
    # Open the PDF in binary mode and let slate extract its text
    with open('pdfs/'+pdf, 'rb') as f:
        doc = slate.PDF(f)
        print doc
        ... rest of code that works fine
  • Version of slate: 0.5.2

  • Version of Python: 2.7

As time has passed, I no longer remember which libraries I installed or which updates I made to Python, CentOS, or anything else. What should I do?

Luiza Rodrigues

1 Answer


I solved the problem myself. I discovered that I had two pdfminer packages installed on my computer (pdfminer and pdfminer.six). I think there was some kind of conflict between the two libraries, or slate was trying to call pdfminer.six instead of pdfminer. I uninstalled both and reinstalled only pdfminer. It works like a charm now.
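
For anyone who wants to check for the same situation, here is a minimal sketch that lists installed distributions and reports which ones ship a top-level `pdfminer` package. The helper names (`normalize`, `find_pdfminer_conflicts`, `installed_distribution_names`) are made up for this illustration and are not part of slate or pdfminer. It only detects the duplicate install; it does not confirm that the duplicate is what causes the `UnicodeDecodeError`.

```python
from __future__ import print_function

# Distributions that both install a top-level package named "pdfminer".
PDFMINER_DISTRIBUTIONS = ("pdfminer", "pdfminer.six")


def normalize(name):
    """Lower-case a distribution name and unify '.', '_' and '-'."""
    return name.lower().replace("_", "-").replace(".", "-")


def find_pdfminer_conflicts(installed_names):
    """Return the installed names that provide the 'pdfminer' package (hypothetical helper)."""
    wanted = set(normalize(n) for n in PDFMINER_DISTRIBUTIONS)
    return sorted(n for n in installed_names if normalize(n) in wanted)


def installed_distribution_names():
    """List installed distribution names for the running interpreter."""
    try:
        from importlib import metadata  # Python 3.8+
        names = [d.metadata["Name"] for d in metadata.distributions()]
    except ImportError:
        import pkg_resources  # Python 2.7 with setuptools
        names = [d.project_name for d in pkg_resources.working_set]
    # Skip entries whose metadata has no usable name.
    return [n for n in names if n]


if __name__ == "__main__":
    found = find_pdfminer_conflicts(installed_distribution_names())
    if len(found) > 1:
        print("Conflicting pdfminer installs:", ", ".join(found))
    else:
        print("No conflict found:", found)
```

If more than one name is reported, uninstalling all of them with pip and reinstalling only the one slate expects mirrors what I did above.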

Luiza Rodrigues