
I use pdfminer to convert PDF text to txt. pdfminer goes through the PDF file and reads it line by line, and each line is assigned to a matrix variable (a nested list). The problem is that, in rare cases, the matrix looks like this, e.g. x =

[[Г, 'problems', -436, 'have', -448, 'usually', -435, 'found', -452]]

Obviously, Г without quotes is invalid syntax for a matrix (or list). Still, x exists, but I cannot access it to delete Г; understandably, del x[0][0] does not work.

Now I'm asking for ideas on how to access x and remove the first element. Many thanks in advance!
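
One way a list can print with an unquoted element is when that element is an object rather than a string, because printing a list shows each element's repr. The sketch below only illustrates that possibility. The `Glyph` class is a hypothetical stand-in, not a pdfminer type, and whether this matches the real data is an assumption.

```python
class Glyph:
    """Hypothetical stand-in: an object whose repr is a bare character."""

    def __init__(self, char):
        self.char = char

    def __repr__(self):
        # No quotes are added, so the list prints as [[Г, ...]]
        return self.char


x = [[Glyph("Г"), 'problems', -436]]
print(x)              # the first element appears without quotes
print(type(x[0][0]))  # shows what the element really is

# On an ordinary nested list, deleting by index succeeds
del x[0][0]
print(x)
```

If deletion fails on the real data, the output of `type(x)` and `type(x[0][0])` together with the exact exception message would probably show why.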

1 Answer


I solved my problem with:

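from ast import literal_eval

# Turn the list into its printed form, e.g. "[[Г, 'problems', -436, ...]]"
mr_x = str(x)
# Find the first quote, i.e. the start of the first valid string element
quote_pos = mr_x.find("'")
# Drop everything before that quote and restore the opening brackets
mr_x = '[[' + mr_x[quote_pos:]
# Parse the cleaned text back into a real list
x = literal_eval(mr_x)
print(x)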

[['problems', -436, 'have', -448, 'usually', -435, 'found', -452]]
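
An alternative that avoids the round trip through `str` is to filter the elements by type. This is a minimal sketch, assuming the stray element is an object that is not a `str`, `int`, or `float`. The `Unquoted` class and the `keep_plain_values` helper are hypothetical names introduced for illustration only.

```python
class Unquoted:
    """Hypothetical stand-in for whatever object prints as a bare Г."""

    def __repr__(self):
        return "Г"


def keep_plain_values(rows):
    """Return a copy of rows keeping only str, int, and float elements."""
    return [
        [item for item in row if isinstance(item, (str, int, float))]
        for row in rows
    ]


x = [[Unquoted(), 'problems', -436, 'have', -448, 'usually', -435, 'found', -452]]
cleaned = keep_plain_values(x)
print(cleaned)
```

Unlike slicing at the first quote, this does not depend on where the unwanted element sits in the row, and it leaves the original list untouched.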