I am trying to extract data as HTML from a PDF using pdfminer. I was able to extract plain text from the same PDF, but I get an error when extracting the data as HTML. I need the HTML output so that I can filter the data further and categorize it in a CSV. This is the script:

from io import StringIO  
from pdfminer.layout import LAParams  
from pdfminer.high_level import extract_text_to_fp  

output_string = StringIO  

with open('mini.pdf','rb') as fn:  
    extract_text_to_fp(fn, output_string, laparams=LAParams(), output_type='html', codec=None)

And this is the error I am getting.

1 Answer

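Add parentheses to StringIO, like this: output_string = StringIO(). The parentheses call the class constructor and create an instance. Without them, output_string refers to the StringIO class itself rather than to a writable buffer, so pdfminer has nothing to write into. With that change, the code should work.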
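As a rough sketch of the fix, the script from the question could look like the following once the buffer is instantiated. The helper name extract_html is hypothetical and only there for illustration, and the sketch assumes pdfminer.six is installed. The short demonstration at the bottom uses only the standard library to show why the class and an instance behave differently.

```python
from io import StringIO


def extract_html(pdf_path):
    """Return the HTML that pdfminer produces for the PDF at pdf_path.

    Hypothetical helper for illustration; assumes pdfminer.six is installed.
    """
    # Imported inside the function so the StringIO demonstration below
    # still runs on a machine without pdfminer.six.
    from pdfminer.high_level import extract_text_to_fp
    from pdfminer.layout import LAParams

    output_string = StringIO()  # note the parentheses: an instance, not the class
    with open(pdf_path, "rb") as fn:
        extract_text_to_fp(
            fn,
            output_string,
            laparams=LAParams(),
            output_type="html",
            codec=None,  # no encoding step, so str is written to the text buffer
        )
    return output_string.getvalue()


# Class versus instance, using only the standard library.
class_error = None
try:
    StringIO.write("<html>")  # called on the class: there is no buffer to write to
except TypeError as exc:
    class_error = exc

buffer = StringIO()  # an actual in-memory text buffer
buffer.write("<html>")
```

Calling extract_html('mini.pdf') should then return the HTML as a string, which can be parsed and filtered before writing the CSV.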

Mna