
I am trying to read a PDF document using tabula-py. However, I have an issue: in one of the columns, a line break splits the text onto a new line and the remaining text is ignored. Here is an example of a column with line breaks:

[Image: example of a column with line breaks]

This produces the following text when read: "VALUE ADD\rVAT ON NIP\r"

How do I make tabula ignore these line breaks? Here is my code:

import tabula

tables = tabula.read_pdf(file, pandas_options={"header":None}, pages='all', stream=True, lattice=True, multiple_tables=True, guess=False, password=password)
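One workaround that might help is to clean the "\r" characters out of each DataFrame after reading, rather than trying to stop tabula from emitting them. The following is only a minimal sketch using pandas: the helper name strip_carriage_returns and the sample table are hypothetical, and it only tidies the line breaks that do come through. It would not recover any text that tabula dropped.

```python
import pandas as pd


def strip_carriage_returns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with line breaks inside string cells removed.

    Hypothetical helper, not part of tabula-py.
    """
    # First drop line breaks at the very end of a cell ...
    without_trailing = df.replace(r"[\r\n]+$", "", regex=True)
    # ... then turn the remaining (interior) line breaks into single spaces.
    return without_trailing.replace(r"[\r\n]+", " ", regex=True)


# Hypothetical stand-in for one of the tables returned by tabula.read_pdf(...)
sample = pd.DataFrame({0: ["VALUE ADD\rVAT ON NIP\r", "FEE"], 1: [10.0, 2.5]})
cleaned = strip_carriage_returns(sample)
print(cleaned[0].tolist())
```

Since multiple_tables=True returns a list of DataFrames, the same helper could be applied to each one, for example with [strip_carriage_returns(t) for t in tables], assuming the result of read_pdf is stored in a variable named tables.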

Thanks

shekwo
