I am trying to parse a PDF into a pandas DataFrame using camelot:
import camelot
import pandas as pd

file = 'foo.pdf'
# Read the tables on page 2 using the stream flavor
tables = camelot.read_pdf(file, pages='2', flavor='stream')

# Collect each table's DataFrame, then concatenate them
v = []
for table in tables:
    v.append(table.df)
w = pd.concat(v)
print(w)
However, the table is read as below, with several values merged into one cell and separated by \n:
7 Customer No. Document Date Customer PO No. External Doc. No.\nPayment Terms
8 126207 28/02/22 STRICTLY 14 DAYS
9 PO No./Docket Unit Price \nAmount \nGST Amount Incl.
10 Description TASK DATE Quantity UOM
11 No. Excl. GST\nExcl. GST\nAmount GST
12 BOC GAS & GEAR
13 11 SNOW STREET
14 SOUTH LISMORE, NSW 2480
15 CLEAR: FL 1.5M3 BIN-CARDBOARD 02/02/22 1\nEA\n9.18\n9.18\n0.92 10.10
16 CLEAR: FL 1.5M3 BIN-CARDBOARD 16/02/22 1\nEA\n9.18\n9.18\n0.92 10.10
How do I avoid the newline characters (\n) in the cell values when reading the PDF?
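For reference, one possible workaround is to clean the cells after extraction instead of changing how the PDF is read. The sketch below is only an illustration under stated assumptions: the `sample` DataFrame is hypothetical data modelled on two rows of the output above, and `strip_newlines` is a helper name made up here, not part of camelot or pandas. It replaces each newline (and any whitespace around it) with a single space, so it tidies the text but does not split the merged values back into separate columns.

```python
import pandas as pd

# Hypothetical sample modelled on two rows of the output above;
# in practice this would be the concatenated DataFrame `w`.
sample = pd.DataFrame({
    0: ["CLEAR: FL 1.5M3 BIN-CARDBOARD", "PO No./Docket"],
    1: ["1\nEA\n9.18\n9.18\n0.92", "Unit Price \nAmount \nGST"],
})


def strip_newlines(df, sep=" "):
    """Return a copy of df with newlines inside string cells replaced by sep."""
    # regex=True makes replace() match substrings instead of whole cell values;
    # the pattern also swallows whitespace surrounding each newline.
    return df.replace(r"\s*\n\s*", sep, regex=True)


cleaned = strip_newlines(sample)
print(cleaned)
```

camelot's `read_pdf` also documents a `strip_text` argument (for example `strip_text='\n'`) that strips the given characters from each cell during extraction. It is not tested here, and since it removes the characters rather than replacing them, adjacent values would presumably end up joined without a separator.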