I am working on a project to convert a PDF file into a table using tabula in Python. Tabula detects the table while scanning, but one column of the extracted table comes out as shown in picture_1, while the actual table looks like picture_2.

Is there any way in Python to split that single column into separate columns, as in the second picture?

Kedar17

1 Answer

You need to use str.split with expand=True.

Example:

>>> import pandas as pd
>>> df = pd.DataFrame([["Purchase Balance"],["138 303"]])
>>> df
                  0
0  Purchase Balance
1           138 303

>>> df[0].str.split(" ", expand=True)
          0        1
0  Purchase  Balance
1       138      303
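
As a rough sketch of how the split result could be put back into the table, the snippet below splits a merged column on whitespace, names the new columns, and joins them to the rest of the frame. The frame contents and the names `Item`, `merged`, `Purchase` and `Balance` are assumptions made up for illustration; they are not taken from the original PDF.

```python
import pandas as pd

# Hypothetical data: "merged" stands in for the column tabula failed to separate.
df = pd.DataFrame({
    "Item": ["A", "B", "C"],
    "merged": ["138 303", "250  75", "90 410"],
})

# With no pattern given, str.split splits on any run of whitespace,
# so the double space in the second row does not create an empty column.
parts = df["merged"].str.split(expand=True)
parts.columns = ["Purchase", "Balance"]  # assumed column names

# Convert the pieces to numbers and attach them in place of the merged column.
result = df.drop(columns="merged").join(parts.apply(pd.to_numeric))
print(result)
```

Splitting without an explicit separator is a design choice here: text extracted from PDFs often has uneven spacing, and `split(" ")` would produce empty strings in that case.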
  • But what about two strings (e.g. Purchase and Value) that sit in two separate rows? How can I combine them into one row as "Purchase Value", and do the same for other attributes, e.g. Balanced Units/Nos, Average Price and many more? – Kedar17 Dec 18 '19 at 06:38
  • You can add two rows in pandas, something like `df.iloc[0] = df.iloc[0] + " " + df.iloc[1]`, and then drop the unwanted row with `df = df.drop(1, axis=0)`. – DHANANJAY RAUT Dec 19 '19 at 07:15
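
The row-combining idea from the comments could look roughly like the sketch below. It assumes the header text is spread over the first two rows of an already split frame; the sample values and the variable names `header` and `body` are hypothetical.

```python
import pandas as pd

# Hypothetical frame: rows 0 and 1 each hold half of a two-line header.
df = pd.DataFrame([
    ["Purchase", "Balanced", "Average"],
    ["Value", "Units/Nos", "Price"],
    ["138", "303", "12.5"],
])

# Join the two header rows cell by cell, with a space in between.
header = df.iloc[0] + " " + df.iloc[1]

# Keep only the data rows and use the combined strings as column names.
body = df.iloc[2:].reset_index(drop=True)
body.columns = header.tolist()
print(body)
```

Using the combined row as column names, rather than leaving it as a data row, keeps the remaining rows purely numeric text that can later be converted with `pd.to_numeric`.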