I am trying to extract a readable table from a PDF with tabula-py, but the table does not come out correctly.

I have tried the stream, lattice, and guess parameters, but none of them worked.

Any suggestions on how I can fix this?

Code: 
from tabula import read_pdf

pdf_path = '/content/210812155050_DECK CRANE - MACGREGGOR HAGGLUND - GL3024-2S,2424 - 62503091 - Manuals.pdf'
page_no = 184

# Also tried: guess=False, stream=True
# tables = read_pdf(pdf_path, pages=page_no)

# read_pdf returns a list of DataFrames, one per detected table
tables = read_pdf(pdf_path, multiple_tables=True, lattice=True, pages=page_no)
data_df = tables[0]
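
To compare the modes I mentioned side by side, here is a minimal sketch. The helper names `extraction_attempts` and `try_all` are made up for illustration and are not part of tabula-py. The `area` values would have to be measured from the page (top, left, bottom, right in PDF points), so I pass them in rather than guess them.

```python
def extraction_attempts(page, area=None):
    """Build one set of read_pdf keyword arguments per extraction strategy.

    Hypothetical helper: only the keyword names (pages, multiple_tables,
    lattice, stream, guess, area) come from tabula-py itself.
    """
    base = {"pages": page, "multiple_tables": True}
    attempts = {
        # lattice: relies on ruling lines drawn between the cells
        "lattice": {**base, "lattice": True},
        # stream: relies on whitespace between the columns
        "stream": {**base, "stream": True},
    }
    if area is not None:
        # area is [top, left, bottom, right]; guess=False stops tabula
        # from picking its own region of the page
        attempts["stream_area"] = {
            **base, "stream": True, "guess": False, "area": area,
        }
    return attempts


def try_all(pdf_path, page, area=None):
    """Run every strategy and collect the tables (or the error) per strategy."""
    # Imported lazily: requires tabula-py and a Java runtime
    from tabula import read_pdf

    results = {}
    for name, kwargs in extraction_attempts(page, area).items():
        try:
            results[name] = read_pdf(pdf_path, **kwargs)
        except Exception as exc:  # keep going so the other modes still run
            results[name] = exc
    return results
```

Calling `try_all(pdf_path, 184)` and printing the shape of each returned DataFrame per strategy should make it easier to see which mode, if any, gets closest to the table in the PDF.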

PDF table snapshot: [image]

The output I got: [image]

Pravin