How to convert PDF to excel using tabula-py into dataframe of several tables?

Asked Feb 03 '21 at 23:16

Active Feb 03 '21 at 23:16

Viewed 340 times

I have a PDF file that contains several tables. For example: (image: Table from PDF File)

I learned that I have to use tabula-py, which runs on Java (note: I'm working in a Jupyter Notebook). So I wrote this:

    import pandas as pd
    import numpy as np

    import tabula
    from tabula import read_pdf

    pdf_path = r"..\PDFs\pobreza2.pdf"  # file path (raw string, so the backslashes are kept literally)

    # The PDF has many pages with several tables
    df = tabula.read_pdf(pdf_path, pages="all", stream=True, guess=False, multiple_tables=True)

And I get this: (image: Output of the code)

The result looks like a list, not a DataFrame.

So, how can I get these tables into a DataFrame? The tables contain string and float values.

asked Feb 03 '21 at 23:16

Maria Fernanda
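A possible direction, offered as a sketch and not a tested solution for this particular PDF: with `multiple_tables=True`, `tabula.read_pdf` returns a list with one DataFrame per detected table, so each element of the list is already a DataFrame. The helper names below (`read_tables`, `combine_tables`, `write_tables_to_excel`) and the stand-in columns `region` and `rate` are hypothetical and only there for illustration. Writing `.xlsx` files assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def read_tables(pdf_path):
    """Return every table tabula-py finds in the PDF as a list of DataFrames."""
    import tabula  # imported lazily; needs tabula-py and a Java runtime

    # Same options as in the question; with multiple_tables=True the result
    # is a list with one DataFrame per detected table.
    return tabula.read_pdf(pdf_path, pages="all", stream=True,
                           guess=False, multiple_tables=True)


def combine_tables(tables):
    """Stack a list of DataFrames into one, skipping empty ones."""
    non_empty = [t for t in tables if not t.empty]
    # ignore_index=True renumbers the rows 0..n-1. Columns are aligned by
    # name, so tables with different headers get NaN in the missing columns.
    return pd.concat(non_empty, ignore_index=True)


def write_tables_to_excel(tables, excel_path):
    """Write each table to its own sheet (assumes an .xlsx engine is installed)."""
    with pd.ExcelWriter(excel_path) as writer:
        for i, table in enumerate(tables, start=1):
            table.to_excel(writer, sheet_name=f"table_{i}", index=False)


# Stand-in data so the sketch runs without a PDF. With the real file this
# would be: tables = read_tables(r"..\PDFs\pobreza2.pdf")
tables = [
    pd.DataFrame({"region": ["A", "B"], "rate": [10.5, 20.1]}),
    pd.DataFrame({"region": ["C"], "rate": [7.3]}),
]
first_table = tables[0]            # each list element is already a DataFrame
combined = combine_tables(tables)  # one DataFrame holding all rows
print(combined)
```

Concatenating only makes sense when the tables share the same columns. If they do not, keeping them separate and writing one sheet per table with `write_tables_to_excel(tables, "output.xlsx")` is probably the safer choice.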
