I'm trying to write a dataframe to a CSV file using pandas, but I get the following error: AttributeError: 'list' object has no attribute 'to_csv'. I believe my syntax is correct, but could anyone point out where it goes wrong? This is the link to the file: https://s22.q4cdn.com/351912490/files/doc_financials/quarter_spanish/2018/2018.02.25_Release-4Q18_ingl%C3%A9s.pdf Thanks for your time!

import tabula 
from tabula import read_pdf
import pandas as pd
from pandas import read_json, read_csv


a = read_pdf(r"C:\Users\Emege\Desktop\micro 1 true\earnings_release.pdf",\
            multiple_tables= True, pages = 15, output_format = "csv",\
            )
print(a)

a.to_csv("a.csv",header = False, index = False,  encoding = "utf-8")

dsilva
  • What's the output of `a`? Does it look like a dataframe? My suggestion would be to simply pass it into a dataframe with `pd.DataFrame` – Umar.H Mar 31 '19 at 23:05
  • That page contains three tables in PDF format, so I want to convert those tables to CSV and then organize them with pd.read_csv(). The problem is that I get lists, not dataframes, so the question is how to get to a dataframe. – dsilva Mar 31 '19 at 23:13
  • @damiansilva, could you post some of the contents of `a` directly in your question, instead of requiring visitors to download the PDF and run your code? As @Datanovice suggested, the first thing to try is to pass `a` to the DataFrame constructor and see what happens: `df = pd.DataFrame(a); df.to_csv('a.csv')` – Peter Leimbigler Mar 31 '19 at 23:32
  • Sorry guys, I'm new here. I added the content of that page as an image. I tried what Datanovice suggested, but it doesn't work because I'm losing values. – dsilva Mar 31 '19 at 23:58
  • No worries Damian, I just had a quick look (had to install tabula-py and Java [and add Java to the path!!]) and it seems you get nested lists with different-sized table columns. You'll need to do some manual clean-up before passing this into pandas. Try passing `pd.DataFrame(a[0])`, which holds your headers. – Umar.H Apr 01 '19 at 00:07
  • OK, I will try that. – dsilva Apr 01 '19 at 00:17
  • @Peter Leimbigler do you recommend cropping the PDF and working with the tables individually? – dsilva Apr 01 '19 at 01:00
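
The approach discussed in the comments can be sketched as follows: since the result is a list rather than a single dataframe, loop over it, turn each element into a dataframe, and call `to_csv` on that. This is only a minimal sketch under assumptions. The helper `tables_to_csv`, the `table_N.csv` file names, and the stand-in data are hypothetical; in real use `tables` would be the list returned by the `read_pdf` call in the question, and the extracted tables may still need manual clean-up first.

```python
import tempfile
from pathlib import Path

import pandas as pd


def tables_to_csv(tables, out_dir, stem="table"):
    """Write each table in `tables` to its own CSV file and return the paths.

    `tables` is assumed to be a list whose items are DataFrames or
    lists of rows (hypothetical helper, not part of tabula or pandas).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, table in enumerate(tables):
        # pd.DataFrame() accepts an existing DataFrame or a list of rows.
        df = pd.DataFrame(table)
        path = out_dir / f"{stem}_{i}.csv"
        # to_csv exists on each DataFrame, not on the enclosing list.
        df.to_csv(path, header=False, index=False, encoding="utf-8")
        paths.append(path)
    return paths


# Stand-in data (made up) in place of the list returned by read_pdf.
tables = [
    pd.DataFrame([["Revenue", 100], ["Costs", 60]]),
    [["Assets", 500, 480]],
]
paths = tables_to_csv(tables, tempfile.mkdtemp())
```

Writing one file per table avoids forcing tables with different numbers of columns into a single dataframe, which may be where values were getting lost.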

0 Answers