
I am relatively new to Python and I am using yfinance and pandas to scrape financial data from Yahoo Finance. Basically, I am wondering what line of code will return, from inside the database, the number associated with 'Capital Expenditure', for example. I am really lost on how to do this, and I'm not even sure how to ask my question, since in Excel I'd typically use =sumif() functions to return these values. I have provided my code below.

import yfinance as yf
import pandas as pd
import requests
import numpy as np
import datetime as dt

msft = yf.Ticker("MSFT")

df = msft.cashflow

print(df)

The code above returns the statement of cash flows into this database, and I then want to isolate just the dollar amount associated with 'Capital Expenditure' for this year. Here is a picture of the database below.

I've highlighted in red what I'm trying to do here in this image.

Any help would be truly and greatly appreciated, since this is really just the final step before my school project is finalized.

3 Answers


What type of database are you using? You should look up the Python libraries associated with your database and follow their directions to make the database connection. Then you can return anything from your database directly into a pandas DataFrame.
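If the figures really did live in a database, the approach described above might look like the following minimal sketch. It assumes an in-memory SQLite database and a hypothetical table named cash_flow filled with made-up numbers; none of these names come from the question.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database with an illustrative table (made-up numbers).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cash_flow (item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO cash_flow VALUES (?, ?)",
    [("Capital Expenditure", -100.0), ("Depreciation", 40.0)],
)
conn.commit()

# Read the query result straight into a pandas DataFrame.
db_df = pd.read_sql_query(
    "SELECT item, amount FROM cash_flow WHERE item = ?",
    conn,
    params=("Capital Expenditure",),
)
conn.close()
print(db_df)
```

With yfinance no connection step is needed, since the cash flow statement already arrives as a DataFrame.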

rastawolf

Assuming you already have the data loaded into the DataFrame “df”.

To get the ‘Capital Expenditure’ row:

row = df.loc[df.iloc[:, 0] == 'Capital Expenditure']

(df.iloc[:, 0] is the first column in the table from the picture you shared) I’m assuming the data frame looks the same as that picture.

Then, to get a specific date value:

value = row['2021-6-30']
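As a rough, self-contained illustration of those two steps, here is a sketch on a small hand-built table. The column name "Item" and all figures are invented for the example, and it assumes, as this answer does, that the labels sit in the first column rather than in the index.

```python
import pandas as pd

# Made-up figures; the layout assumes the labels sit in the first column.
table = pd.DataFrame(
    {
        "Item": ["Net Income", "Capital Expenditure"],
        "2021-06-30": [500.0, -100.0],
        "2020-06-30": [400.0, -80.0],
    }
)

# A boolean mask on the first column keeps only the matching row(s).
row = table.loc[table.iloc[:, 0] == "Capital Expenditure"]

# row["2021-06-30"] is still a one-element Series; .iloc[0] extracts the scalar.
value = row["2021-06-30"].iloc[0]
print(value)
```

Note that the mask returns a DataFrame, so selecting a column from it gives a Series; the extra .iloc[0] is what turns it into a single number.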

Good luck on the project!

rikster

So in case anyone was wondering, I ended up figuring out how to do this. yfinance already loads the data into pandas DataFrames, so to find a value you first select the row by its name and then select the value within that row by its position.

For example, assuming that the data for the Statement of Cash Flows is already loaded into a DataFrame named df:

# This will locate the row by name.
Capex = df.loc['Capital Expenditure']

# This will locate the value by index for this year.
CapitalExpenditure = Capex.iloc[0]
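To try the same lookup without a network call, here is a small sketch on a hand-built DataFrame. It assumes the layout that the code above relies on (row labels in the index, one column per period, most recent period first); the numbers are made up.

```python
import pandas as pd

# Stand-in for msft.cashflow with made-up numbers: labels in the index,
# one column per period, most recent first (assumed ordering).
cashflow = pd.DataFrame(
    {
        pd.Timestamp("2021-06-30"): [500.0, -100.0],
        pd.Timestamp("2020-06-30"): [400.0, -80.0],
    },
    index=["Net Income", "Capital Expenditure"],
)

# Select the row by label; the result is a Series indexed by date.
capex_row = cashflow.loc["Capital Expenditure"]

# Take the first entry by position, i.e. the most recent period.
latest = capex_row.iloc[0]

# Alternatively, name both the row label and the column label in one lookup.
by_date = cashflow.loc["Capital Expenditure", pd.Timestamp("2021-06-30")]
print(latest, by_date)
```

Selecting by date is a little more robust than .iloc[0] if the column order ever changes.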

Or, if I put it into a class structure that I could call at any point in time, I would have:

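class CapitalExpenditure:
    def __init__(self):
        self.MSFT = yf.Ticker("MSFT")
        self.df = self.MSFT.cashflow
        # The label must match a row name in self.df.index exactly;
        # check df.index, since the spelling can differ between yfinance versions.
        self.Capex = self.df.loc['Capital Expenditure']
        # First column, i.e. this year's value.
        self.CapitalEx = self.Capex.iloc[0]

    def CapEx(self):
        return 'Capital Expenditure for this year is ${}'.format(abs(self.CapitalEx))


a = CapitalExpenditure()
print(a.CapEx())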
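A lighter-weight alternative to the class is a plain function that works on any statement DataFrame. The sketch below is only an illustration: latest_value is a hypothetical helper name, and the demo data is made up so that it runs without calling Yahoo Finance.

```python
import pandas as pd


def latest_value(statement: pd.DataFrame, label: str) -> float:
    """Return the first-column value of the row named `label` (hypothetical helper)."""
    if label not in statement.index:
        # Fail with the available labels so a misspelled row name is easy to spot.
        raise KeyError(f"{label!r} not found; available rows: {list(statement.index)}")
    return float(statement.loc[label].iloc[0])


# Made-up data standing in for yf.Ticker("MSFT").cashflow.
demo = pd.DataFrame(
    {"2021-06-30": [500.0, -100.0], "2020-06-30": [400.0, -80.0]},
    index=["Net Income", "Capital Expenditure"],
)
capex = latest_value(demo, "Capital Expenditure")
print("Capital Expenditure for this year is ${}".format(abs(capex)))
```

Passing the DataFrame in as an argument keeps the lookup separate from the download, so the same function could be reused for other tickers or other row labels.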