Converting a Dict of Dicts to Dataframe

Question

I am attempting to loop a list of ticker symbols through IEX API in order to obtain Price to Book ratios for a large group of tickers. The ultimate goal is to have all of the Price to Book Ratios in one csv file. For multiple tickers it returns the balance sheet information in the form of a dict of dicts which I dont have a lot of experience.

The following code worked with just a single ticker in which the balance sheet returns are in a dataframe.

import pandas as pd
from iexfinance.stocks import Stock
from datetime import datetime
import matplotlib.pyplot as plt
from iexfinance.stocks import get_historical_data
from pathlib import Path
import numpy as np
import csv

output_file = pd.DataFrame()

filename = r'C:\Users########'

Ticker = []
with open(filename, newline='') as inputfile:
    for row in csv.reader(inputfile):
        Ticker.append(row[0])

for i in Ticker:
#Obtain information for ticker company when using a single ticker
#Ticker = ['FRT']

    #Obtaining Balance Sheets
    stock = Stock(Ticker, token='##################', output_format='pandas')
    bs = stock.get_balance_sheet()

    #Obtaining the common stock outstanding on the last filing
    common_stock_outstanding = stock.get_shares_outstanding()

    #Obtaining the current price
    current_price = stock.get_price()

    #Obtaining the Market Cap
    market_cap = common_stock_outstanding * current_price

    #Obtaining the BV per share
    bs['BV'] = bs['shareholderEquity'] / common_stock_outstanding
    bs.drop(bs.iloc[:, 0:38], inplace = True, axis = 1)
    BV = pd.DataFrame(bs)
    BV.columns = ['1']
    BV = BV.reset_index(drop=True)

    #Obtaining the Price to Book
    current_price = pd.DataFrame(current_price)
    current_price = current_price.reset_index(drop=True)
    current_price.columns = ['2']
    current_price

    PtB = pd.merge(BV, current_price, left_index=True, right_index=True)

    PtB['PtB'] = PtB['2'] / PtB['1']
    PtB.drop(PtB.iloc[:, 0:2], inplace = True, axis = 1)
    PtB.columns = [Ticker]

    #Appending the PtB into the Output File
    output_file = pd.DataFrame()
    output_file = output_file.append(PtB)
    output_file

This is the return output for two Tickers RFP and APA:

 {'RFP':             accountsPayable capitalSurplus  commonStock currency  \
 2021-03-01        251000000           None     82200000      USD   

             currentAssets  currentCash  currentLongTermDebt filingType  \
 2021-03-01      900000000    113000000              2000000       10-K   

             fiscalDate  fiscalQuarter  ...  symbol  totalAssets  \
 2021-03-01  2020-12-31              4  ...     RFP   3730000000   

             totalCurrentLiabilities  totalLiabilities  treasuryStock  \
 2021-03-01                380000000        2649000000     -174000000   

                        id  key     subkey           date        updated  
 2021-03-01  BALANCE_SHEET  RFP  quarterly  1609372800000  1614775188000  

 [1 rows x 38 columns],
 'APA':             accountsPayable capitalSurplus  commonStock currency  \
 2020-07-30        459000000           None  12006000000      USD   

             currentAssets  currentCash  currentLongTermDebt filingType  \
 2020-07-30     1658000000    135000000            294000000       10-Q   

             fiscalDate  fiscalQuarter  ...  symbol  totalAssets  \
 2020-07-30  2020-06-30              2  ...     APA  12999000000   

             totalCurrentLiabilities  totalLiabilities  treasuryStock  \
 2020-07-30               1710000000       14634000000    -3189000000   

                        id  key     subkey           date        updated  
 2020-07-30  BALANCE_SHEET  APA  quarterly  1593475200000  1608330865000  

 [1 rows x 38 columns]}

I attempted to use pd.DataFrame.from_dict(bs, orient='index') to get the dict of dicts into a dataframe but recieved the error:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-17-b29d66f7b6fd> in <module>
----> 1 pd.DataFrame.from_dict(bs, orient='index')

~\Anaconda3\lib\site-packages\pandas\core\frame.py in from_dict(cls, data, orient, dtype, columns)
   1188             raise ValueError("only recognize index or columns for orient")
   1189 
-> 1190         return cls(data, index=index, columns=columns, dtype=dtype)
   1191 
   1192     def to_numpy(self, dtype=None, copy=False):

 ~\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    462                     mgr = arrays_to_mgr(arrays, columns, index, columns, dtype=dtype)
    463                 else:
--> 464                     mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
    465             else:
    466                 mgr = init_dict({}, index, columns, dtype=dtype)

 ~\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in init_ndarray(values, index, 
 columns, dtype, copy)
    169     # by definition an array here
    170     # the dtypes will be coerced to a single dtype
--> 171     values = prep_ndarray(values, copy=copy)
    172 
    173     if dtype is not None:

~\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in prep_ndarray(values, copy)
    274         try:
    275             if is_list_like(values[0]) or hasattr(values[0], "len"):
--> 276                 values = np.array([convert(v) for v in values])
    277             elif isinstance(values[0], np.ndarray) and values[0].ndim == 0:
    278                 # GH#21861

~\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
    2978             if self.columns.nlevels > 1:
    2979                 return self._getitem_multilevel(key)
 -> 2980             indexer = self.columns.get_loc(key)
    2981             if is_integer(indexer):
    2982                 indexer = [indexer]

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
    2897                 return self._engine.get_loc(key)
    2898             except KeyError:
 -> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
    2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
    2901         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

I understand the code may be constructed inefficiently but at this point I just want to understand how to work with this dict of dicts output. There has to be a way to do this but it is beyond my skills at this point. Thanks for your help.

score 0 · Answer 1 · answered Mar 22 '21 at 13:04

0

Until you provide the exact dictionary that you are trying to create a data frame from, I don't think we are going to be able to help you. Anyways, as a sample, look at the following example:

>>data = {'name': {'first':'hary', 'second':'porter'}, 'address':{'temp': 'USA', 'perm': 'UK'}}
>>data
{'name': {'first': 'hary', 'second': 'porter'}, 'address': {'temp': 'USA', 'perm': 'UK'}}
>>pd.DataFrame(data)
          name address
first     hary     NaN
second  porter     NaN
temp       NaN     USA
perm       NaN      UK

answered Mar 22 '21 at 13:04

ThePyGuy

17,779
5
18
45

Thank you for the example but that doesnt help as far as being able to iterate through a large list and get it to a df. Just using your example of pd.DataFrame() it throws a error of: --------------------------------------------------------------------------- ValueError: If using all scalar values, you must pass an index – crm2730 Mar 23 '21 at 16:46

score -1 · Answer 2 · edited Mar 22 '21 at 15:15

-1

Dataframe also take dictionary of argument.

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)

Output :

col1=1,2  col2 = 3,4

edited Mar 22 '21 at 15:15

Jin Lee

3,194
12
46
86

answered Mar 22 '21 at 13:17

Mk_8780

1
1

Converting a Dict of Dicts to Dataframe

2 Answers2