How can I transform a list of nested dictionaries into a pandas dataframe?

Question

The list to convert to a df:

final_list = [{'ID1': {'word': '4', 'talk': '4'}}, {'ID2': {'cat': '3', 'dog': '3'}}, {'ID3': {'potatoes': '8', 'height': '6'}}]

Intended output

       Word     Number  Category 
0      word     4       ID1
1      talk     4       ID1
2      cat      3       ID2
3      dog      3       ID2
4      potatoes 8       ID3
5      height   6       ID3

I have already created a dataframe with the desired Word and Number columns. From this dataframe, I am trying to add the keys of final_list as a third column, 'Category'. My attempt does not work: after the loop, every row holds only the last key. I include it just to show my train of thought.

My coding attempt

df = pd.DataFrame([(a, b) for item in another_list for a, b in item.items()], 
                   columns=['Word','Number'])

## add the last desired column (failed attempt)
for item in final_list:
    for k,v in item.items():
        df_events["Category"] = k
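
A minimal sketch of why the loop above leaves only the last key: assigning a scalar to a column sets that value in every row, so each pass overwrites the whole column. The two-row `df` and the hard-coded key list here are made-up stand-ins, not the original data.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the real one
df = pd.DataFrame({"Word": ["word", "cat"], "Number": ["4", "3"]})

for key in ["ID1", "ID2", "ID3"]:
    # A scalar assigned to a column is broadcast to every row,
    # so each iteration replaces the entire 'Category' column.
    df["Category"] = key

print(df["Category"].tolist())  # ['ID3', 'ID3']
```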

(1) First code sample (defining `final_list`) has syntax error. (2) Second code sample refers to undefined `another_list`. (3) Defined `df` but used undefined `df_events` in second code sample. — Michael Butscher, Dec 06 '19 at 12:19
Hi Michael, I think the syntax error in final_list is simply that some of the words are not represented as strings. They should be strings, in between quotes. — blah, Dec 06 '19 at 12:22
Right, but you should provide clean code with which people can work to help you. — Michael Butscher, Dec 06 '19 at 12:23
Secondly, my code is to show my train of thought - as I already described, it is based on a different list ( I will not paste the entire list). I'm wondering if there is a different way to get the intended output, even if it means to ignore my code. — blah, Dec 06 '19 at 12:23
Please read how to make a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve). — Michael Butscher, Dec 06 '19 at 12:23

score 3 · Accepted Answer · answered Dec 06 '19 at 12:21

You need to add another `for` clause to flatten the inner dictionaries into a list of tuples:

df = pd.DataFrame([(k,v, a) for item in final_list 
                            for a, b in item.items() 
                            for k, v in b.items()],
                   columns=['Word','Number','Category'])
print(df)

       Word Number Category
0      word      4      ID1
1      talk      4      ID1
2       cat      3      ID2
3       dog      3      ID2
4  potatoes      8      ID3
5    height      6      ID3
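
The three `for` clauses in the comprehension read left to right like nested loops. As a sketch, the same thing written out as explicit loops might look like this (it assumes `final_list` with the missing quote restored; the names `rows`, `category`, `inner`, `word` and `number` are only illustrative):

```python
import pandas as pd

final_list = [{'ID1': {'word': '4', 'talk': '4'}},
              {'ID2': {'cat': '3', 'dog': '3'}},
              {'ID3': {'potatoes': '8', 'height': '6'}}]

rows = []
for item in final_list:                      # each single-key outer dict
    for category, inner in item.items():     # e.g. 'ID1', {'word': '4', ...}
        for word, number in inner.items():   # e.g. 'word', '4'
            rows.append((word, number, category))

df = pd.DataFrame(rows, columns=['Word', 'Number', 'Category'])
print(df)
```

Because the category is attached to each row while that row is being built, no second pass over the dataframe is needed.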

Thank you jezrael! Had no idea how to deal with a list of nested dictionaries and convert it in the desired df - this worked out really nicely. — blah, Dec 06 '19 at 12:29

score 1 · Answer 2 · answered Dec 06 '19 at 12:23

# flatten the dictionary
flat_dict = {key: val for dct in final_list for key, val in dct.items()}
# generate dataframe
df = pd.DataFrame.from_dict(flat_dict).stack().reset_index()
# set column names
df.columns = ['Word', 'Category', 'Number']
print(df)

       Word Category Number
0       cat      ID2      3
1       dog      ID2      3
2    height      ID3      6
3  potatoes      ID3      8
4      talk      ID1      4
5      word      ID1      4
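
For readers following the steps above, here is a hedged sketch of what each one produces, again assuming `final_list` with the missing quote restored. The intermediate names `wide` and `stacked` are illustrative, and the explicit `dropna()` is an added safeguard, since whether `stack()` discards the empty cells by itself may depend on the pandas version.

```python
import pandas as pd

final_list = [{'ID1': {'word': '4', 'talk': '4'}},
              {'ID2': {'cat': '3', 'dog': '3'}},
              {'ID3': {'potatoes': '8', 'height': '6'}}]

# Merge the single-key dicts into one: {'ID1': {...}, 'ID2': {...}, 'ID3': {...}}
flat_dict = {key: val for dct in final_list for key, val in dct.items()}

# Outer keys become columns, inner keys become the row index;
# a word that does not belong to a given ID gets NaN in that column.
wide = pd.DataFrame.from_dict(flat_dict)

# stack() moves the columns into a second index level, leaving one value
# per (word, ID) pair; dropna() removes any NaN cells that survive.
stacked = wide.stack().dropna().reset_index()
stacked.columns = ['Word', 'Category', 'Number']

# Reorder the columns to match the layout asked for in the question
df = stacked[['Word', 'Number', 'Category']]
print(df)
```

The row order comes from the index of `wide`, so it may differ from the order of `final_list`.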
