3

The list to convert to a df:

final_list = [{'ID1': {'word': '4', 'talk': '4'}}, {'ID2': {'cat': '3', 'dog': '3'}}, {'ID3': {'potatoes': '8', 'height': '6'}}]

Intended output

       Word     Number  Category 
0      word     4       ID1
1      talk     4       ID1
2      cat      3       ID2
3      dog      3       ID2
4      potatoes 8       ID3
5      height   6       ID3

I have already created a dataframe with the desired Word and Number columns. From that dataframe, I am trying to add the keys of final_list as a third column, 'Category'. My attempt below does not work: when looping, I only end up with the last key. I include it just to show my train of thought.

My coding attempt

df = pd.DataFrame([(a, b) for item in another_list for a, b in item.items()], 
                   columns=['Word','Number'])

## add the last desired column (failed attempt)
for item in final_list:
    for k,v in item.items():
        df_events["Category"] = k


blah
  • (1) First code sample (defining `final_list`) has syntax error. (2) Second code sample refers to undefined `another_list`. (3) Defined `df` but used undefined `df_events` in second code sample. – Michael Butscher Dec 06 '19 at 12:19
  • Hi Michael, I think the syntax error in final_list is simply that some of the words are not represented as strings. They should be strings, in between quotes. – blah Dec 06 '19 at 12:22
  • Right, but you should provide clean code with which people can work to help you. – Michael Butscher Dec 06 '19 at 12:23
  • Secondly, my code is to show my train of thought - as I already described, it is based on a different list ( I will not paste the entire list). I'm wondering if there is a different way to get the intended output, even if it means to ignore my code. – blah Dec 06 '19 at 12:23
  • Please read how to make a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve). – Michael Butscher Dec 06 '19 at 12:23

2 Answers

3

You need to add another `for` clause that flattens the inner dictionaries, so the comprehension yields a list of (word, number, category) tuples:

df = pd.DataFrame([(k,v, a) for item in final_list 
                            for a, b in item.items() 
                            for k, v in b.items()],
                   columns=['Word','Number','Category'])
print (df)
       Word Number Category
0      word      4      ID1
1      talk      4      ID1
2       cat      3      ID2
3       dog      3      ID2
4  potatoes      8      ID3
5    height      6      ID3
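
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the missing quote after `'4` in the question's `final_list` is a typo, and the names `demo`, `rows`, `category`, `words`, `word` and `number` are only illustrative. It first shows why the loop in the question keeps only the last key (a scalar assigned to a column is broadcast to every row), then spells out the comprehension above as explicit loops.

```python
import pandas as pd

# final_list from the question, with the quote after '4 restored (assumed typo)
final_list = [
    {'ID1': {'word': '4', 'talk': '4'}},
    {'ID2': {'cat': '3', 'dog': '3'}},
    {'ID3': {'potatoes': '8', 'height': '6'}},
]

# Why the loop in the question fails: assigning a scalar to a column
# broadcasts it to every row, so each iteration overwrites the previous one.
demo = pd.DataFrame({'Word': ['word', 'cat']})
for category in ['ID1', 'ID2']:
    demo['Category'] = category
# demo['Category'] now holds 'ID2' in both rows

# The comprehension written as explicit loops: one row per inner
# (word, number) pair, carrying the outer key along as the category.
rows = []
for item in final_list:
    for category, words in item.items():
        for word, number in words.items():
            rows.append((word, number, category))

df = pd.DataFrame(rows, columns=['Word', 'Number', 'Category'])
print(df)
```

The rows come out in the same order as the input because the loops walk `final_list` front to back.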
jezrael
  • Thank you jezrael! I had no idea how to deal with a list of nested dictionaries and convert it into the desired df - this worked out really nicely. – blah Dec 06 '19 at 12:29
1
# flatten the dictionary
flat_dict = {key: val for dct in final_list for key, val in dct.items()}
# generate dataframe
df = pd.DataFrame.from_dict(flat_dict).stack().reset_index()
# set column names
df.columns = ['Word', 'Category', 'Number']
print(df)

       Word Category Number
0       cat      ID2      3
1       dog      ID2      3
2    height      ID3      6
3  potatoes      ID3      8
4      talk      ID1      4
5      word      ID1      4
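
A note on this approach: the intermediate frame is wide, with one column per ID and NaN wherever a word does not belong to that ID, and the output above comes out ordered by the index rather than in input order. As far as I know, how `stack()` treats those NaN cells and orders the result can differ between pandas versions, so the sketch below is a cautious variant rather than the answer's exact code: it adds an explicit `dropna()` (harmless if the NaN rows are already gone) and an explicit sort. The names `wide` and `tidy` are only illustrative, and `final_list` is assumed to be the question's list with the quote typo fixed.

```python
import pandas as pd

# final_list from the question, with the quote after '4 restored (assumed typo)
final_list = [
    {'ID1': {'word': '4', 'talk': '4'}},
    {'ID2': {'cat': '3', 'dog': '3'}},
    {'ID3': {'potatoes': '8', 'height': '6'}},
]

# Merge the single-key dictionaries into one {ID: {word: number}} mapping
flat_dict = {key: val for dct in final_list for key, val in dct.items()}

# Wide frame: index = words, columns = IDs, NaN where a word is not in that ID
wide = pd.DataFrame.from_dict(flat_dict)

# Long format: drop the NaN cells explicitly, then name the columns
tidy = wide.stack().dropna().reset_index()
tidy.columns = ['Word', 'Category', 'Number']

# Make the row order deterministic instead of relying on stack's ordering
tidy = tidy.sort_values(['Category', 'Word'], ignore_index=True)
print(tidy)
```

Sorting by Category and then Word groups the rows by ID, but it does not restore the original word order within each ID; the comprehension-based answer keeps that order.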
mcsoini