2

I have a dictionary that looks like this:

dic = {'a': {'b': [1,2], 'c': [3,4]}, 'A': {'B': [10,20], 'C': [30, 40]}}

I would like to get a two-dimensional DataFrame with three columns that looks like this:

'a' 'b'  1  
'a' 'b'  2  
'a' 'c'  3  
'a' 'c'  4  
'A' 'B'  10  
'A' 'B'  20  
'A' 'C'  30  
'A' 'C'  40  
some_programmer
user25640
  • @Xilpex the question is specifically how to make `pandas` do this (which is why the term `dataframe` is being used and the question is tagged with `pandas`). – Karl Knechtel Apr 20 '20 at 19:47

3 Answers

7

You can try this:

s = pd.DataFrame(dic).stack().explode().reset_index()
  level_0 level_1   0
0       b       a   1
1       b       a   2
2       c       a   3
3       c       a   4
4       B       A  10
5       B       A  20
6       C       A  30
7       C       A  40
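Note that the result above lists the inner key before the outer key, while the question asks for the outer key first. A minimal sketch of one way to reorder and label the columns is below. The column labels `outer`, `inner`, and `value` are hypothetical names chosen for illustration, and the extra `dropna()` is a precaution in case `stack()` keeps the missing key combinations in a given pandas version.

```python
import pandas as pd

dic = {'a': {'b': [1, 2], 'c': [3, 4]}, 'A': {'B': [10, 20], 'C': [30, 40]}}

long_df = (
    pd.DataFrame(dic)   # inner keys become the index, outer keys the columns
    .stack()            # one entry per (inner key, outer key) pair
    .dropna()           # discard pairs that are absent from dic, if stack kept them
    .explode()          # one row per list element (needs pandas 0.25 or newer)
    .reset_index()
)

# Hypothetical labels; after reset_index the order is inner key, outer key, value.
long_df.columns = ['inner', 'outer', 'value']
long_df = long_df[['outer', 'inner', 'value']]

print(long_df)
```

Because `explode` leaves the values with `object` dtype, something like `long_df['value'].astype(int)` may be wanted afterwards if numeric operations follow.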
halfer
BENY
  • Thank you, that seems like a very neat solution. Unfortunately explode only works for version 0.25 and up, and we are using 0.24.2. Is there a nice way to do this in older versions? – user25640 Apr 21 '20 at 07:08
  • @user25640 See my self-defined function: https://stackoverflow.com/questions/53218931/how-to-unnest-explode-a-column-in-a-pandas-dataframe/53218939#53218939 – BENY Apr 21 '20 at 13:05
1

Using list comprehension:

import pandas as pd

dic = {'a': {'b': [1,2], 'c': [3,4]}, 'A': {'B': [10,20], 'C': [30, 40]}}

data = [
    (val_1, val_2, val_3)
    for val_1, nest_dic in dic.items()
    for val_2, nest_list in nest_dic.items()
    for val_3 in nest_list
]
df = pd.DataFrame(data)

print(df)
# Output:
#    0  1   2
# 0  a  b   1
# 1  a  b   2
# 2  a  c   3
# 3  a  c   4
# 4  A  B  10
# 5  A  B  20
# 6  A  C  30
# 7  A  C  40
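The same comprehension can also label the columns by passing `columns=` to the `DataFrame` constructor. The sketch below assumes the hypothetical labels `outer`, `inner`, and `value`. Since it does not rely on `explode`, it should also suit pandas versions older than 0.25.

```python
import pandas as pd

dic = {'a': {'b': [1, 2], 'c': [3, 4]}, 'A': {'B': [10, 20], 'C': [30, 40]}}

# One tuple per (outer key, inner key, list element).
rows = [
    (outer, inner, value)
    for outer, inner_dict in dic.items()
    for inner, values in inner_dict.items()
    for value in values
]

# Column labels are hypothetical; pick whatever suits your data.
df_named = pd.DataFrame(rows, columns=['outer', 'inner', 'value'])

print(df_named)
```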
Xukrao
1

Like this maybe:

In [1845]: pd.concat({k: pd.DataFrame(v).T for k, v in dic.items()},axis=0).reset_index()                                                                                                                   
Out[1845]: 
  level_0 level_1   0   1
0       a       b   1   2
1       a       c   3   4
2       A       B  10  20
3       A       C  30  40
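The output above is in wide form, with one column per list position, whereas the question asks for one row per value. A possible follow-up, sketched under the assumption that all inner lists have the same length, is to stack the positional columns and drop the position level. The column labels are hypothetical.

```python
import pandas as pd

dic = {'a': {'b': [1, 2], 'c': [3, 4]}, 'A': {'B': [10, 20], 'C': [30, 40]}}

# Same wide frame as above, before reset_index: rows are (outer key, inner key).
wide = pd.concat({k: pd.DataFrame(v).T for k, v in dic.items()}, axis=0)

long_from_wide = (
    wide.stack()                        # adds the list position as a third index level
    .reset_index(level=-1, drop=True)   # drop the position level
    .reset_index()                      # outer and inner keys become columns
)
long_from_wide.columns = ['outer', 'inner', 'value']  # hypothetical labels

print(long_from_wide)
```

If the lists differ in length, the wide frame contains missing values, so a `dropna()` and a cast back to integers would likely be needed.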
Mayank Porwal