Is there an optimal way to do something like this?

Let's say I have the following DataFrame:

    A   B
0   1   1
1   1   2
2   2   3
3   2   4
4   2   5

I would like to get a dictionary like this:

{1: [1, 2], 2:[3, 4, 5]}

Keep in mind that the lists have different lengths because the value 1 appears two times and the value 2 appears three times. If I try

df.set_index('A').to_dict('list')

Pandas only keeps the last value in B for each value in A, returning the following dict:

{1: [2], 2: [5]}
sgaseretto

2 Answers

Use DataFrame.groupby on A, select column B, collect each group's values with GroupBy.apply(list), and convert the resulting Series with Series.to_dict:

d = df.groupby('A')['B'].apply(list).to_dict()
print(d)
{1: [1, 2], 2: [3, 4, 5]}
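
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the table in the question; the question itself does not show how df was built.

```python
import pandas as pd

# Assumed reconstruction of the question's DataFrame (columns A and B).
df = pd.DataFrame({"A": [1, 1, 2, 2, 2], "B": [1, 2, 3, 4, 5]})

# Group rows by A, collect each group's B values into a list,
# then turn the resulting Series (indexed by A) into a dict.
d = df.groupby("A")["B"].apply(list).to_dict()
print(d)
```

The printed dictionary should match the one requested in the question. Note that groupby sorts the group keys by default, so the keys come out in ascending order of A.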
jezrael

You could group by A and then convert the values in B to a list:

result = {key: group['B'].tolist() for key, group in df.groupby('A')}
print(result)

Output

{1: [1, 2], 2: [3, 4, 5]}
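
The comprehension works because iterating over a groupby object yields (key, sub-DataFrame) pairs. The sketch below spells that out as an explicit loop; the DataFrame construction is an assumption based on the table in the question.

```python
import pandas as pd

# Assumed reconstruction of the question's DataFrame (columns A and B).
df = pd.DataFrame({"A": [1, 1, 2, 2, 2], "B": [1, 2, 3, 4, 5]})

result = {}
for key, group in df.groupby("A"):
    # `group` is a DataFrame holding only the rows where A == key.
    result[key] = group["B"].tolist()

print(result)
```

This is equivalent to the one-line dict comprehension; the loop form is just easier to step through when debugging.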
Dani Mesejo