
I have the following DataFrame:

title number
abc 3
edf 4
abc 2
edf 1

How can I produce the output below, where the rows are grouped by the title column and the number values for each title are collected into a list?

title number
abc [3, 2]
edf [4, 1]

To recreate the DataFrame I am using, you can run the code below:

import pandas as pd

data = {
    "title": ['abc', 'edf', 'abc', 'edf'],
    "number": [3, 4, 2, 1],
}

df = pd.DataFrame(data=data)
R.Anchieta

1 Answer


You have to use aggregation. Grouping brings together the rows that share the same value in the title column, and the aggregation step then builds a list from the items in each group.

Here, df is your initial DataFrame object.

df.groupby("title", as_index=False).aggregate(lambda item: [i for i in item])

This code works in two steps. First, it groups by the title column, so 3 and 2 end up together because they share the same title value (abc). The same happens for 4 and 1 (edf).

Then, with the aggregate method, we tell pandas to put all the values in each group into a list.

So the final result would look like this:

title number
abc [3, 2]
edf [4, 1]
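
As a shorter alternative to the lambda, here is a minimal sketch that passes the built-in list directly to agg. It assumes the same df as in the question; the variable name result and the explicit reset_index() call are my own choices, not part of the code above.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "title": ["abc", "edf", "abc", "edf"],
    "number": [3, 4, 2, 1],
})

# Group rows by title, collect each group's "number" values into a list,
# then turn the group keys back into a regular "title" column.
result = df.groupby("title")["number"].agg(list).reset_index()

print(result)
```

Selecting the "number" column before aggregating keeps the operation limited to that column, which may matter if the DataFrame has other columns you do not want turned into lists.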