I want to groupby my id column, and then sort by the year column, and then finally collapse the data in such a way that I have a list of years in the years column and a list of another data point in another column while preserving the order.
For example:
We would start with a data frame like
df = pandas.DataFrame( {
"id" : [2957, 1456, 2223, 1456, 1456, 2957, 2223, 2957, 2223] ,
"year" : [2018, 2017, 2018, 2016, 2018, 2016, 2017, 2017, 2016],
"rating" : [3, 1, 2, 2, 3, 2, 3, 1, 1] }
Which would appear in a data frame like
id year rating
0 2957 2018 3
1 1456 2015 1
2 2223 2018 2
3 1456 2016 2
4 1456 2017 3
5 2957 2016 2
6 2223 2017 3
7 2957 2017 1
8 2223 2016 1
I would then like to sort the data by year ascending, group the data by id, and then condense the data in such a way that there would be a single row for each unique id where the year and rating columns would be collapsed into lists with their orders preserved.
The end result would look something like
id year rating
0 2957 [2016, 2017, 2018] [2,1,3]
1 1456 [2015, 2016, 2017] [1,2,3]
2 2223 [2016, 2017, 2018] [1,3,2]
I understand how to group by and sort by characters, however, I am having trouble condensing the data into lists.