I have a movie rating/watched dataset with columns userId, movieId and timestamp.
I want to group the DataFrame by user so that each output row holds the movies that user watched within a certain time span (say 500 in my case), and no row contains more than 100 movies. A user can therefore appear in several output rows.
import pandas as pd

input_data = {'userId': [1, 1, 1, 2, 2, 3, 3, 3, 1, 1], 'movieId': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100], 'timestamp': [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]}
input_df = pd.DataFrame(columns=['userId', 'movieId', 'timestamp'], data=input_data)
input_df
The expected output should look like:
output_data={'userId':[1,2,3,1],'movies':[[10,20,30],[40,50],[60,70,80],[90,100]]}
output_df=pd.DataFrame(columns=['userId','movies'],data=output_data)
output_df
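
To make the requirement concrete, here is a minimal sketch of one possible reading of it. It assumes that "time" means the span between the first and the last movie in a row (a new row starts once a timestamp is more than 500 past the first timestamp of the current row), and that rows are ordered by their first timestamp. The function name `chunk_movies` and the parameters `max_span` and `max_items` are hypothetical names chosen for illustration. If the limit is instead meant as a gap between consecutive watches, the comparison would use the previous timestamp rather than the row's start.

```python
import pandas as pd


def chunk_movies(df, max_span=500, max_items=100):
    """Split each user's movies into rows limited by time span and item count.

    Assumption: a row is closed when the next timestamp is more than
    `max_span` after the row's first timestamp, or when the row already
    holds `max_items` movies.
    """
    rows = []  # (start_timestamp, userId, [movieIds])
    ordered = df.sort_values("timestamp", kind="stable")
    for user_id, grp in ordered.groupby("userId", sort=False):
        start, movies = None, []
        for movie, ts in zip(grp["movieId"], grp["timestamp"]):
            # Close the current row if adding this movie would break a limit.
            if movies and (ts - start > max_span or len(movies) >= max_items):
                rows.append((start, user_id, movies))
                movies = []
            if not movies:
                start = ts  # first movie of a new row sets the window start
            movies.append(movie)
        if movies:
            rows.append((start, user_id, movies))

    # Order rows by when they begin, so a user can reappear later.
    rows.sort(key=lambda r: r[0])
    return pd.DataFrame({
        "userId": [r[1] for r in rows],
        "movies": [r[2] for r in rows],
    })


input_df = pd.DataFrame({
    "userId": [1, 1, 1, 2, 2, 3, 3, 3, 1, 1],
    "movieId": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
    "timestamp": [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000],
})
result = chunk_movies(input_df, max_span=500, max_items=100)
print(result)
```

On the sample data this gives the four rows listed above. The explicit loop is meant to keep the two limits easy to follow; a vectorised version may be preferable for a large dataset.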