I am writing a program that scrapes album information from the Discogs music database. The scraper works fine.
Now I have a DataFrame with many rows that share the same artist and title and differ only in the Formats cell (see for example 'Sido', 'Ich Und Keine Maske' in the snippet of my DataFrame below).
   Interpret                    Title                             Formats
0  Afrika Bambaataa And Family  The Decade Of Darkness 1990-2000  CD, Album, RE
1  Sha Hef                      Out The Mud
2  Sido                         Ich Und Keine Maske               CD, Album
3  Sido                         Ich Und Keine Maske               2xLP, Album
...
Now I am looking for a way to combine these duplicate entries without losing information. Can somebody give me a hint? The final result should look like this:
   Interpret                    Title                             Formats
0  Afrika Bambaataa And Family  The Decade Of Darkness 1990-2000  CD, Album, RE
1  Sha Hef                      Out The Mud
2  Sido                         Ich Und Keine Maske               CD, Album, 2xLP
...
I have tried
r = dataframe.groupby('Interpret')['Formats'].apply(','.join)
but the result is a pandas Series that no longer contains the 'Title' column, so information is lost.
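
For reference, here is a minimal sketch of one possible way to get the desired result: group by both 'Interpret' and 'Title' so that neither column is dropped, then merge the format strings of each group. The sample data is rebuilt by hand from the snippet above, and `merge_formats` is a hypothetical helper name, not part of pandas. It assumes the column is called 'Formats', that an empty cell is either an empty string or NaN, and that repeated tokens such as 'Album' should appear only once.

```python
import pandas as pd

# Sample data rebuilt from the snippet above (assumed column names).
df = pd.DataFrame({
    "Interpret": ["Afrika Bambaataa And Family", "Sha Hef", "Sido", "Sido"],
    "Title": ["The Decade Of Darkness 1990-2000", "Out The Mud",
              "Ich Und Keine Maske", "Ich Und Keine Maske"],
    "Formats": ["CD, Album, RE", "", "CD, Album", "2xLP, Album"],
})


def merge_formats(values):
    """Hypothetical helper: join format tokens, keeping first-seen order
    and dropping duplicates and empty entries."""
    seen = []
    for cell in values:
        if pd.isna(cell):
            continue  # skip missing cells
        for token in str(cell).split(","):
            token = token.strip()
            if token and token not in seen:
                seen.append(token)
    return ", ".join(seen)


# Grouping by both key columns keeps 'Title' in the result;
# reset_index() turns the grouped Series back into a DataFrame.
combined = (
    df.groupby(["Interpret", "Title"], sort=False)["Formats"]
      .agg(merge_formats)
      .reset_index()
)
print(combined)
```

If duplicate tokens should be kept as they are, passing `", ".join` to `agg` instead of the helper should also work, provided every Formats cell is a string.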