I have a pandas DataFrame like the following (but with 1,000 different IDs):
import pandas as pd

df1 = pd.DataFrame({'ID': [1,1,1,2,2,2,2,3,3,3,4,4,4,4,5,5],
                    'VALUE': ['first', 'second', 'third',
                              'second', 'second', 'first', 'fourth',
                              'first', 'second', 'first',
                              'third', 'third', 'third', 'first',
                              'second', 'first']})
I want every row to carry the first VALUE of its ID group, while keeping all the rows and the ID column. This is the current df1:
    ID   VALUE
0    1   first
1    1  second
2    1   third
3    2  second
4    2  second
5    2   first
6    2  fourth
7    3   first
8    3  second
9    3   first
10   4   third
11   4   third
12   4   third
13   4   first
14   5  second
15   5   first
Expected Outcome:
    ID   VALUE
0    1   first
1    1   first
2    1   first
3    2  second
4    2  second
5    2  second
6    2  second
7    3   first
8    3   first
9    3   first
10   4   third
11   4   third
12   4   third
13   4   third
14   5  second
15   5  second
I tried using df1.groupby('ID').first(), but I can't assign the result to a new column in df1: the result has one row per ID rather than one row per original row, so the assignment fails because operands could not be broadcast together with different shapes.
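To make the shape mismatch concrete, here is a minimal sketch using the sample data above. It also shows one approach that seems to fit the expected outcome, groupby(...).transform('first'), which returns a result aligned to the original index instead of one row per group. The column name FIRST_VALUE is only a placeholder chosen for illustration, and the sketch assumes the rows are already in the order that defines "first".

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1,1,1,2,2,2,2,3,3,3,4,4,4,4,5,5],
                    'VALUE': ['first', 'second', 'third',
                              'second', 'second', 'first', 'fourth',
                              'first', 'second', 'first',
                              'third', 'third', 'third', 'first',
                              'second', 'first']})

# first() collapses each group to a single row, so this has one row
# per ID and is indexed by ID, not by df1's original row index.
firsts = df1.groupby('ID').first()

# transform('first') broadcasts each group's first VALUE back to every
# row of that group, so the result has the same length and index as df1.
# FIRST_VALUE is a hypothetical column name used only for this sketch.
df1['FIRST_VALUE'] = df1.groupby('ID')['VALUE'].transform('first')

print(firsts.shape)  # rows per group result vs. rows in df1
print(df1)
```

If the original VALUE column should be overwritten rather than kept alongside the new one, the same expression could be assigned back to df1['VALUE'] instead.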