Find the mean based on similar other columns in Pandas

Question

I'm looking for a way to compute the mean of a numeric column for each distinct value in another column. I looked at this link for advice, but that approach requires all values in one column to be the same, and I'm sure there's a Pythonic way to do this for every group of duplicate values.

Here's an example.

import pandas as pd

data = {
  "Name": ["John", "John", "Robert", "Robert", "Cindy", "Cindy", "Sarah", "Sarah"],
  "Score": [84, 45, 67, 87, 88, 100, 76, 91]
}

# load data into a DataFrame object
df = pd.DataFrame(data)

df

I'd like the result to have one row per name holding that person's mean score: one row for John with John's mean, and likewise for Robert, Cindy, and Sarah.

Thanks!

`df.groupby('Name', as_index=False, sort=False).mean()` – mozway, Nov 03 '22 at 19:48
Thanks everyone for your awesome responses! – E_Sarousi, Nov 04 '22 at 17:11

score 1 · Accepted Answer · answered Nov 03 '22 at 19:48

# group the rows by Name, then take the mean of Score within each group
df.groupby('Name', as_index=False)['Score'].mean()

     Name  Score
0   Cindy   94.0
1    John   64.5
2  Robert   77.0
3   Sarah   83.5
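
Note that the output above is sorted alphabetically by Name, because `groupby` sorts the group keys by default. If you'd rather keep the names in the order they first appear in the question's data, passing `sort=False` (as in mozway's comment) should do it. A minimal self-contained sketch; the variable name `means` is just for illustration:

```python
import pandas as pd

data = {
    "Name": ["John", "John", "Robert", "Robert", "Cindy", "Cindy", "Sarah", "Sarah"],
    "Score": [84, 45, 67, 87, 88, 100, 76, 91],
}
df = pd.DataFrame(data)

# sort=False keeps groups in order of first appearance instead of sorting the keys;
# as_index=False keeps Name as a regular column rather than the index
means = df.groupby("Name", as_index=False, sort=False)["Score"].mean()
print(means)
```

With this data the rows should come out as John (64.5), Robert (77.0), Cindy (94.0), Sarah (83.5).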

Naveed


score 1 · Answer 2 · answered Nov 03 '22 at 19:50

# aggregate Score per Name with an explicit column -> function mapping
new_df = df.groupby(['Name'], as_index=False).agg({'Score': pd.Series.mean})
new_df

     Name  Score
0   Cindy   94.0
1    John   64.5
2  Robert   77.0
3   Sarah   83.5
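
The `agg` form is mostly useful when you want more than one statistic per group. As a rough sketch, here is the same idea using named aggregation and the string `"mean"` in place of `pd.Series.mean` (recent pandas versions may emit a warning when a callable like `pd.Series.mean` is passed). The output column names `mean_score` and `n_scores` are made up for this example:

```python
import pandas as pd

data = {
    "Name": ["John", "John", "Robert", "Robert", "Cindy", "Cindy", "Sarah", "Sarah"],
    "Score": [84, 45, 67, 87, 88, 100, 76, 91],
}
df = pd.DataFrame(data)

# named aggregation: output_column=(input_column, aggregation)
summary = df.groupby("Name", as_index=False).agg(
    mean_score=("Score", "mean"),   # average score per name
    n_scores=("Score", "count"),    # how many scores went into that average
)
print(summary)
```

Each name has two scores in the sample data, so `n_scores` should be 2 on every row.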

andrew
