Python mean value for each row grouped by a column

Question

I have a DataFrame with columns A and B. I would like to add a column Result that holds, for each row, the mean of column B within that row's group in column A. The Result column in the example below shows the values I expect.

import numpy as np
import pandas as pd

cor = pd.DataFrame({'A' : [100, 100, 100, 200, 200, 300, 300, 300, 300],
                    'B' : [10, np.nan, 20, np.nan, 50, 10, 40, 60, 80],
                    'Result': [15, 15, 15, 50, 50, 47.5, 47.5, 47.5, 47.5]})
print(cor)
# This gives one mean per group, but I need the mean on every row
values = cor.groupby('A').mean()

My dataset has about 200k rows, so the solution should be reasonably fast.

`cor['Result'] = cor.groupby('A')['B'].transform('mean')` – ayhan, Sep 16 '19 at 20:57

score 0 · Answer 1 · answered Sep 16 '19 at 20:59

This should work:

import pandas as pd
import numpy as np

cor = pd.DataFrame({'A' : [100, 100, 100, 200, 200, 300, 300, 300, 300],
                    'B' : [10, np.nan, 20, np.nan, 50, 10, 40, 60, 80]})
print(cor)

# Mean of B per group in A (NaN values are skipped by default)
values = cor.groupby('A').mean().reset_index()
print(values)

# Merge the group means back onto every row; the two B columns get _x/_y suffixes
df = cor.merge(values, how='left', on='A')
df = df.rename(columns={"B_x": "B", "B_y": "Result"})
print(df)

Output:

     A     B  Result
0  100  10.0    15.0
1  100   NaN    15.0
2  100  20.0    15.0
3  200   NaN    50.0
4  200  50.0    50.0
5  300  10.0    47.5
6  300  40.0    47.5
7  300  60.0    47.5
8  300  80.0    47.5
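
For comparison, here is a minimal sketch of the one-liner suggested in the comment under the question. It assumes the same example data as above and uses `groupby(...).transform('mean')`, which returns one value per original row, so no merge or rename is needed. I have not timed it on a 200k-row dataset.

```python
import numpy as np
import pandas as pd

# Same example data as in the question (without the hand-written Result column)
cor = pd.DataFrame({'A': [100, 100, 100, 200, 200, 300, 300, 300, 300],
                    'B': [10, np.nan, 20, np.nan, 50, 10, 40, 60, 80]})

# transform('mean') computes the mean of B within each group of A and
# broadcasts it back to every row of that group; NaN values in B are skipped.
cor['Result'] = cor.groupby('A')['B'].transform('mean')
print(cor)
```

The result is aligned to the original index, so the row order of `cor` is left unchanged.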
