Forward-fill missing values within each group with the previous value

Question

I have three columns of data: the first is a category id, and the second and third contain some missing values. I want to group by the first column and then, within each group, fill the missing values in the third column using the 'ffill' method.

I found a good idea here: "Pandas: filling missing values by weighted average in each group!", but it didn't solve my problem because the output it produced was not what I wanted.

Run the following code to build an example:

import pandas as pd
import numpy as np
df = pd.DataFrame({'name': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'value': [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3],
                   'sss': [1, np.nan, 3, np.nan, np.nan, np.nan, 2, np.nan, np.nan]})
df
Out[13]:
    name    value   sss
0   A      1.0     1.0
1   A      NaN     NaN
2   B      NaN     3.0
3   B      2.0     NaN
4   B      3.0     NaN
5   B      1.0     NaN
6   C      3.0     2.0
7   C      NaN     NaN
8   C      3.0     NaN

Fill missing values with the previous value after grouping

Then I ran the following code, but it produces strange results:

df["sss"] = df.groupby("name").transform(lambda x: x.fillna(axis = 0,method = 'ffill'))
df
Out[13]:
    name    value   sss
0   A      1.0     1.0
1   A      NaN     1.0
2   B      NaN     NaN
3   B      2.0     2.0
4   B      3.0     3.0
5   B      1.0     1.0
6   C      3.0     3.0
7   C      NaN     3.0
8   C      3.0     3.0

The result I want is this:

Out[13]:
    name    value   sss
0   A      1.0     1.0
1   A      NaN     1.0
2   B      NaN     3.0
3   B      2.0     3.0
4   B      3.0     3.0
5   B      1.0     3.0
6   C      3.0     2.0
7   C      NaN     2.0
8   C      3.0     2.0

Can someone point out where I went wrong?

What about `df["sss"]=df.groupby('name').sss.ffill()`? Your method can be changed to `df.groupby("name").sss.transform(lambda x: x.fillna(axis = 0,method = 'ffill'))` – anky, May 17 '19 at 09:37
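
To spell out the comment's suggestion, here is a minimal sketch using the example data from the question. A likely explanation for the strange output: `df.groupby("name").transform(...)` runs on every non-grouping column, so it returns a DataFrame with both 'value' and 'sss'. The 'sss' column in the strange output matches 'value' forward-filled within each group, which suggests the first column of that result is what got assigned. Newer pandas versions appear to reject that assignment with an error instead. Selecting the 'sss' column before filling avoids the problem. The variable name `filled_all` is only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "A", "B", "B", "B", "B", "C", "C", "C"],
    "value": [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3],
    "sss": [1, np.nan, 3, np.nan, np.nan, np.nan, 2, np.nan, np.nan],
})

# Transforming the whole groupby object fills every non-grouping column,
# so the result is a two-column DataFrame, not a single Series.
filled_all = df.groupby("name").transform(lambda x: x.ffill())
print(list(filled_all.columns))

# Select the one column to fill, then forward-fill within each group.
# The result is a Series aligned to the original index.
df["sss"] = df.groupby("name")["sss"].ffill()
print(df)
```

Only 'sss' is modified here, so the missing entries in 'value' stay as they are. Calling `.ffill()` directly also avoids the `method=` argument of `fillna`, which recent pandas releases deprecate.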
