pandas: fillna whole df with groupby

Question

I have the following df with a lot more number columns. I now want to make a forward filling for all the columns in the dataframe but grouped by id.

id  date  number    number2
1   2001    4         11
1   2002    4         45
1   2003    NaN       13
2   2001    7         NaN
2   2002    8         2

The result should look like this:

id  date  number    number2
1   2001    4         11
1   2002    4         45
1   2003    4         13
2   2001    7         NaN
2   2002    8         2

I tried the following command:

df= df.groupby("id").fillna(method="ffill", limit=2)

However, this raises a KeyError "isin". Filling just one column with the following command works just fine, but how can I efficiently forward fill the whole df grouped by isin?

df["number"]= df.groupby("id")["number"].fillna(method="ffill", limit=2)

Can you post the full error message? – ayhan Jan 22 '18 at 15:29 — ayhan, Jan 22 '18 at 15:29

score 1 · Accepted Answer · answered Jan 22 '18 at 15:26

1

You can use:

df = df.groupby("id").apply(lambda x: x.ffill(limit=2))
print (df)
   id  date  number  number2
0   1  2001     4.0     11.0
1   1  2002     4.0     45.0
2   1  2003     4.0     13.0
3   2  2001     7.0      NaN
4   2  2002     8.0      2.0

Also for me working:

df.groupby("id").fillna(method="ffill", limit=2)

so I think is necessary upgrade pandas.

answered Jan 22 '18 at 15:26

jezrael

822,522
95
1,334
1,252

Thanks, I did upgrade pandas and the fillna command works. However, it also drops the id column in the new dataframe. How could I avoid that? – freddy888 Jan 22 '18 at 15:55
Interesting, for me it working nice. `.reset_index()` does work? – jezrael Jan 22 '18 at 16:02
No, still drops the id column. But the apply function works for me. – freddy888 Jan 22 '18 at 16:07
@FriedrichFranz Same here. groupby.fillna drops the id column but groupby.ffill or groupby.bfill seem to not do that. Seems like inconsistent behaviour. – Roobie Nuby Mar 21 '18 at 22:49

score 0 · Answer 2 · answered Jan 22 '18 at 15:27

0

ffill can be use directly

df.groupby('id').ffill(2)
Out[423]: 
   id  date  number  number2
0   1  2001     4.0     11.0
1   1  2002     4.0     45.0
2   1  2003     4.0     13.0
3   2  2001     7.0      NaN
4   2  2002     8.0      2.0

#isin

#df.loc[:,df.columns.isin([''])]=df.loc[:,df.columns.isin([''])].groupby('id').ffill(2)

answered Jan 22 '18 at 15:27

BENY

317,841
20
164
234

Hi @beny, I am trying to use kinda same method for planets dataset for imputation, but it's not imputing all the values. Not sure why: https://stackoverflow.com/questions/73449902/fill-in-missing-values-with-groupby/73450241 – Roy Aug 26 '22 at 19:00

pandas: fillna whole df with groupby

2 Answers2