forward fill (ffill) on a multiindex

Question

This is my dataframe, I have a multiindex on time and ID.

               +------+-------+----------+-----------+
               | col1 | col2  | col3     | col4      |
+-------+------+------+-------+----------+-----------+
| ID    | t    |      |       |          |           |
+-------+------+------+-------+----------+-----------+
| id1   | t1   | 10   | nan   |    nan   |    1      |
| id1   | t2   | 10   | 110   |      1   |    nan    |
| id1   | t3   | 12   | nan   |    nan   |    nan    |
| id2   | t1   | 12   | 109   |     15   |    1      |
| id2   | t4   | 12   | 109   |    nan   |    1      |
| id2   | t7   | 20   | nan   |    nan   |    nan    |
+-------+------+------+-------+----------+-----------+

Is it possible to do a multi-index fwd fill only on col3 and col4?

               +------+-------+----------+-----------+
               | col1 | col2  | col3     | col4      |
+-------+------+------+-------+----------+-----------+
| ID    | t    |      |       |          |           |
+-------+------+------+-------+----------+-----------+
| id1   | t1   | 10   | nan   |    nan   |     1     |
| id1   | t2   | 10   | 110   |    1     |     1     |
| id1   | t3   | 12   | nan   |    1     |     1     |
| id2   | t1   | 12   | 109   |    15    |     1     |
| id2   | t4   | 12   | 109   |    15    |     1     |
| id2   | t7   | 20   | nan   |    15    |     1     |
+-------+------+------+-------+----------+-----------+

What I've tried so far:

df[['col3','col4']].ffill()  #how to account for the multiindex?
df[['col3','col4']].fillna(df.groupby(['ID','t'])[['col3', 'col4']].ffill()) #did not work
df.reindex(['ID','t'], method='ffill') #this is probably incomplete, and I got 'expected Tuple, got str'

Should it be filled by group? (if the second 1 in col4 was nan, should it be filled?) — mozway, Sep 09 '21 at 14:19
It should be filled by group. So if the second 1 in col4 is nan, it should be nan-1-1 for id2 — noiivice, Sep 09 '21 at 14:37

kelvt · Answer 1 · 2021-09-10T05:31:15.147

I think you were close with the second attempt. I would assume you only want to group by the first level of the multi-index, ID, since you want the column values to fill forward regardless of the second level of the multi-index, t. We can use a groupby on the first level of the multi-index as below:

df1 = df.groupby(level=0)[['col3', 'col4']].ffill()

To join these results back:

df.drop(['col3', 'col4'], axis=1, inplace=True)
df = df.join(df1)

score 0 · Answer 2 · answered Sep 10 '21 at 05:36

IIUC use:

df.update(df.groupby(level=0)[['col3', 'col4']].ffill())
print (df)
        col1   col2  col3  col4
ID  t                          
id1 t1    10    NaN   NaN   1.0
    t2    10  110.0   1.0   1.0
    t3    12    NaN   1.0   1.0
id2 t1    12  109.0  15.0   1.0
    t4    12  109.0  15.0   1.0
    t7    20    NaN  15.0   1.0

forward fill (ffill) on a multiindex

2 Answers2