1

I have a df like below:

df = pd.DataFrame({'id': ['a','b','c','d'],
                   'ac' : ['123','223', np.nan, np.nan],
                   'prev' : [np.nan, np.nan, 'a','b']})

Output:

    id  ac   prev
0   a   123  NaN  
1   b   223  NaN  
2   c   NaN  a  
3   d   NaN  b

For ac are null, get prev's value, and then look up at the id column. Fill in the null with value at ac column.

Expected output:

    id  ac   prev
0   a   123  NaN
1   b   223  NaN
2   c   123  a
3   d   223  b

How do I achieve this? Thanks.

Jun_1812
  • 13
  • 3
  • 1
    Does this answer your question? [forward fill specific columns in pandas dataframe](https://stackoverflow.com/questions/27012151/forward-fill-specific-columns-in-pandas-dataframe) – Trenton McKinney Sep 07 '20 at 14:27
  • Does this answer your question? [pandas ffill based on condition in another column](https://stackoverflow.com/questions/48816457) – Trenton McKinney Sep 07 '20 at 14:31
  • 1
    @TrentonMcKinney The 2 suggested for forward / backward fill. I need to look up the value from the row where id = prev so the suggested answer cannot achieve what I needed. – Jun_1812 Sep 09 '20 at 02:44

2 Answers2

2

You can use Series.isna to create a boolean mask the use boolean indexing with this mask to map the values of prev column to ac column based on id:

m = df['ac'].isna()
df.loc[m, 'ac'] = df.loc[m, 'prev'].map(df.set_index('id')['ac'])

Result:

  id   ac prev
0  a  123  NaN
1  b  223  NaN
2  c  123    a
3  d  223    b
Shubham Sharma
  • 68,127
  • 6
  • 24
  • 53
1

if I understood correctly you would like to fill the nan values of the 'ac' column with the value given by the 'id' indicated by the corresponding value at the 'prev' column. If so you could try using 'id' as your index, so that you can acces the rows that you are interested in by using .loc

import pandas as pd
import numpy as np

df = pd.DataFrame({'id': ['a', 'b', 'c', 'd'],
                   'ac': ['123', '223', np.nan, np.nan],
                   'prev': [np.nan, np.nan, 'a', 'b']})


df.set_index('id', inplace=True)
to_fill = df.loc[df['ac'].isnull()]
fill_with = df.loc[to_fill['prev'], 'ac'].copy()
fill_with = fill_with.values
df.loc[to_fill.index, 'ac'] = fill_with

Which ends up being:

print(df)
     ac prev
id          
a   123  NaN
b   223  NaN
c   123    a
d   223    b

Note that this code works if your 'id' column has no duplicated values. Hope I helped!

BrunoSE
  • 94
  • 4