
I have a DataFrame df as follows:

import numpy as np
import pandas as pd
df = pd.DataFrame({"A":[0,np.nan,0,0,np.nan,1,np.nan,1,0,np.nan],
                   "B":[0,1,0,0,1,1,1,0,0,0]})

Now I need to replace the NaN values in column A with the value of column B from the row above. For example, the 2nd row of column A should become 0, the 7th row should become 1, and so on.

I defined this function, but it doesn't work when I try to apply it to the DataFrame:

def impute_with_previous_B(df):
    for x in range(len(df)):
        if pd.isnull(df.loc[x,"A"]) == True:
            df.loc[x,"A"] = df.loc[x-1,"B"]

df["A"] = df.apply(lambda x: impute_with_previous_B(x),axis=1)

Can you please tell me what is wrong with that function?

Vladimir Fokow
szaki
  • 1) `x` is an index (from 0 to `len(df)`), but you are using the `.loc` accessor. 2) The main problem though is: don't call your function inside of `.apply()`. Just call it on its own. 3) Avoid iteration with pandas. – Vladimir Fokow Aug 27 '22 at 12:06
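
The points in the comment above can be illustrated with a minimal sketch. With `axis=1`, `.apply()` passes each row to the function as a Series rather than the whole DataFrame, and the function returns nothing, so assigning the result to `df["A"]` cannot work. Below, the function is called once on the DataFrame itself and uses positional `.iloc` access. Starting the loop at 1 and the parameter name `frame` are assumptions added here, not part of the original question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [0, np.nan, 0, 0, np.nan, 1, np.nan, 1, 0, np.nan],
                   "B": [0, 1, 0, 0, 1, 1, 1, 0, 0, 0]})

def impute_with_previous_B(frame):
    """Fill NaN in column A with column B's value from the previous row, in place."""
    a_pos = frame.columns.get_loc("A")
    b_pos = frame.columns.get_loc("B")
    # Start at 1: the first row has no previous row (assumption: leave it as is).
    for i in range(1, len(frame)):
        if pd.isnull(frame.iloc[i, a_pos]):
            frame.iloc[i, a_pos] = frame.iloc[i - 1, b_pos]

# Call the function directly on the DataFrame; do not wrap it in .apply().
impute_with_previous_B(df)
print(df)
```

This keeps the explicit loop only to mirror the question; for larger frames a vectorized approach is usually preferable.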

1 Answer

df['A'] = df['A'].fillna(df['B'].shift())


     A  B
0  0.0  0
1  0.0  1
2  0.0  0
3  0.0  0
4  0.0  1
5  1.0  1
6  1.0  1
7  1.0  0
8  0.0  0
9  0.0  0
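
To see why the one-liner works, it may help to look at the shifted column on its own. The sketch below rebuilds the example frame from the question; the names `prev_b` and `filled` are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [0, np.nan, 0, 0, np.nan, 1, np.nan, 1, 0, np.nan],
                   "B": [0, 1, 0, 0, 1, 1, 1, 0, 0, 0]})

# shift() moves every value of B down one row, so row i holds B from row i-1.
# The first row has no predecessor and becomes NaN.
prev_b = df["B"].shift()

# fillna aligns on the index: each NaN in A is replaced by the value of
# prev_b at the same row label; non-NaN values of A are left untouched.
filled = df["A"].fillna(prev_b)

# Show the original column, the shifted helper, and the result side by side.
print(pd.DataFrame({"A": df["A"], "prev_B": prev_b, "filled": filled}))
```

Note that if the first row of A were NaN it would stay NaN, because the shifted column has no value to offer for that row.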
Vladimir Fokow