Suppose I have a DataFrame whose columns are mostly NaN: each column holds a single non-NaN value (or a few identical ones), but those values sit in different rows from column to column. For example:

df = pd.DataFrame({'A':[np.nan, 2, np.nan], 'B':[3.5, np.nan, 3.5], 'C':[np.nan, np.nan, 0.1]})

So how can I achieve a dataframe that looks like this?

  A    B    C
0  2  3.5  0.1
1  2  3.5  0.1
2  2  3.5  0.1

'bfill' alone would completely fill only columns 'B' and 'C', 'ffill' alone only column 'B', and neither fills column 'A'...
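To illustrate the problem, here is a minimal sketch that applies each fill direction on its own to the example frame and reports which columns still contain NaN. The variable names `forward` and `backward` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [np.nan, 2, np.nan],
                   'B': [3.5, np.nan, 3.5],
                   'C': [np.nan, np.nan, 0.1]})

# Forward fill only propagates values downwards,
# so leading NaNs in a column are left untouched.
forward = df.ffill()

# Backward fill only propagates values upwards,
# so trailing NaNs in a column are left untouched.
backward = df.bfill()

# Columns that still contain at least one NaN after each single pass.
print(forward.isna().any())
print(backward.isna().any())
```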

So how can I replace all the NaN values in each column with the non-NaN value of that column, wherever it appears and however many times it occurs?

durbachit

1 Answer

Forward-fill, then backward-fill the DataFrame.

df = df.ffill().bfill()
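As a self-contained sketch, the one-liner can be run on the example frame from the question like this. The name `filled` is only illustrative. This relies on the question's premise that each column holds a single distinct non-NaN value; a column with several different values would not end up constant, and a column that is entirely NaN would stay NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [np.nan, 2, np.nan],
                   'B': [3.5, np.nan, 3.5],
                   'C': [np.nan, np.nan, 0.1]})

# ffill() copies each value down into the NaNs below it;
# bfill() then copies values up into any NaNs still left at the top.
filled = df.ffill().bfill()

print(filled)
```

The order of the two calls does not matter here, because every column has only one distinct value to propagate.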
wwnde
    Haha, I just did that, only in two separate lines, and felt rather silly; surely there must be a more elegant way of doing it. I didn't realise I could at least do it in one line :) Thanks – durbachit Oct 31 '21 at 10:35
    I couldn't a few minutes ago... – durbachit Oct 31 '21 at 10:44