fillna in Pandas - how to choose the best method automatically?

Question

Suppose I have a dataframe whose columns are mostly NaN: each column holds only one non-NaN value (or a few identical ones), and these sit in different rows from column to column. For example:

df = pd.DataFrame({'A':[np.nan, 2, np.nan], 'B':[3.5, np.nan, 3.5], 'C':[np.nan, np.nan, 0.1]})

So how can I achieve a dataframe that looks like this?

  A    B    C
0  2  3.5  0.1
1  2  3.5  0.1
2  2  3.5  0.1

'bfill' alone would only fill columns 'B' and 'C' completely, 'ffill' alone only column 'B'...
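To see concretely what each method does on its own with the frame above, here is a minimal sketch. The names `only_ffill` and `only_bfill` are just for illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [np.nan, 2, np.nan],
                   'B': [3.5, np.nan, 3.5],
                   'C': [np.nan, np.nan, 0.1]})

# Apply each fill direction separately (illustrative variable names).
only_ffill = df.ffill()
only_bfill = df.bfill()

# Count the NaN values left in each column after each method.
print(only_ffill.isna().sum())
print(only_bfill.isna().sum())
```

Column 'A' keeps a NaN either way, because its single value sits in the middle row, so neither direction alone is enough.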

So how can I replace all the NaN values in a column with the non-NaN value that is present somewhere in that column, wherever and however many times it occurs?

I don't, but pandas just uses the last non-NaN value for ffill and the first non-NaN value for bfill, so this would just copy the neighbouring values. – durbachit, Oct 31 '21 at 10:33

score 1 · Accepted Answer · answered Oct 31 '21 at 10:32

Forward-fill, then back-fill the dataframe.

df = df.ffill().bfill()
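A self-contained sketch of the one-liner above, run on the question's example frame. One assumption worth stating: this gives a single value per column only because each column holds one distinct non-NaN value. With several different values in a column, ffill and bfill would copy the nearest neighbour instead. The name `filled` is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [np.nan, 2, np.nan],
                   'B': [3.5, np.nan, 3.5],
                   'C': [np.nan, np.nan, 0.1]})

# Forward-fill first (propagates each value downwards), then back-fill
# to cover any NaN that sits above the first valid value in a column.
filled = df.ffill().bfill()
print(filled)
```

The order of the two calls does not matter for this data, since every column contains just one distinct value.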

wwnde


Haha, I just did that, only in two separate lines, and felt rather silly; surely there must be a more elegant way of doing it. Didn't realise I can at least do it in one line :) Thanks – durbachit Oct 31 '21 at 10:35

I couldn't a few minutes ago... – durbachit Oct 31 '21 at 10:44
