Filling cell values horizontally in Pandas dataframe

Question

I know about bfill and ffill to fill values in rows of the same column. But how do you do it when you need to fill values across certain multiple columns in a dataframe?

Here's the example:

Initial df:

import pandas as pd
inidf = [('Prod', ['P1', 'P2']),
 ('A', ['1', '1']),
 ('1', ['', '40']),
 ('2', ['10', '60']),
 ('3', ['30', '']),
 ('B', ['1', '2']),             
 ]
df = pd.DataFrame.from_items(inidf)
df

  Prod  A   1   2   3  B
0   P1  1      10  30  1
1   P2  1  40  60      2

Target df:

tgtdf = [('Prod', ['P1', 'P2']),
 ('A', ['1', '1']),
 ('1', ['10', '40']),
 ('2', ['10', '60']),
 ('3', ['30', '60']),
 ('B', ['1', '2']),             
 ]
df2 = pd.DataFrame.from_items(tgtdf)
df2

  Prod  A   1   2   3  B
0   P1  1  10  10  30  1
1   P2  1  40  60  60  2

In my example above, the columns to be targeted are Columns named 1, 2 and 3. In the first row, the first target column (named 1) has a missing value and is copied from the next populated Column in this case (named 2). In the second row, last target column (named 3) has a missing value and is copied from the previous populated Column in this case (named 2).

is necessary find rows for `fill` and for `bfill` ? Is possible if no value in `2` column sometimes is replace by `bfill` and sometimes by `ffill`? There is multiple columns? — jezrael, Aug 13 '17 at 16:08

jezrael · Answer 1 · 2017-08-13T16:21:42.663

You can use replace first for convert empty spaces to NaNs.

Then select rows for bfill and for ffill replacing with axis=1 for replace by rows:

df = df.replace('', np.nan)
bfill_rows = [0] #if necessary specify more values of index
ffill_rows = [1] #if necessary specify more values of index

df.loc[bfill_rows] = df.loc[bfill_rows].bfill(axis=1)
df.loc[ffill_rows] = df.loc[ffill_rows].ffill(axis=1)
print (df)
  Prod  A   1   2   3  B
0   P1  1  10  10  30  1
1   P2  1  40  60  60  2

If necessary is possible specify columns also:

df = df.replace('', np.nan)
cols = ['1','2','3']
bfill_rows = [0]
ffill_rows = [1]

df.loc[bfill_rows, cols] = df.loc[bfill_rows, cols].bfill(axis=1)
df.loc[ffill_rows, cols] = df.loc[ffill_rows, cols].ffill(axis=1)
print (df)

  Prod  A   1   2   3  B
0   P1  1  10  10  30  1
1   P2  1  40  60  60  2

hmmm, unfortunately only one answer can be accepted... :( – jezrael Aug 13 '17 at 16:59 — jezrael, Aug 13 '17 at 16:59

score 2 · Accepted Answer · answered Aug 13 '17 at 15:57

2

Replace all blanks with NaNs and first ffill and then bfill on axis=1 for columns '1','2','3'

In [31]: df[['1','2','3']] = df[['1','2','3']].replace('', np.nan).ffill(1).bfill(1)

In [32]: df
Out[32]:
  Prod  A   1   2   3  B
0   P1  1  10  10  30  1
1   P2  1  40  60  60  2

answered Aug 13 '17 at 15:57

Zero

74,117
18
147
154

Works great. Thank you, John – Jdoe Aug 13 '17 at 16:56

Alexander · Answer 3 · 2017-08-13T16:02:57.400

First of all, replace the empty quotes with NaN values. Then ffill or bfill as needed, specifying axis=0. The axis is 0 when selecting a given row because the result of such a selection is a series. If you were to select multiple rows (e.g. the entire dataframe), then the axis would be 1

df = df.replace('', np.nan)
df.iloc[0, :].bfill(axis=0, inplace=True)  # Backfill first row.
df.iloc[1, :].ffill(axis=0, inplace=True)  # Forwardfill second row.

>>> df
  Prod  A   1   2   3  B
0   P1  1  10  10  30  1
1   P2  1  40  60  60  2

Filling cell values horizontally in Pandas dataframe

3 Answers3