4

I need to replace a "0" in a pandas column with the previous row's non-zero value if, and only if, the value in the row following the "0" is non-zero.

For example:

101
92
78
0
107
0
0

would become:

101
92
78
78
107
0
0

Any ideas how to do this would be much appreciated :-)

Thanks!

cs95
cwse

2 Answers

4

Using `shift`, you could do:

In [608]: df.loc[(df.val == 0) & (df.val.shift(-1) != 0), 'val'] = df.val.shift(1)

In [609]: df
Out[609]:
     val
0  101.0
1   92.0
2   78.0
3   78.0
4  107.0
5    0.0
6    0.0
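
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same `shift` approach. The frame is rebuilt from the question's sample values, and the intermediate name `mask` is only for illustration. Note that it has the same last-row behaviour as the one-liner above: the final row has no successor, so `shift(-1)` yields `NaN` there.

```python
import pandas as pd

# Sample data from the question, stored as floats so the dtype does not
# change when shifted values (which may contain NaN) are assigned back.
df = pd.DataFrame({"val": [101.0, 92.0, 78.0, 0.0, 107.0, 0.0, 0.0]})

# True where the current value is 0 and the next row's value is not 0.
# shift(-1) aligns each row with the value of the row after it.
mask = (df["val"] == 0) & (df["val"].shift(-1) != 0)

# Overwrite only the masked rows with the value from the previous row.
df.loc[mask, "val"] = df["val"].shift(1)

print(df)
```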
Zero
  • The last row of `df.val.shift(-1) != 0` is always True because `NaN != 0`. That might be problematic if the 0 in the last row is preceded by a non-zero value (and assuming the OP does not wish the 0 in the last row to be forward-filled in this case). Using `(df['val'] != 0).shift(-1) == True` instead of `df.val.shift(-1) != 0` is a workaround. – unutbu Jul 13 '17 at 12:46
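
The edge case described in this comment can be sketched on a tiny made-up series in which the trailing 0 directly follows a non-zero value. The two-element data and the names `naive` and `strict` are illustrative assumptions, not part of the original answers.

```python
import pandas as pd

# Hypothetical data: a trailing 0 preceded by a non-zero value.
s = pd.Series([101.0, 0.0], name="val")

# Original condition: the last row compares NaN != 0, which is True,
# so the trailing 0 is treated as if a non-zero value followed it.
naive = (s == 0) & (s.shift(-1) != 0)

# Workaround from the comment: shift the boolean test itself, so the
# last row becomes NaN, and NaN == True evaluates to False.
strict = (s == 0) & ((s != 0).shift(-1) == True)

# Series.mask replaces values where the condition is True.
filled_naive = s.mask(naive, s.shift(1))
filled_strict = s.mask(strict, s.shift(1))

print(filled_naive.tolist())   # trailing 0 is overwritten
print(filled_strict.tolist())  # trailing 0 is left alone
```

Which behaviour is wanted depends on whether a 0 in the last row should count as "followed by a non-zero value"; the question does not say.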
1

This answer is similar to JohnGalt's, but it is faster in a timing comparison:

In [12]: np.where((df.Val.values==0)&(df.Val.shift(-1)!=0),df.Val.shift(),df.Val)
Out[31]: array([ 101.,   92.,   78.,   78.,  107.,    0.,    0.])

In [24]: %timeit np.where((df.Val.values==0)&(df.Val.shift(-1)!=0),df.Val.shift(),df.Val)
1000 loops, best of 3: 671 µs per loop

In [25]: %timeit df.loc[(df.Val == 0) & (df.Val.shift(-1) != 0), 'Val'] = df.Val.shift(1)
100 loops, best of 3: 2.01 ms per loop
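
A self-contained sketch of the `np.where` variant, assuming the column is named `Val` as in the session above and using the question's sample values. Unlike the `.loc` assignment, `np.where` returns a new NumPy array and leaves the frame untouched, so the result has to be assigned back if the column should be updated. The helper names `prev_val`, `next_val`, and `result` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, under the column name used in this answer.
df = pd.DataFrame({"Val": [101.0, 92.0, 78.0, 0.0, 107.0, 0.0, 0.0]})

prev_val = df["Val"].shift(1)    # value from the previous row
next_val = df["Val"].shift(-1)   # value from the next row (NaN for the last row)

# Where the value is 0 and the next value is not 0, take the previous
# value; otherwise keep the current value.
result = np.where((df["Val"] == 0) & (next_val != 0), prev_val, df["Val"])

# Assign back only if the column itself should change.
df["Val"] = result
print(df)
```

Timings like the ones above depend on the pandas and NumPy versions and on the size of the frame, so they are worth re-measuring on real data.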
shivsn