6

I have a DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(data=[676, 0, 670, 0, 668], index=['2012-01-31 00:00:00','2012-02-29 00:00:00',
                                                     '2012-03-31 00:00:00','2012-04-30 00:00:00',
                                                     '2012-05-31 00:00:00'])
df.index.name = "Date"
df.columns = ["Number"]

Which looks like:

                     Number
Date
2012-01-31 00:00:00     676
2012-02-29 00:00:00       0
2012-03-31 00:00:00     670
2012-04-30 00:00:00       0
2012-05-31 00:00:00     668

How can I replace the 2nd and 4th values with (676+670)/2 and (670+668)/2, respectively?

I could save the values as an np.array and impute them in the array, but that's ridiculous!

Ali
Ladenkov Vladislav

4 Answers

9

I use the where method on a copy of the frame in which every 0 has been replaced with np.nan. With the zeros marked as NaN, ffill and bfill fill each gap with the nearest preceding and following values, respectively. Adding the two filled frames and dividing by 2 gives the average of the neighbours, and where substitutes that average only at the NaN positions.

nan_df = df.replace(to_replace=0, value=np.nan)
nan_df.where(nan_df.notna(),
 other=(nan_df.ffill() + nan_df.bfill())/2)

                     Number
Date                       
2012-01-31 00:00:00   676.0
2012-02-29 00:00:00   673.0
2012-03-31 00:00:00   670.0
2012-04-30 00:00:00   669.0
2012-05-31 00:00:00   668.0
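
As a self-contained check of the approach above, here is a minimal sketch that rebuilds the question's frame and applies the same forward-fill/backward-fill average. The names `nan_df`, `neighbour_mean` and `result` are introduced here for illustration only, and `ffill()`/`bfill()` are used instead of `fillna(method=...)`, which newer pandas releases deprecate.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Number": [676, 0, 670, 0, 668]},
    index=pd.Index(
        ["2012-01-31 00:00:00", "2012-02-29 00:00:00", "2012-03-31 00:00:00",
         "2012-04-30 00:00:00", "2012-05-31 00:00:00"],
        name="Date",
    ),
)

# Mark zeros as missing values (nan_df is an illustrative name).
nan_df = df.replace(0, np.nan)

# Average of the nearest non-missing value before and after each row.
neighbour_mean = (nan_df.ffill() + nan_df.bfill()) / 2

# Keep values that are present; use the average where the value is missing.
result = nan_df.where(nan_df.notna(), other=neighbour_mean)
print(result)
```

Note that a zero in the first or last row has no value on one side, so it would stay NaN under this sketch.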
spies006
2
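# Use apply to replace each zero in Number with the mean of the rows directly above and below it.
# A zero in the first row is left unchanged; a zero in the last row raises an IndexError.
flat = df.reset_index()
df['Number'] = flat.apply(lambda x: flat.iloc[[x.name-1, x.name+1]]['Number'].mean()
                          if (x.name > 0) and (x.Number == 0) else x.Number,
                          axis=1).values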

df
Out[1440]: 
                     Number
Date                       
2012-01-31 00:00:00   676.0
2012-02-29 00:00:00   673.0
2012-03-31 00:00:00   670.0
2012-04-30 00:00:00   669.0
2012-05-31 00:00:00   668.0
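
The same "mean of the surrounding rows" idea can probably be written without a row-wise apply. The sketch below is an alternative, not the code from this answer: it uses `shift` to line up each row with its immediate neighbours and `mask` to overwrite only the zeros. The names `s` and `surrounding_mean` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Number": [676, 0, 670, 0, 668]},
    index=pd.Index(
        ["2012-01-31 00:00:00", "2012-02-29 00:00:00", "2012-03-31 00:00:00",
         "2012-04-30 00:00:00", "2012-05-31 00:00:00"],
        name="Date",
    ),
)

s = df["Number"]

# Mean of the rows directly above and below each position.
# The first and last positions have only one neighbour, so they are NaN here.
surrounding_mean = (s.shift(1) + s.shift(-1)) / 2

# Replace zeros with that mean; every other value is kept as it is.
df["Number"] = s.mask(s == 0, surrounding_mean)
print(df)
```

Unlike the forward-fill/backward-fill variant, this looks only at the adjacent rows, so two zeros in a row would each be averaged with a zero neighbour.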
Allen Qin
2

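@spies006's answer can be adapted so that the condition is an explicit boolean mask. Note that where keeps the values where the condition is True, so the mask must mark the non-zero entries, and the fill must run on the frame whose zeros were replaced with NaN:

nan_df = df.replace(to_replace=0, value=np.nan)
df.where(nan_df.notna(), other=(nan_df.ffill() + nan_df.bfill())/2)

The condition can be simplified to this:

df.where(df != 0, other=(nan_df.ffill() + nan_df.bfill())/2)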
trincot
1

I would use the replace() method to turn the zeros into np.nan and then call interpolate(), which by default fills each NaN linearly between its neighbours:

df.replace(to_replace=0, value=np.nan, inplace=True)
df.interpolate(inplace=True)

                     Number
Date                       
2012-01-31 00:00:00   676.0
2012-02-29 00:00:00   673.0
2012-03-31 00:00:00   670.0
2012-04-30 00:00:00   669.0
2012-05-31 00:00:00   668.0
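
For isolated zeros, as in the question, linear interpolation and the neighbour average give the same numbers. They can differ when several zeros sit next to each other. The small sketch below uses made-up data (not from the question) with two consecutive zeros to illustrate the difference; the names `s`, `interpolated` and `averaged` are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical series with two consecutive zeros between 676 and 670.
s = pd.Series([676, 0, 0, 670]).replace(0, np.nan)

# Linear interpolation spaces the filled values evenly across the gap.
interpolated = s.interpolate()

# The ffill/bfill average gives every position in the gap the same value.
averaged = s.fillna((s.ffill() + s.bfill()) / 2)

print(interpolated.tolist())
print(averaged.tolist())
```

Which behaviour is preferable depends on what the zeros stand for in the data.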
matthme