I found it really hard to articulate this problem, but it is a similar problem to this one.
For example, let's say I have a table like this:
Label | Value |
---|---|
A | 1 |
B | 2 |
C | Nan |
D | Nan |
E | Nan |
F | 12 |
It needs to look like this:
Label | Value |
---|---|
A | 1 |
B | 2 |
C | 3 |
D | 3 |
E | 3 |
F | 3 |
Used the next available value (12) and divided it by the number of Nan values + 1 (12/4) = 3 and replaced the Nan values AND the original value used to interpolate(12) by 3. It is similar to the previous question but also modifies the original value used for interpolation.
test = pd.DataFrame({'Label': ['A', 'B', 'C', 'D', 'E', 'F','G','H','I'],
'Value': [1, 2, None, None, None, 12,None,None,4]})
test['break'] = np.where(test['Value'].notnull(),1,0)
test['group'] = test['break'].shift().fillna(0).cumsum()
test['Value2'] = test.groupby('group').Value.apply(lambda x: x.fillna( x.max() / len(x)))
for row in range(0,test.shape[0]):
if test['break'].iloc[row] == 0 and test['break'].iloc[row+1] == 1:
test.at[row+1, 'Value2'] = test['Value2'].iloc[row]
df.interpolate() is not capable of doing this, and this is what I have so far. It gets the job done, but is not very elegant