I'm trying to fill the NaNs in Total by taking the previous Total and subtracting the current row's Change.

            Change  Total
01/01/2021     -12    100
02/01/2021     -54    154
03/01/2021     -23    177
04/01/2021      -2    NaN
05/01/2021     -54    NaN
06/01/2021     -72    NaN

Desired output:

            Change  Total
01/01/2021     -12    100
02/01/2021     -54    154
03/01/2021     -23    177
04/01/2021      -2    179
05/01/2021     -54    233
06/01/2021     -72    305

I've tried various ways of manipulating ffill(), but with no success:

df['Total'] = df['Total'].fillna(method = 'ffill' - df['Change'])

Is there a better way to attempt this? Any help much appreciated!

spcol

4 Answers

The expected output looks like ffill minus cumsum. Forward fill the last valid Total, then subtract the cumulative sum of Change taken over the NaN rows only:

# Select NaN rows
m = df['Total'].isna()
# Update NaN rows with the last valid Total minus the running sum of Change over those rows
df.loc[m, 'Total'] = df['Total'].ffill() - df.loc[m, 'Change'].cumsum()

df:

            Change  Total
01/01/2021     -12  100.0
02/01/2021     -54  154.0
03/01/2021     -23  177.0
04/01/2021      -2  179.0
05/01/2021     -54  233.0
06/01/2021     -72  305.0
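
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the frame from the question is rebuilt by hand with the dates kept as plain string labels; the names `last_valid` and `running_change` are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (assumption: dates are plain string labels).
df = pd.DataFrame(
    {
        "Change": [-12, -54, -23, -2, -54, -72],
        "Total": [100, 154, 177, np.nan, np.nan, np.nan],
    },
    index=["01/01/2021", "02/01/2021", "03/01/2021",
           "04/01/2021", "05/01/2021", "06/01/2021"],
)

# Rows whose Total is missing
m = df["Total"].isna()

# Last known Total carried forward into the NaN rows
last_valid = df["Total"].ffill()

# Running sum of Change over the NaN rows only (indexed by those rows)
running_change = df.loc[m, "Change"].cumsum()

# The subtraction aligns on the index, so only the NaN rows get a value
df.loc[m, "Total"] = last_valid - running_change
print(df)
```

Because the subtraction aligns on the index, the rows that already had a Total are untouched; only the rows selected by `m` are overwritten.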
Henry Ecker

You can use pd.Series.where:

df["Total"] = df["Total"].where(df["Total"].notnull(),
                                df['Total'].ffill() - df.loc[df['Total'].isnull(), 'Change'].cumsum())

print(df)

            Change  Total
01/01/2021     -12  100.0
02/01/2021     -54  154.0
03/01/2021     -23  177.0
04/01/2021      -2  179.0
05/01/2021     -54  233.0
06/01/2021     -72  305.0
Henry Yik

Let us try combine_first:

df = df.combine_first(df[['Total']].ffill().sub(df.loc[df.Total.isnull(),'Change'].cumsum(),axis=0))
df
            Change  Total
01/01/2021     -12  100.0
02/01/2021     -54  154.0
03/01/2021     -23  177.0
04/01/2021      -2  179.0
05/01/2021     -54  233.0
06/01/2021     -72  305.0
BENY

You can use np.where together with a forward fill, as follows:

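import numpy as np
# Change values for the rows where Total is missing
change_when_total_null = df.loc[df['Total'].isnull(), 'Change']
# .ffill() replaces fillna(method='ffill'), which is deprecated in recent pandas
df['Total'] = np.where(df['Total'].isnull(), df['Total'].ffill() - change_when_total_null.cumsum(), df['Total'])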

prints:

            Change  Total
01/01/2021     -12  100.0
02/01/2021     -54  154.0
03/01/2021     -23  177.0
04/01/2021      -2  179.0
05/01/2021     -54  233.0
06/01/2021     -72  305.0
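
One caveat that applies to all of the approaches above: the cumulative sum runs over every NaN row in the column, so if the data has more than one separate block of NaNs, the second block would also subtract the Change values of the first. The question only shows a single trailing block, so this may not matter there. A hedged sketch of one way to restart the running sum at each gap follows; `fill_total` and `gap_id` are hypothetical names, and the sample frame is made up purely for illustration.

```python
import numpy as np
import pandas as pd


def fill_total(df):
    """Fill NaN Totals as last valid Total minus the running Change, per gap."""
    out = df.copy()
    m = out["Total"].isna()
    # Count the valid Totals seen so far; every NaN run that follows a
    # valid Total shares that Total's counter value, so each gap is a group.
    gap_id = out["Total"].notna().cumsum()
    # Zero out Change on rows that already have a Total, then take the
    # cumulative sum inside each gap so it restarts after every valid Total.
    running_change = out["Change"].where(m, 0).groupby(gap_id).cumsum()
    out.loc[m, "Total"] = out["Total"].ffill() - running_change
    return out


# Made-up data with two separate gaps
demo = pd.DataFrame({
    "Change": [-12, -2, -54, -10, -5],
    "Total": [100, np.nan, np.nan, 50, np.nan],
})
result = fill_total(demo)
print(result)
```

With a single block of NaNs this gives the same result as the ffill minus cumsum versions. NaNs that come before any valid Total have nothing to carry forward and stay NaN.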
sophocles