I'm working with data that contains null values. I want to build a time-series plot from the cumulative sum of one column, sales. The conditions for the cumulative sum on sales are: (1) if the first row is null, fillna(0) and then cumsum(), so the plot always starts from the origin; (2) if null rows follow each other all the way to the end, leave them as null, otherwise fillna(0):
import pandas as pd

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
        'quantity': [10, 21, 20, 10, 39, 30, 31, 45, 23, 56],
        'sales': [None, 41, None, None, 32, 0, 31, None, None, None]}
df = pd.DataFrame(data)
df = df.set_index('year')
# Plain cumsum: null rows stay NaN, so the conditions above are not applied yet
df['cum_sales'] = df['sales'].cumsum()
print(df)
df.plot()
How can I apply these conditions so that the result becomes:
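One possible reading of the two conditions, as a minimal sketch rather than a definitive answer: treat every null that has a later non-null value (including a null first row) as 0, and keep the run of nulls at the very end as null. The `trailing` mask is a name introduced here for illustration. It relies on the assumption that a backward fill leaves only the trailing nulls unfilled.

```python
import pandas as pd

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
        'quantity': [10, 21, 20, 10, 39, 30, 31, 45, 23, 56],
        'sales': [None, 41, None, None, 32, 0, 31, None, None, None]}
df = pd.DataFrame(data).set_index('year')

sales = df['sales']

# Assumption: a null is "trailing" if no non-null value comes after it.
# Backward fill fills every null that has a later value, so whatever is
# still null afterwards belongs to the trailing run.
trailing = sales.bfill().isna()

# Fill all nulls with 0, accumulate, then blank out the trailing run again.
df['cum_sales'] = sales.fillna(0).cumsum().mask(trailing)

print(df)
```

Under this reading, cum_sales for the sample data is 0, 41, 41, 41, 73, 73, 104 for 2010 to 2016 and null for 2017 to 2019, so df['cum_sales'].plot() would start at the origin and stop at the last year with data. Note that if every row of sales were null, the whole column would stay null.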