
What is the best way to do year-over-year line charts with daily data in Bokeh?

Currently I'm adding a dateline column (arbitrarily set to 2016) and a year column to the initial dataframe of daily values. I then pivot to wide format by year, fill in the NAs (the missing data varies across years), and build the Bokeh graph line by line across the year columns:

Say I have a table of three years of data:

Columns: Date and Value

import pandas as pd
from bokeh.plotting import figure

df = df.set_index('Date')

# Map every date onto the same (leap) year so all years share one x axis
df['dateline'] = df.index.to_series().dt.strftime('%d-%b-2016')
df['year'] = df.index.to_series().dt.strftime('%Y')

# Wide format: one row per calendar day, one column per year
pv = pd.pivot_table(df, index=df['dateline'], columns=df.index.year,
                    values='value', aggfunc='sum')

pv.index = pd.to_datetime(pv.index, format='%d-%b-%Y')
pv.sort_index(inplace=True)
pv = pv.ffill(limit=4)  # forward-fill gaps of up to 4 days

p = figure(x_axis_type='datetime')

# legend_label replaces the older legend argument in recent Bokeh versions
p.line(x=pv.index, y=pv[2017], line_width=1.5, line_color="red", legend_label='2017')
p.line(x=pv.index, y=pv[2016], line_width=1.5, line_color="blue", legend_label='2016')
p.line(x=pv.index, y=pv[2015], line_width=1.5, line_color="green", legend_label='2015')
p.line(x=pv.index, y=pv[2014], line_width=1.5, line_color="orange", legend_label='2014')

My question is: can this be optimized further? I would like to use the hover tool in the future, so what would be the best setup? The next step would be to loop over the year columns, but do I need to go that route?

Coming from R, I would like to keep the data in long format and do something like:

p.line(df, x='dateline' , y = 'value' , color = 'year')

Thanks for the tips.

1 Answer


One solution is to create a year column and a day-of-year column from your dates using the .dt accessors.

Be sure that df['date'] is a datetime column.
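
If the dates were read in as strings (for example from a CSV), a conversion along these lines may be needed first. This is only a minimal sketch; the sample frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: dates stored as plain strings
df = pd.DataFrame({
    "date": ["2014-01-31", "2014-02-28", "2015-01-31"],
    "value": [1.0, 2.0, 3.0],
})

# Convert to datetime64 so that the .dt accessor becomes available
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

# Quick check that the conversion worked
is_datetime = pd.api.types.is_datetime64_any_dtype(df["date"])
```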

df['year'] = df['date'].dt.year
df['dayofyear'] = df['date'].dt.dayofyear

df.head()

            year     value  dayofyear
date                                 
2014-01-31  2014  1.964372         31
2014-02-28  2014  2.386228         59
2014-03-31  2014  2.695743         90
2014-04-30  2014  2.712133        120
2014-05-31  2014  2.033271        150


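# Note: bokeh.charts was later moved out of Bokeh into the separate bkcharts package
from bokeh.charts import Line
from bokeh.io import show

p = Line(df, x='dayofyear', y='value', color='year')
show(p)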
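
On Bokeh versions without bokeh.charts, roughly the same chart can be built with bokeh.plotting by keeping the data in long format and drawing one line per year. The following is a sketch under assumptions, not the original answer's method: the daily data is randomly generated, the palette choice is arbitrary, and the hover formatter key syntax ("@date") assumes a reasonably recent Bokeh release.

```python
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.palettes import Category10
from bokeh.plotting import figure

# Made-up daily data covering three years (illustration only)
dates = pd.date_range("2014-01-01", "2016-12-31", freq="D")
rng = np.random.default_rng(0)
df = pd.DataFrame({"date": dates, "value": rng.normal(size=len(dates)).cumsum()})

# Long format: one row per day, tagged with its year and day of year
df["year"] = df["date"].dt.year
df["dayofyear"] = df["date"].dt.dayofyear

p = figure(x_axis_label="day of year", y_axis_label="value")

years = sorted(df["year"].unique())
palette = Category10[10]
for i, year in enumerate(years):
    # Each year gets its own data source, so groups may differ in length
    source = ColumnDataSource(df[df["year"] == year])
    p.line(x="dayofyear", y="value", source=source, line_width=1.5,
           color=palette[i % len(palette)], legend_label=str(year))

# Hover shows the real date and the value for the point under the cursor
p.add_tools(HoverTool(
    tooltips=[("date", "@date{%F}"), ("value", "@value{0.00}")],
    formatters={"@date": "datetime"},
))
```

Calling show(p) (imported from bokeh.plotting) should then render the figure. Because each year has its own source, the years do not need the same number of observations and no pivot is required.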


Scott Boston
  • Great, thank you. Would it then be possible to format the x axis from dayofyear to '%d-%b' (day, month) and make the same format available to the hover tool? – python_analysis Apr 18 '17 at 19:10
  • Yes, you should be able to label those ticks as you see fit. If you don't mind and you found this helpful, would you [accept](http://stackoverflow.com/help/someone-answers) this answer? – Scott Boston Apr 18 '17 at 19:13
  • @python_analysis see this SO [post](http://stackoverflow.com/questions/37173230/how-do-i-use-custom-labels-for-ticks-in-bokeh). – Scott Boston Apr 18 '17 at 19:14
  • After accepting the answer I see the following error: line() got multiple values for argument 'x'. I got the same error when I tried the same solution as yours but with dateline (%d-%b) instead of dayofyear. Is this because I don't have the same number of values for each group? In that case, do I have to go the pivot table route and plot across the columns? – python_analysis Apr 18 '17 at 20:03
  • Sorry, dateline being (%d-%b-2016) instead of dayofyear. – python_analysis Apr 18 '17 at 20:12
  • I'll start a new question. – python_analysis Apr 18 '17 at 20:22
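
Regarding the follow-up about a '%d-%b' axis and hover format, one possible approach is to map every date onto a common leap year (the dateline idea from the question) and plot against a datetime axis. This is a hedged sketch rather than a confirmed fix for the error above: the data is made up, and passing plain strings to DatetimeTickFormatter assumes Bokeh 3.x (older releases expected lists of strings).

```python
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource, DatetimeTickFormatter, HoverTool
from bokeh.plotting import figure

# Made-up daily data for two years (illustration only)
dates = pd.date_range("2015-01-01", "2016-12-31", freq="D")
df = pd.DataFrame({"date": dates, "value": np.arange(len(dates), dtype=float)})
df["year"] = df["date"].dt.year

# Put every observation on the same leap year so Feb 29 stays valid
df["dateline"] = pd.to_datetime(df["date"].dt.strftime("2016-%m-%d"),
                                format="%Y-%m-%d")

p = figure(x_axis_type="datetime")
for year, color in zip([2015, 2016], ["green", "blue"]):
    source = ColumnDataSource(df[df["year"] == year])
    p.line(x="dateline", y="value", source=source, line_width=1.5,
           color=color, legend_label=str(year))

# Show day-month on the axis ticks (string arguments assume Bokeh 3.x)
p.xaxis.formatter = DatetimeTickFormatter(days="%d-%b", months="%d-%b")

# Use the same day-month format in the hover tooltip
p.add_tools(HoverTool(
    tooltips=[("day", "@dateline{%d-%b}"), ("year", "@year"), ("value", "@value")],
    formatters={"@dateline": "datetime"},
))
```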