I want to plot time-series data with Datashader and make the plot interactive, using Bokeh or Datashader itself if that is possible.

I followed the tutorial at http://datashader.org/user_guide/3_Timeseries.html and reproduced the graph shown at the very end of that page.

Here is the code:

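import numpy as np
import datashader as ds
import datashader.transfer_functions as tf
from datashader.utils import dataframe_from_multiple_sequences

n = 100000   # number of curves
points = 10  # samples per curve
data = np.random.normal(0, 100, size=(n, points))
# Stack all curves into one x/y dataframe, separated by NaN rows
df = dataframe_from_multiple_sequences(np.arange(points), data)
cvs = ds.Canvas(plot_height=400, plot_width=1000)
agg = cvs.line(df, 'x', 'y', ds.count())  # per-pixel count of lines
img = tf.shade(agg, how='eq_hist')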

The code above produces an image object. How do I turn this img object into an interactive graph (using Bokeh or Datashader) that shows the x and y axes, shows the details of each point on hover, and supports zooming in and out?

Also, the original data has multiple columns, but for plotting, the columns are stacked as rows in a single dataframe, separated by a NaN row (as in the code above). Is it possible to plot the columns in different colors in the interactive graph, so that each column is easy to distinguish in the Datashader graph?

Please help.

zubug55

1 Answer

It's easy to make an interactive Bokeh plot out of that using HoloViews:

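import numpy as np, holoviews as hv
# Import the helper explicitly; ds.utils may not be available as an attribute
from datashader.utils import dataframe_from_multiple_sequences
from holoviews.operation.datashader import datashade

n = 100000   # number of curves
points = 10  # samples per curve
data = np.random.normal(0, 100, size=(n, points))
df = dataframe_from_multiple_sequences(np.arange(points), data)

hv.extension("bokeh")
# Re-renders the datashaded image on every zoom or pan
datashade(hv.Curve(df)).options(width=1000)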

[Images: the resulting plot, unzoomed and zoomed]

For the coloring, how many different colors do you need? 100,000 colors can't be distinguished by humans, but I have work in progress at https://github.com/pyviz/colorcet/issues/11 to get at least a few hundred distinguishable colors. If you only need a few dozen (e.g. to color by category) the existing color cycles should work fine. The data would somehow need to indicate the category first...
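
As a rough illustration of what "indicating the category" could look like, here is a minimal sketch that stacks curves into NaN-separated x/y arrays and attaches one label to every row. The helper name `to_long_form` and the labels are hypothetical, not part of Datashader, and only NumPy is used:

```python
import numpy as np

def to_long_form(x, curves, labels):
    """Stack curves into NaN-separated x/y arrays plus a per-row label.

    x:      shape (points,), x values shared by all curves
    curves: shape (n_curves, points), one row per curve
    labels: length n_curves, one (hypothetical) category label per curve
    """
    x = np.asarray(x, dtype=float)
    curves = np.asarray(curves, dtype=float)
    n_curves, points = curves.shape
    # One extra NaN column, so every curve ends with a separator row
    sep = np.full((n_curves, 1), np.nan)
    ys = np.hstack([curves, sep]).ravel()
    xs = np.hstack([np.tile(x, (n_curves, 1)), sep]).ravel()
    # Repeat each label over all rows of its curve, separator included
    cats = np.repeat(np.asarray(labels), points + 1)
    return xs, ys, cats

rng = np.random.default_rng(0)
x = np.arange(10)
curves = rng.normal(0, 100, size=(6, 10))
labels = ["a", "a", "b", "b", "c", "c"]
xs, ys, cats = to_long_form(x, curves, labels)
```

The three arrays could then be put into a dataframe with `cats` as a categorical column and aggregated per category (Datashader has a `count_cat` reduction for categorical columns), so that each category gets its own color. That last step is not shown here and would need checking against the Datashader version in use.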

James A. Bednar
  • Thanks. Can you tell me how to explicitly specify the colors I want to use in the above graph? Also, I want to see the point values when I hover over the graph; how do I do that? Finally, is it possible to enable or disable individual lines (one per column), so that I can view each trend individually or all at once, for example the way it's done in Plotly: https://plot.ly/python/time-series/ – zubug55 Oct 11 '18 at 01:56
  • I saw http://holoviews.org/user_guide/Plotting_with_Bokeh.html for hovering and for toggling legend entries, but how do I do this for the above type of dataframe, in which a multi-column dataframe is converted into a single-column dataframe with the curves separated by a row of NaN? – zubug55 Oct 11 '18 at 02:32
  • Also, in the graph you have shown above, how can you distinguish the different lines? I can only see 2 colors. – zubug55 Oct 11 '18 at 02:34
  • I'm not sure what you're after here. Datashader is needed when you have many different lines, e.g. 100,000 as above. When you have 100,000 lines, there wouldn't be any meaningful way for the user to select between individual lines. If you have only a few lines, just use HoloViews or Bokeh or Plotly directly; you won't need Datashader and you can select lines and see their values on hovering. – James A. Bednar Oct 11 '18 at 17:20
  • Datashader turns your lines into pixels, and so hovering (like the colors here) will reveal what's happening at each pixel -- the dark blue pixels have two lines crossing in that pixel, the light blue ones have one line crossing that pixel, and the white ones have no lines crossing that pixel. – James A. Bednar Oct 11 '18 at 17:20
  • So Datashader is complementary to what you can get with other plotting programs -- lots of information about what happens per pixel, at the expense of information per line. You can also mix the two -- show a datashaded image behind a HoloViews plot of a few dozen lines; those few lines will be fully hoverable and selectable, and you can see how they relate to the whole population of lines that's represented using Datashader. – James A. Bednar Oct 11 '18 at 17:20
  • So, I have a dataframe with multiple columns (potentially a very large number of them) that I want to plot using Datashader. This link, http://datashader.org/user_guide/3_Timeseries.html, says: "Datashader can render arbitrarily many separate curves, limited only by what you can fit into a Dask dataframe. Instead of having a dataframe with one column per curve, you would instead use a single column for 'x' and one for 'y', with an extra row containing a NaN value to separate each curve from its neighbor (so that no line will connect between them)." So it can plot a million curves. – zubug55 Oct 11 '18 at 21:49
  • So how is Datashader different in this case from plotting directly with Plotly/Bokeh/HoloViews etc.? – zubug55 Oct 11 '18 at 21:51
  • You also mentioned that Datashader converts my 10k columns into 10k lines, which get converted into pixels, so dark blue means 2 lines crossing, light blue means one line, etc. In this case, how do we know how many lines are actually under the dark blue pixels, and so on? – zubug55 Oct 11 '18 at 22:04
  • Can you please give an example of how to mix the two, i.e. the datashaded image behind a HoloViews plot that you mentioned? I'm really keen to draw graphs using Datashader. I have very large time-series data with many columns, and I want to draw a fully interactive graph from it. – zubug55 Oct 11 '18 at 22:15
  • You can add a colorbar if you use `rasterize` instead of `datashade`; see https://anaconda.org/jbednar/datashade_vs_rasterize/notebook . You can have a fully interactive graph if by that you mean that zooming and panning work. You can't select or hover on 10000 individual datashaded lines, because they have been turned into pixels before your browser sees them. plotly/bokeh/holoviews will pass all 10000 lines to the browser; datashader won't. – James A. Bednar Oct 12 '18 at 22:50
  • http://holoviews.org/user_guide/Large_Data.html has an example of mixing the two. – James A. Bednar Oct 12 '18 at 22:51
  • Is there any way to figure out what these colors represent? If there are 3, 4, 5, etc. colors, what information about the time series does the first graph convey? I want to know what the light blue color means and what the dark blue color means, with a legend at the side saying there are 5 colors present in this graph. – zubug55 Oct 17 '18 at 20:47
  • Or in short, I just want to know which time-series information each color represents, i.e. what a particular color conveys. – zubug55 Oct 17 '18 at 21:08
  • As of 2023-05-04 `AttributeError: module 'datashader' has no attribute 'utils'` – endolith May 04 '23 at 14:13
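
To make the per-pixel counting described in the comments above concrete, here is a toy sketch in plain NumPy. It is not Datashader's algorithm (each curve is only sampled once per pixel column, so steep segments are under-sampled), and the function name `count_lines_per_pixel` is made up for illustration. It only shows how a grid of counts arises, where 0 would be drawn white, 1 light blue and 2 dark blue:

```python
import numpy as np

def count_lines_per_pixel(x, curves, width, height, y_range):
    """Toy rasterizer: count how many curves touch each pixel.

    Illustrative only, not Datashader's implementation. Row 0 of the
    result corresponds to y_range[0].
    """
    x = np.asarray(x, dtype=float)
    y_min, y_max = y_range
    # x position sampled for each pixel column
    cols_x = np.linspace(x[0], x[-1], width)
    cols = np.arange(width)
    counts = np.zeros((height, width), dtype=int)
    for curve in curves:
        # Linearly interpolate the curve at every pixel column
        y = np.interp(cols_x, x, curve)
        rows = ((y - y_min) / (y_max - y_min) * (height - 1)).round().astype(int)
        inside = (rows >= 0) & (rows < height)
        # A curve hits one row per column, so it adds at most 1 per pixel
        counts[rows[inside], cols[inside]] += 1
    return counts

# Two straight lines that cross in the middle of a 5x5 canvas
x = [0.0, 1.0]
curves = np.array([[0.0, 1.0],
                   [1.0, 0.0]])
counts = count_lines_per_pixel(x, curves, width=5, height=5, y_range=(0.0, 1.0))
```

Here `counts[2, 2]` is 2 (both lines pass through the centre pixel) and every other touched pixel holds 1. In Datashader itself, the `agg` object returned by `cvs.line(df, 'x', 'y', ds.count())` holds the analogous per-pixel counts, so inspecting `agg` rather than the shaded image is one way to find out how many lines lie under a given color.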