I want to select data from some pandas DataFrame in a Jupyter-notebook through a SelectionRangeSlider
and plot the filtered data using holoviews bar chart.
Consider the following example:
import numpy as np
import pandas as pd
import datetime
import holoviews as hv
hv.extension('bokeh')
import ipywidgets as widgets
start = int(datetime.datetime(2017,1,1).strftime("%s"))
end = int(datetime.datetime(2017,12,31).strftime("%s"))
size = 100
rints = np.random.randint(start, end + 1, size = size)
df = pd.DataFrame(rints, columns = ['zeit'])
df["bytes"] = np.random.randint(5,20,size=size)
df['who']= np.random.choice(['John', 'Paul', 'George', 'Ringo'], len(df))
df["zeit"] = pd.to_datetime(df["zeit"], unit='s')
df.zeit = df.zeit.dt.date
df.sort_values('zeit', inplace = True)
df = df.reset_index(drop=True)
df.head(2)
This gives the test DataFrame df
:
Let's group the data:
data = pd.DataFrame(df.groupby('who')['bytes'].sum())
data.reset_index(level=0, inplace=True)
data.sort_values(by="bytes", inplace=True)
data.head(2)
Now, create the SelectionRangeSlider
that is to be used to filter and update the barchart.
%%opts Bars [width=800 height=400 tools=['hover']]
def view2(v):
x = df[(df.zeit > r2.value[0].date()) & (df.zeit < r2.value[1].date())]
data = pd.DataFrame(x.groupby('who')['bytes'].sum())
data.sort_values(by="bytes", inplace=True)
data.reset_index(inplace=True)
display(hv.Bars(data, kdims=['who'], vdims=['bytes']))
r2 = widgets.SelectionRangeSlider(options = options, index = index, description = 'Test')
widgets.interactive(view2, v=r2)
(I have already created an issue on github for the slider not displaying the label correctly, https://github.com/jupyter-widgets/ipywidgets/issues/1759)
Problems that persist:
the image width and size collapse to default after first update (is there a way to give
%%opts
as argument tohv.Bars
?)the y-Scale should remain constant (i.e. from 0 to 150 for all updates)
is there any optimization possible concerning speed of updates?
Thanks for any help.