
I'm following this great tutorial to play a little bit with Bokeh.

Basically, I have a figure with two independent lines added to it. Everything renders properly, but when I try to update the plot nothing happens, even though I have checked that the ColumnDataSource contains the new values.

I run it with the command: bokeh serve --show my_app

Here is how I create my figure:

src_p6 = make_dataset(["select_a", "select_b"])
p6 = make_plot(src_p6)
select_selection = CheckboxGroup(labels=["select_a", "select_b"], active = [0, 1])
select_selection.on_change('active', update)
controls = WidgetBox(select_selection)
curdoc().add_root(column(controls, p6, width=1200))

def make_dataset(select_list):
  if 'select_a' in select_list and 'select_b' in select_list:
    tmp = pd.DataFrame({'time': df["time"], 
                       'a': df["a"], 
                       'b': df["b"]
                       })
  elif 'select_a' in select_list and 'select_b' not in select_list:
    tmp = pd.DataFrame({'time': df["time"], 
                       'a': df["a"]
                       })
  elif 'select_a' not in select_list and 'select_b' in select_list:
    tmp = pd.DataFrame({'time': df["time"], 
                       'b': df["b"]
                       })
  else:
    tmp = pd.DataFrame({'time': df["time"]
                       })

  src = ColumnDataSource(tmp)

  return src

def make_plot(plot_src):
  p = figure(plot_width=1000, plot_height=600, 
           title="Line x2 with hover and update",
           x_axis_label='Time', 
           y_axis_label='Values'
          )

  hover_content = [("Time", "@time")]

  if 'a' in plot_src.data:
    p.line(x='time', y='a', source=plot_src, legend="A", line_color="blue")
    hover_content.append(("A", "@a"))
  if 'b' in plot_src.data:
    p.line(x='time', y='b', source=plot_src, legend="B", line_color="red")
    hover_content.append(("B", "@b"))

  p.add_tools(HoverTool(tooltips=hover_content))

  return p

def update(attr, old, new):
  print(src_p6.data)

  select_to_plot = [select_selection.labels[i] for i in select_selection.active]

  new_src = make_dataset(select_to_plot)

  src_p6.data = new_src.data

  print("**********************")
  print(src_p6.data) # I see here that the data are well updated compared to the first print

My incoming data is JSON and looks like this:

# {"data":[{"time":0,"a":123,"b":123},{"time":1,"a":456,"b":456},{"time":2,"a":789,"b":789}]}
# data = json.load(data_file, encoding='utf-8')
# df = pd.io.json.json_normalize(data['data'])

Thank you for your insights

ZazOufUmI

2 Answers


This will not function correctly:

src_p6.data = new_src.data

The ColumnDataSource is one of the most complicated objects in Bokeh. For example, the .data object on a CDS is not a plain Python dict; it has lots of special instrumentation to make things like efficient streaming possible. But it is also tied to the CDS it was created on. Ripping the .data out of one CDS and assigning it to another is not going to work. We probably need to find a way to make that complain; I am just not sure how, offhand.

In any case, you need to assign .data from a plain Python dict, as all the examples and demonstrations do:

src_p6.data = dict(...)

For your specific code, that probably means having make_dataset just return the dicts it creates directly, instead of putting them in DataFrames and then making a CDS out of them.
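As a minimal sketch of that suggestion (not the asker's final code), the function below builds a plain dict of equal-length columns straight from records shaped like the JSON sample in the question. The extra `records` parameter and the list-comprehension approach are assumptions made for illustration, and the sketch deliberately avoids importing Bokeh or pandas so it runs on its own.

```python
import json

# Sample payload in the shape shown in the question.
raw = ('{"data":[{"time":0,"a":123,"b":123},'
       '{"time":1,"a":456,"b":456},'
       '{"time":2,"a":789,"b":789}]}')
records = json.loads(raw)["data"]


def make_dataset(select_list, records):
    """Hypothetical rewrite: return a plain dict mapping column names to lists."""
    data = {"time": [r["time"] for r in records]}
    if "select_a" in select_list:
        data["a"] = [r["a"] for r in records]
    if "select_b" in select_list:
        data["b"] = [r["b"] for r in records]
    return data


new_data = make_dataset(["select_a"], records)
# In the Bokeh app, the callback would then assign the dict itself:
#     src_p6.data = new_data
print(new_data)
```

Note that this only illustrates the shape of the dict. Dropping a column that an existing glyph still references may not hide that glyph, which is the problem the second answer runs into.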

bigreddot
  • So it is creating a DataFrame with pandas that is breaking everything? Given my JSON input (an array of objects), what should the input for the CDS be? Sorry, I'm pretty new to Python – ZazOufUmI Jul 23 '18 at 16:27
  • No it's not the pandas part that is a problem, making a DataFrame in this case is probably unnecessary but harmless. The problem is making one CDS, then copying its `.data` to a different CDS. – bigreddot Jul 23 '18 at 16:31
  • Ok ! So now `make_dataset` returns the variable `tmp`. I create the `src_p6` like this : `src_p6 = ColumnDataSource(make_dataset(.....))`. For `update` I have now : `src_p6.data = new_src` But when I try to update, I got the error message : `ValueError('expected an element of ColumnData(String, Seq(Any)), got it_looks_like_the_dict_formatted_into_string` – ZazOufUmI Jul 23 '18 at 16:39
  • `.data` needs to be assigned from a plain Python dict, you can return the dicts you are already creating, or you can return DataFrames, but if you do the latter, you need to convert back to a dict first before you assign to `.data`. – bigreddot Jul 23 '18 at 16:43
  • Ok, I get the idea now. The last thing is that I can't work out how the `dict` I assign to `.data` for the update should be formatted. I have tried a lot of possibilities without success so far, e.g. `src_p6.data = new_src.to_dict('list')` or `src_p6.data = new_src.to_dict()` or `src_p6.data = {"time": 0, "a": 1}` and more, without understanding why they fail. I'm sorry if I'm missing something easy. Thank you very much for your help and your time. – ZazOufUmI Jul 23 '18 at 17:17
  • The dict should map string keys to columns of equal length. Columns can be: Python lists, NumPy Arrays, Pandas Series... most any concrete iterable sequence type will do. The dicts you are passing to `pd.DataFrame` are already the right format. As I said, you could just return those directly. – bigreddot Jul 23 '18 at 17:24
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/176587/discussion-between-stephane-j-and-bigreddot). – ZazOufUmI Jul 23 '18 at 17:37
  • I don't use SO chat. I am already over-taxed monitoring the mailing list, GH, Gitter, and SO questions. Please feel free to come by the Bokeh public mailing list: https://groups.google.com/a/continuum.io/forum/#!forum/bokeh – bigreddot Jul 23 '18 at 17:52

First of all thanks to @bigreddot for his time and guidance.

One of my biggest problems was that I didn't actually want to update the values but rather just show or hide a line, so simply removing its column from the source did not work.

Adding the lines in my make_plot function behind an if statement doesn't work either, because make_plot is only called once, when the plot is first created. An update changes the values shown on the figure but does not rebuild everything from scratch. So if you start with only one line, I don't know how a second line would be created later, if that is even possible.


I started by simplifying my make_dataset function so that it only returns a simple Python dict:

tmp = dict(time=df["time"], a=df["a"], b=df["b"])

When I wanted to remove a line, I replaced its column with an empty array, even though there are better solutions (I was just playing with Bokeh here), such as Line ON/OFF or an interactive legend:

empty = np.full(len(df["time"]), np.nan)  # all-NaN column, same length as "time"
tmp = dict(time=df["time"], a=df["a"], b=empty)

When I first create my plot, I do:

src_p6 = ColumnDataSource(data=make_dataset(["select_a", "select_b"]))
p6 = make_plot(src_p6)

And the update function updates the .data of the ColumnDataSource with a basic Python dict:

new_src = make_dataset(select_to_plot)
src_p6.data = new_src
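
The placeholder-column idea can be sketched without NumPy or Bokeh as follows. This is only an illustration under assumptions: `make_masked_dataset` and its `columns` parameter are hypothetical names, and `math.nan` stands in for the empty array. The point is that every key stays in the dict, so the glyphs created in make_plot always find their columns, while an unselected series holds only NaN values.

```python
import math


def make_masked_dataset(select_list, columns):
    """Hypothetical helper: keep every key, blank out unselected series with NaN."""
    n = len(columns["time"])
    data = {"time": list(columns["time"])}
    for label, key in (("select_a", "a"), ("select_b", "b")):
        if label in select_list:
            data[key] = list(columns[key])
        else:
            data[key] = [math.nan] * n  # placeholder column of the same length
    return data


columns = {"time": [0, 1, 2], "a": [123, 456, 789], "b": [123, 456, 789]}
masked = make_masked_dataset(["select_a"], columns)
# In the Bokeh app, the callback would assign this dict: src_p6.data = masked
print(masked)
```

Bokeh glyph renderers also expose a `visible` property, so toggling that from the callback is probably closer to the better solutions mentioned above than rewriting the data.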
ZazOufUmI