I am using an embedded Bokeh app in Jupyter to label sections of time series. Let's say we have the following example dataframe:
                 Time         Y Label
0 2018-02-13 13:14:05  0.401028     a
1 2018-02-13 13:30:46  0.900101     a
2 2018-02-13 13:40:06 -0.648143     a
3 2018-02-14 16:33:27  1.111675     a
4 2018-03-13 11:43:16 -0.986025     a
where Time is datetime64[ns], Y is float64 and Label is of type object.
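For reproducibility, the frame can be rebuilt roughly like this. This is a minimal sketch: the name df_test is the one used in the app code, and the explicit dtype casts are only there to match the dtypes described above.

```python
import pandas as pd

# Rebuild the example frame shown above.
df_test = pd.DataFrame({
    # Cast explicitly so Time is datetime64[ns] regardless of pandas version.
    "Time": pd.to_datetime([
        "2018-02-13 13:14:05",
        "2018-02-13 13:30:46",
        "2018-02-13 13:40:06",
        "2018-02-14 16:33:27",
        "2018-03-13 11:43:16",
    ]).astype("datetime64[ns]"),
    "Y": [0.401028, 0.900101, -0.648143, 1.111675, -0.986025],
    # Keep Label as plain Python strings (object dtype).
    "Label": pd.Series(["a"] * 5, dtype=object),
})

print(df_test.dtypes)
```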
Now I use the following Bokeh app to change the entries of Label to the value of a text input; the callback is triggered by a button click.
import pandas as pd
from bokeh.layouts import column
from bokeh.models import Button, ColumnDataSource, TextInput
from bokeh.plotting import figure, show

def modify_doc(doc):
    p = figure(tools="pan,box_zoom,wheel_zoom,reset,save,xbox_select")
    source = ColumnDataSource(df_test)
    p.line(x="index", y="Y", source=source)
    # Invisible circles so that xbox_select has glyphs to select
    p.circle(x="index", y="Y", source=source, alpha=0)

    def callback():
        global list_new
        list_new = []
        inds = source.selected.indices
        # Write the entered label into every selected row
        for j in inds:
            source.data["Label"][j] = label_input.value.strip()
        list_new.append(pd.DataFrame(source.data))

    label_input = TextInput(title="Label")
    button = Button(label="Label Data")
    button.on_click(callback)
    layout = column(p, label_input, button)
    doc.add_root(layout)

show(modify_doc)
Don't mind list_new; I need it because I use multiple time series plots and ColumnDataSource objects.
After the callback I get the expected Label output:
  Label          Time         Y  index
0     a  1.518528e+12  0.401028      0
1     a  1.518529e+12  0.900101      1
2     b  1.518529e+12 -0.648143      2
3     b  1.518626e+12  1.111675      3
4     b  1.520941e+12 -0.986025      4
But why does Time get converted to float? I know how to reconstruct the timestamps with datetime.datetime.utcfromtimestamp() or by matching the indices, but how can I change the callback to keep the original entries in Time?
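For reference, this is roughly how I reconstruct the timestamps at the moment. The floats look like milliseconds since the Unix epoch (1.518528e+12 ms falls on 2018-02-13), so this sketch assumes that unit. The names times, ms and df_out are made up for illustration, and the float column is simulated here rather than taken from a running app.

```python
import pandas as pd

# Original timestamps from the example frame.
times = pd.to_datetime([
    "2018-02-13 13:14:05",
    "2018-02-13 13:30:46",
    "2018-02-13 13:40:06",
    "2018-02-14 16:33:27",
    "2018-03-13 11:43:16",
])

# Simulate the float column seen after the callback
# (assumption: milliseconds since 1970-01-01).
ms = (times - pd.Timestamp("1970-01-01")) / pd.Timedelta(milliseconds=1)

# Stand-in for pd.DataFrame(source.data) after labeling.
df_out = pd.DataFrame({"Label": ["a", "a", "b", "b", "b"], "Time": ms})

# Convert the floats back to timestamps.
df_out["Time"] = pd.to_datetime(df_out["Time"], unit="ms")
print(df_out)
```

With datetime.datetime.utcfromtimestamp() the values would first have to be divided by 1000, since that function expects seconds.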