
I have a scatterplot made with datashader and want to incorporate time into the plot, potentially by using color.

Currently, I have:

import numpy as np
import pandas as pd
import seaborn as sns

date_values = ['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04']
result = []
for d in date_values:
    # 10,000 random points per date
    df = pd.DataFrame(np.random.randn(10000, 2), columns=['value_foo', 'value_bar'])
    df['dt'] = pd.to_datetime(d)
    result.append(df)

df = pd.concat(result)
display(df.head())

import holoviews as hv
import holoviews.operation.datashader as hd
hv.extension("bokeh", "matplotlib") 

import datashader as ds
import datashader.transfer_functions as tf


cvs = ds.Canvas().points(df, 'value_foo', 'value_bar')
from colorcet import fire
#tf.set_background(tf.shade(cvs, cmap=fire),"black")
tf.shade(cvs)

#sns.jointplot(x="value_foo", y="value_bar", data=df, hue='dt')

This gives: [image]

However, the different dates are now indistinguishable. How can I include the date information (for example, using color) when plotting?

Georg Heiler

1 Answer

Datashader can colorize using any categorical column. Here, you have only four distinct dates, which is few enough to use directly as categories, but if you have a lot of dates, you'll first want to bin them into a suitable set of date ranges (e.g. fewer than 256 total values, if you use a 256-color colormap).
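As a rough illustration of the binning step, here is a minimal sketch that groups daily timestamps into calendar months with pandas. The year of daily data and the column name `dt_bin` are assumptions made for the example; any coarser or finer period would work the same way.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one random point per day of 2020
rng = np.random.default_rng(0)
df = pd.DataFrame({"dt": pd.date_range("2020-01-01", "2020-12-31", freq="D")})
df["value_foo"] = rng.standard_normal(len(df))
df["value_bar"] = rng.standard_normal(len(df))

# Bin the dates by calendar month, then store the bins as a pandas categorical
df["dt_bin"] = df["dt"].dt.to_period("M").astype(str).astype("category")

n_bins = df["dt_bin"].cat.categories.size
print(n_bins, "bins")
```

The resulting `dt_bin` column has far fewer distinct values than the raw dates, so it fits within a limited set of colors.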

Either way, once you have a categorical column c (a pandas categorical, i.e. converted with .astype("category")), pass agg=ds.count_cat('c') to your .points() call, and you should get a plot colorized by date.

See the 'pickup_hour' plot in https://examples.pyviz.org/nyc_taxi/ for an example.

James A. Bednar
  • `input must be categorical` - even when using string as datatype. But when using pandas categorical it fails as well. – Georg Heiler Apr 20 '20 at 09:02
  • Not sure what you mean by failing in this case. Yes, you do need `.astype("category")` on that column, and probably a better example is "aggc" at https://datashader.org/getting_started/Pipeline.html. If you can run that example successfully, then presumably you can make the above example work, as it's quite similar. – James A. Bednar Apr 20 '20 at 18:54
  • Ok - one step further. Now I need to fix: ` Insufficient colors provided (22) for the categorical fields available (63)` – Georg Heiler Apr 20 '20 at 19:02
  • https://github.com/holoviz/datashader/issues/767 is getting this solved - sort of. But now a nice legend is missing which explains the mapping of category and color. – Georg Heiler Apr 20 '20 at 19:14
  • We're working to have proper support for color keys like that for Bokeh-rendered Datashader categorical plots in HoloViews, but it will probably be weeks or months before we get to that, as we have a lot of other similar but easier tasks to work on first. Meanwhile, http://holoviews.org/user_guide/Large_Data.html shows how to fake a legend like that. – James A. Bednar Apr 22 '20 at 01:03
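Regarding the `Insufficient colors provided (22) for the categorical fields available (63)` error mentioned in the comments: one possible workaround is to supply a color key with at least as many colors as categories. The sketch below builds such a key with evenly spaced hues using only the standard library. The helper `make_color_key` and the category names `bin_0` ... `bin_62` are hypothetical, chosen only to match the 63 categories in the error message.

```python
import colorsys

def make_color_key(categories, saturation=0.8, value=0.9):
    """Map each category to a hex color with evenly spaced hues (hypothetical helper)."""
    n = len(categories)
    key = {}
    for i, cat in enumerate(categories):
        r, g, b = colorsys.hsv_to_rgb(i / n, saturation, value)
        key[cat] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    return key

# Hypothetical category names standing in for 63 date bins
categories = [f"bin_{i}" for i in range(63)]
color_key = make_color_key(categories)
```

A dictionary like this can be passed as `tf.shade(agg, color_key=color_key)`, provided its keys match the categories of the column used in `ds.count_cat`. With this many categories, neighboring hues are hard to tell apart, so coarser bins may give a more readable plot.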