I have a pandas DataFrame with two columns of time series data. In my actual data, these columns are large enough that the plot is unwieldy to render without datashader. I am trying to compare events from these two time series, so I need to be able to tell which column each data point comes from. A simple working example is below. How can I get columns A and B to use different color maps?

import numpy as np
import hvplot.pandas
import pandas as pd

A = np.random.randint(10, size=10000)
B = np.random.randint(30, size=10000)
d = {'A':A,'B':B}
df = pd.DataFrame(d)

df.hvplot(kind='scatter',datashade=True, height=500, width=1000, dynspread=False)
Jeff

1 Answer

You will have to use the count_cat aggregator, which counts each category separately. For the example above, that would look like this:

import datashader as ds
df.hvplot(kind='scatter', aggregator=ds.count_cat('Variable'), datashade=True,
          height=500, width=1000)

The 'Variable' here corresponds to the default group_label that hvplot assigns to the columns. If you provided a different group_label, you would have to update the aggregator to match. However, instead of supplying an aggregator explicitly, you can also use the by keyword:

df.hvplot(kind='scatter', by='Variable', datashade=True,
          height=500, width=1000)
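
To see what the 'Variable' label refers to, it can help to picture the wide frame reshaped into long form, with one row per point and a categorical column naming the source column. The snippet below is only an illustration using pandas, not a description of what hvplot does internally; the names `long_df`, `index` and `Value` are made up for the example.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question, with a seeded generator.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'A': rng.integers(10, size=10000),
    'B': rng.integers(30, size=10000),
})

# Reshape wide -> long: one row per (index, column) pair.
# 'Variable' holds the original column name, 'Value' the data point.
long_df = (
    df.rename_axis('index')
      .reset_index()
      .melt(id_vars='index', var_name='Variable', value_name='Value')
)

# A categorical column like this is the kind of input count_cat counts over.
long_df['Variable'] = long_df['Variable'].astype('category')

print(long_df['Variable'].cat.categories.tolist())
print(long_df.head())
```

In this picture, counting per category means each pixel keeps a separate count for 'A' and for 'B', which is what allows the two columns to be colored differently.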

Once hvplot 0.3.1 is released you'll also be able to supply an explicit cmap, e.g.:

df.hvplot(kind='scatter', by='Variable', datashade=True,
          height=500, width=1000, cmap={'A': 'red', 'B': 'blue'})

philippjfr
  • This worked great, thanks. I have a follow-up question, sorry if this expands the scope a little. Since the solution doesn't allow for manually assigning colors, I can't tell which set is blue and which is red on my working dataset. I tried adding a "legend=True" parameter inside the df.hvplot command, but no legend appears. Do you know of a way to either manually specify the colors for columns, or get a legend to show up? – Jeff Jan 11 '19 at 21:24
  • Automatic legends for datashaded plots are high on our to-do list. In holoviews it is also possible to pass a dictionary as the `cmap`, but that is not currently supported in hvplot. I'll add support for that in the coming days and push a minor release out. Even then, you would have to add a legend manually by overlaying some dummy points, e.g. ``hv.Points([(0, 0, 'Variable1'), (0, 0, 'Variable2'), ...], vdims='Variable').opts(color='Variable', cmap={'Variable1': 'red', ...})``. – philippjfr Jan 13 '19 at 03:50
  • I've now amended the answer to use a simplified signature and to note that support for an explicit cmap will be added in hvplot 0.3.1, which is due to be released today. – philippjfr Jan 14 '19 at 15:58