I have the following code to produce a heatmap using pandas and holoviews:

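import numpy as np
import pandas as pd
import holoviews as hv
from holoviews import opts

hv.extension('bokeh')

cols = ['source','sink','net','avg']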
data = [['13002','13002',5.0,2.161478e+06],
['13002','13003',5.0,6.959788e+04],
['13002','23002',5.0,4.233500e+03],
['13002','33006',5.0,8.104000e+03],
['13002','43002',5.0,9.374625e+05],
['13002','43004',5.0,2.865538e+03],
['13002','53001',5.0,1.737890e+05],
['13002','53008',5.0,3.693100e+04],
['13002','53017',5.0,4.541660e+05],
['13002','unk',23.0,1.205498e+05],
['13003','13002',23.0,2.275744e+05],
['13003','43002',23.0,3.250252e+05],
['13003','43003',23.0,4.248433e+04],
['13003','43008',23.0,7.541023e+04],
['13003','53012',23.0,5.000000e+02],
['13003','unk',23.0,5.247462e+03],
['13005','43004',23.0,2.355648e+05],
['23002','13002',23.0,1.317475e+05],
['23002','13003',23.0,1.000000e+04],
['23002','53008',23.0,4.716667e+03]]

df = pd.DataFrame(data, columns=cols)

hm = hv.HeatMap(data, kdims = ['source','sink']
                , vdims =['net', 'avg']).sort(['sink','source'])
layout = hv.Layout([hm])
layout.opts(
    opts.HeatMap(xticks=None, tools=['hover'], xrotation=90)
)

It produces the following:

[image: resulting heatmap]

Note that the x-axis ('source') is not ordered properly. I tried using 'sort()', but it seems to sort only one axis or the other. How can I get both axes sorted properly in a HoloViews heatmap?

Best workaround so far:

I can get around the problem as follows:

df = pd.DataFrame(data, columns=cols)
temp = pd.Series(df.sink.unique(),name='sink').sort_values()
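# Give every source the full, sorted set of sinks. The source label is taken
# from the group key (x.name) rather than forward-filled, because the outer
# merge sorts by sink and can put rows with a missing source first in a group.
df = df.groupby('source').apply(
    lambda x: x.merge(temp, how='outer', on='sink').assign(source=x.name))
df = df.fillna(0).droplevel([0])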

hm = hv.HeatMap(df, kdims = ['source', 'sink']
                , vdims =['net', 'avg']).sort()
layout = hv.Layout([hm])
layout.opts(
    opts.HeatMap(xticks=None, tools=['hover'], xrotation=90)
)
MikeB2019x

3 Answers

To get this kind of independent sorting, you need to specify the order manually. You can do this either beforehand, by defining each Dimension with explicit values, or afterwards, by calling redim.values on the element.

Defining Dimension beforehand:

# np.unique sorts the unique values by default
source = hv.Dimension("source", values=np.unique(df["source"]))
sink = hv.Dimension("sink", values=np.unique(df["sink"]))

(hv.HeatMap(df, kdims = [source, sink], vdims =['net', 'avg'])
 .opts(xticks=None, tools=['hover'], xrotation=90)
)

Using redim.values to set the Dimension values afterwards:

(hv.HeatMap(data, kdims = ["source", "sink"], vdims =['net', 'avg'])
 .opts(xticks=None, tools=['hover'], xrotation=90)
 .redim.values(
     sink=np.unique(df["sink"]),
     source=np.unique(df["source"]))
)

In either case, you end up with a plot looking like this: [image: resulting heatmap]
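
One caveat: np.unique orders strings lexicographically, so labels of different widths may not come out in the order you expect (for example, '9001' would sort after '13002'). Below is a minimal sketch of a custom ordering, assuming that numeric-looking labels should sort numerically and that anything else (such as 'unk') should go last. `axis_order` is a hypothetical helper, not part of HoloViews, and '9001' is an invented label used only to show the difference.

```python
def axis_order(labels):
    """Return the unique labels: numeric-looking ones first, in numeric order,
    followed by the remaining labels (e.g. 'unk') in alphabetical order."""
    def key(label):
        # (0, number, "") for decimal-only strings, (1, 0, label) otherwise
        return (0, int(label), "") if label.isdecimal() else (1, 0, label)
    return sorted(set(labels), key=key)


# A few (source, sink) pairs in the style of the question's data;
# '9001' is an invented label, not taken from the question.
pairs = [("13002", "unk"), ("9001", "13003"), ("23002", "13002"), ("13002", "13003")]

source_order = axis_order(src for src, _ in pairs)
sink_order = axis_order(snk for _, snk in pairs)
print(source_order)  # ['9001', '13002', '23002']
print(sink_order)    # ['13002', '13003', 'unk']
```

The resulting lists can be passed as `values=` to `hv.Dimension`, or to `.redim.values(source=source_order, sink=sink_order)`, in place of the `np.unique(...)` calls.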

Cameron Riddell

A more appropriate solution is to use the approach from @Riddell's answer, but for the re-dimension step use:

.redim.values(x=temp['x'].sort_values(), y=temp['y'].sort_values())
MikeB2019x

For the sake of completeness, I would strongly advise using a hook to modify the x_range of the underlying Bokeh figure.

cols = ['source','sink','net','avg']
data = [['13002','13002',5.0,2.161478e+06],
['13002','13003',5.0,6.959788e+04],
['13002','23002',5.0,4.233500e+03],
['13002','33006',5.0,8.104000e+03],
['13002','43002',5.0,9.374625e+05],
['13002','43004',5.0,2.865538e+03],
['13002','53001',5.0,1.737890e+05],
['13002','53008',5.0,3.693100e+04],
['13002','53017',5.0,4.541660e+05],
['13002','unk',23.0,1.205498e+05],
['13003','13002',23.0,2.275744e+05],
['13003','43002',23.0,3.250252e+05],
['13003','43003',23.0,4.248433e+04],
['13003','43008',23.0,7.541023e+04],
['13003','53012',23.0,5.000000e+02],
['13003','unk',23.0,5.247462e+03],
['13005','43004',23.0,2.355648e+05],
['23002','13002',23.0,1.317475e+05],
['23002','13003',23.0,1.000000e+04],
['23002','53008',23.0,4.716667e+03]]

df = pd.DataFrame(data, columns=cols)

def hook(plot, element):
    # Reorder the categorical x-axis on the underlying Bokeh figure
    plot.handles['x_range'].factors = sorted(df['source'].unique())

hm = hv.HeatMap(data, kdims = ['source','sink']
                , vdims =['net', 'avg']).opts(hooks=[hook])
layout = hv.Layout([hm])

layout.opts(
    hv.opts.HeatMap(xticks=None, tools=['hover'], xrotation=90)
)

Depending on the size of your data, modifying the Bokeh x_range may be much quicker than reshaping the data with pandas, and the code is arguably clearer since it adds only two lines.

The downside is that the hook is Bokeh-specific, so it will not work if you want to use another renderer such as matplotlib.
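
The hook above only reorders the x-axis. Here is a minimal sketch of the same idea applied to both axes, assuming the plot's `handles` also expose a `y_range` whose `factors` can be set in the same way. `make_sort_hook` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for a rendered plot so that the sketch runs without HoloViews or Bokeh.

```python
from types import SimpleNamespace


def make_sort_hook(x_factors, y_factors):
    """Build a HoloViews-style hook that fixes the category order on both axes."""
    def hook(plot, element):
        # Assumption: with the Bokeh backend, both categorical axes are backed
        # by range objects whose `factors` list determines the display order.
        plot.handles['x_range'].factors = list(x_factors)
        plot.handles['y_range'].factors = list(y_factors)
    return hook


# Stand-in for a rendered plot (not a real HoloViews object).
fake_plot = SimpleNamespace(handles={
    'x_range': SimpleNamespace(factors=['23002', '13002', '13003']),
    'y_range': SimpleNamespace(factors=['unk', '13003', '13002']),
})

hook = make_sort_hook(sorted(fake_plot.handles['x_range'].factors),
                      sorted(fake_plot.handles['y_range'].factors))
hook(fake_plot, element=None)
print(fake_plot.handles['x_range'].factors)  # ['13002', '13003', '23002']
print(fake_plot.handles['y_range'].factors)  # ['13002', '13003', 'unk']
```

With real data, it would be attached as `.opts(hooks=[make_sort_hook(sorted(df['source'].unique()), sorted(df['sink'].unique()))])`.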

hyamanieu