Conditional formatting in Plotly

Question

This question is about how to do conditional formatting in Plotly.

Instances where this might be needed:

Scatter plots where points need to be colored (e.g. with a rainbow colorscale) as a function of two variables;
Interactive charts where the coloring depends on the parameter values;
Histograms where parts of the histogram need to be colored differently.

Here I will ask specifically about histograms.

Take the following data:

data = np.random.normal(size=1000)

I want a histogram in which values greater than 0 are drawn in a different color.

A simple solution is to split the data at zero and overlay two histograms:

hist1 = go.Histogram(x=data[data<0], 
                    opacity=0.75, 
                    histnorm='density',
                    showlegend=False,
                    )
hist2 = go.Histogram(x=data[data>=0], 
                    opacity=0.75, 
                    histnorm='density',
                    showlegend=False,
                    )
layout = go.Layout(barmode='overlay')
fig = go.Figure(data=[hist1, hist2], layout=layout)
iplot(fig, show_link=False)

There are several problems with this solution:

1. The default bin sizes differ between the two histograms, which causes overlap around zero.
2. If I want histnorm = 'probability density', each histogram is normalized separately, so the two halves look disproportionate.
3. Binning starts from the left for both histograms, so the last bin of the below-zero histogram may extend past zero.

Is there a better way to do this?

UPDATE

OK, I can solve (1) and (3) using xbins:

hist1 = go.Histogram(x=data[data>=0], 
                    opacity=0.75, 
                    xbins=dict(
                        start=0,
                        end=4,
                        size=0.12),
                    histnorm='density',
                    showlegend=False,
                    )
hist2 = go.Histogram(x=data[data<0], 
                    opacity=0.75, 
                    xbins=dict(
                        start=-0.12*33,
                        end=0,
                        size=0.12),
                    histnorm='density',
                    showlegend=False,
                    )
layout = go.Layout(barmode='overlay')
fig = go.Figure(data=[hist1, hist2], layout=layout)
iplot(fig, show_link=False)
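
The hard-coded start of -0.12*33 only fits this particular sample. Here is a minimal sketch of deriving both xbins dictionaries from the data instead, assuming the sample has values on both sides of zero. The names n_neg, n_pos, xbins_neg and xbins_pos and the seed are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)   # hypothetical seed, only for reproducibility
data = rng.normal(size=1000)

size = 0.12                      # same bin width as in the snippet above

# Number of whole bins needed on each side of zero to cover the sample
n_neg = int(np.ceil(-data.min() / size))
n_pos = int(np.ceil(data.max() / size))

# Both bin grids share an edge at exactly 0 and use the same width
xbins_neg = dict(start=-n_neg * size, end=0, size=size)
xbins_pos = dict(start=0, end=n_pos * size, size=size)
```

These dictionaries could then be passed as xbins=xbins_neg and xbins=xbins_pos to the two go.Histogram traces.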

But, how do I solve the second issue?

How did my suggestion work out for you? – vestland, Sep 03 '21 at 07:09

vestland · Answer 1 · 2019-12-17T13:42:07.470

For the...

If I want to have histnorm = 'probability density' the resulting plots "normalize" each of the separate histograms, so they will look disproportionate.

... part it seems you will have to normalize the entire sample before you split it in two different histograms. This means that what you should do is to make an area chart with multiple colors under a single trace. But the suggested solution to this unfortunately seems to be to assign different colors to two traces with...

df_pos = df.where(df < 0, 0)
df_neg = df.where(df > 0, 0)

... which of course brings you right back to where you are.
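
To make the normalization problem concrete, here is a small NumPy check (a sketch, not part of the original approach; the bin grid and the seed are arbitrary assumptions). Splitting first and then normalizing gives each half an area of 1, while normalizing the whole sample first gives each half an area equal to its share of the data:

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed
data = rng.normal(size=1000)

# 100 equal-width bins from -6 to 6 with an edge exactly at 0
edges = np.arange(-50, 51) * 0.12
widths = np.diff(edges)
left = edges[:-1]

# (a) split first, then normalize each half separately
dens_neg, _ = np.histogram(data[data < 0], bins=edges, density=True)
dens_pos, _ = np.histogram(data[data >= 0], bins=edges, density=True)
area_neg_a = (dens_neg * widths).sum()
area_pos_a = (dens_pos * widths).sum()

# (b) normalize the whole sample once, then split the bins at zero
dens, _ = np.histogram(data, bins=edges, density=True)
neg = left < 0
area_neg_b = (dens[neg] * widths[neg]).sum()
area_pos_b = (dens[~neg] * widths[~neg]).sum()

print(area_neg_a, area_pos_a)   # each half integrates to 1 on its own
print(area_neg_b, area_pos_b)   # the two halves sum to 1 together
```

Only variant (b) keeps the two halves in proportion, which is why the binning has to happen before the split.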

So to get what you want, it seems you'll have to step outside go.Histogram: sort out the binning and normalization first, and then draw the result with a combination of area charts or with a bar chart. To my understanding, this takes care of all three bullet points. Here's a suggestion on how to do that:

Plot:

Code:

# imports
import plotly.graph_objects as go
from plotly.offline import iplot
import pandas as pd
import numpy as np

# theme
import plotly.io as pio
#pio.templates
#pio.templates.default = "plotly_white"
pio.templates.default = "none"

# Some sample data
np.random.seed(123)
x = np.random.normal(0, 1, 1000)

# numpy binning
binned = np.histogram(x, bins=30, density=True)

# retain some info about the binning
yvals=binned[0]
x_last = binned[1][-1]
xvals=binned[1][:-1]

# organize binned data in a pandas dataframe
df_bin=pd.DataFrame(dict(x=xvals, y=yvals))
# rows that fail the condition become NaN, which plotly leaves as gaps
df_bin_neg = df_bin.where(df_bin['x'] < 0)
df_bin_pos = df_bin.where(df_bin['x'] > 0)

# set up plotly figure
fig=go.Figure()

# negative x
fig.add_trace(go.Scatter(
    x=df_bin_neg['x'],
    y=df_bin_neg['y'],
    name="negative X",
    hoverinfo='all',
    fill='tozeroy',
    #fillcolor='#ff7f0e',
    fillcolor='rgba(255, 103, 0, 0.7)',

    line=dict(color = 'rgba(0, 0, 0, 0)', shape='hvh')
))

# positive x
fig.add_trace(go.Scatter(
    x=df_bin_pos['x'],
    y=df_bin_pos['y'],
    name="positive X",
    hoverinfo='all',
    fill='tozeroy',
    #opacity=0.2,
    #fillcolor='#ff7f0e',
    #fillcolor='#1f77b4',
    fillcolor='rgba(131, 149, 193, 0.9)',
    line=dict(color = 'rgba(0, 0, 0, 0)', shape='hvh')
))

# adjust layout to ensure max values are included
ymax = np.max([df_bin_neg['y'].max(), df_bin_pos['y'].max()])
fig.update_layout(yaxis=dict(range=[0,ymax+0.1]))

# adjust layout to match OP's original
fig.update_xaxes(showline=True, linewidth=1, linecolor='black', mirror=False, zeroline=False, showgrid=False)
fig.update_yaxes(showline=False)#, linewidth=2, linecolor='black', mirror=True)

fig.show()
