I am making a series of tiled figures, each with 4 subplots. First I make the individual figures:

import pandas as pd
import plotly.express as px


def create_plot(list1, list2):
    '''
    Given two lists of metric values, plot corresponding distributions and
    return plot object.

    INPUT
    2 lists
    RETURN
    1 plotly figure object
    '''
    # grab labels (TP/FP or samplename) to make column below
    # uses pop() as string label needs separating from int values
    # (note: pop() modifies the caller's lists in place)
    label1 = list1.pop(0)
    label2 = list2.pop(0)
    # make combined df column
    values = list1 + list2
    # calculate centiles for each entry in the arrays (shown only in the
    # hover text) and turn into column for dataframe (same order as values)
    centiles = list(
        calculate_centiles(list1)
        ) + list(
            calculate_centiles(list2)
            )
    # make TPFP column for dataframe
    TPFP = ([label1] * len(list1)) + ([label2] * len(list2))
    # make dataframe
    df = pd.DataFrame(
        {'values': values, 'TPFP': TPFP, 'centiles': centiles}
        )
    fig = px.histogram(
        df, x='values', color='TPFP',
        hover_data=[df.columns[2]], marginal='rug', barmode='overlay'
        )
    # set format for hovertext using hovertemplate (even index = histogram,
    # odd index = rug)
    for i, trace in enumerate(fig['data']):
        group = trace['legendgroup']
        if i % 2 == 0:
            trace['hovertemplate'] = (
                f'True/False Positive={group}<br>'
                'Bin=%{x}<br>Count=%{y}<extra></extra>'
                )
        else:
            trace['hovertemplate'] = (
                '<br>Metric value=%{x}<br>'
                'Centile=%{customdata[0]}<br><extra></extra>'
                )
    return fig
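
As a side note on the even/odd index rule used in the loop above: a possibly more robust alternative is to pick the template from the trace type rather than its position. The sketch below is only an illustration; hovertemplate_for is a hypothetical helper, plain dicts stand in for the entries of fig['data'], and it assumes the main traces report their type as 'histogram' while anything else is the rug.

```python
def hovertemplate_for(trace_type, group):
    """Return the hover template for a trace, chosen by trace type."""
    if trace_type == 'histogram':
        return (
            f'True/False Positive={group}<br>'
            'Bin=%{x}<br>Count=%{y}<extra></extra>'
        )
    # assumption: anything that is not a histogram is the marginal rug trace
    return (
        '<br>Metric value=%{x}<br>'
        'Centile=%{customdata[0]}<br><extra></extra>'
    )


# Plain dicts stand in for the entries of fig['data'] in this sketch
traces = [
    {'type': 'histogram', 'legendgroup': 'TP'},
    {'type': 'box', 'legendgroup': 'TP'},
]
for trace in traces:
    trace['hovertemplate'] = hovertemplate_for(
        trace['type'], trace['legendgroup']
    )
```

With a real figure, the same helper could be applied inside the existing loop using trace['type'] and trace['legendgroup'], so the result no longer depends on the order in which traces were added.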

Then I insert these into a multi-plot object:

from plotly.subplots import make_subplots


def make_tiled_figure(subfigs, metric):
    '''
    Take list of figures (plotly plot objects) to be combined into
    tiled image. Return single figure object with tiled subplots.

    INPUT
    1 list of plotly figure objects, 1 string
    RETURN
    1 plotly figure object
    '''
    fig = make_subplots(rows=1, cols=4, subplot_titles=[
        'SNP_HET', 'SNP_HOM', 'INDEL_HET', 'INDEL_HOM'])
    # decide on position and add subfigures to plot
    for i, subfig in enumerate(subfigs):
        if subfig:
            for trace in subfig.data:
                fig.add_trace(trace, row=1, col=i+1)
                fig.update_layout(hovermode='x unified')
    # specify plot size and title
    fig.update_layout(
        height=500, width=1800, title_text=metric, showlegend=False
        )
    return fig

When I open the figures individually (fig.show()), the hover text works perfectly, showing all the info specified (including the counts for each histogram bin). However, when I do the same for the tiled figure, the hover text is broken: the 'Count' field shows either the literal placeholder '%{y}' (without fetching the actual number) or a value from the TPFP column.

Why do the histogram counts go missing? What am I missing?

EDIT: I've added my solution as an answer, but for more context I am adding an image of what it looked like before.

Wrong output (count=%{y} and mashed rug plot)

The input to create_plot() was just a pair of lists in the format ['TP',27,284,74,483,374,493,394,12,10,902,83] etc., where the string label in position 0 is either TP or FP (True/False Positive) in this iteration. The resulting figures were the input to make_tiled_figure(), along with a single string representing a QC metric, e.g. 'DP'. The calculate_centiles() function just returns a list of numbers the same length as its input (these are displayed in the hover text), and make_subplots() is provided by plotly.subplots.
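
To make the snippets above easier to try out, here is a minimal stand-in for calculate_centiles(). It is only a sketch under stated assumptions: the real function is not shown in this post, so the percentile-rank definition used here (share of values less than or equal to each entry, scaled to 0-100) is a guess that merely satisfies the "same length as its input" description.

```python
from bisect import bisect_right


def calculate_centiles(values):
    """Hypothetical stand-in: percentile rank (0-100) of each value.

    Returns a list with the same length and order as the input.
    """
    if not values:
        return []
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right counts how many entries are <= v
    return [round(100 * bisect_right(ordered, v) / n, 1) for v in values]


# Split the label from the numbers by slicing, leaving the input list intact
raw = ['TP', 27, 284, 74, 483]
label, numbers = raw[0], raw[1:]
centiles = calculate_centiles(numbers)
```

Slicing is used here instead of pop(0) only so that the example list can be reused afterwards.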

Chris
  • Can you provide current output and reproducible data? If you can, I believe we can get a variety of comments and answers. – r-beginners Aug 20 '22 at 04:05

1 Answer

I ended up constructing a larger dataframe (passing multiple data lists to the create_plot function and concatenating them) and using px.histogram's facet_col argument to create the tiling instead. This fixed the counts issue and also moved the rug plot to separate axes (I now realise it was artificially squashed on top of the main histogram plot).

This doesn't solve the problem of data/layout being lost when traces are inserted into subplots, but it does provide the end result I was looking for. Combining facet_col with facet_row would presumably produce a 2D tiled figure.

import pandas as pd
import plotly.express as px


def create_plot(plot_list, metric):
    '''
    Given a list of plot datasets, make a tiled image of all the datasets
    as histograms.

    INPUT
    1 list of lists (each with 2 data lists and a filter label), one string
    RETURN
    1 plotly figure object
    '''
    # initialise concat lists
    values = []
    centiles = []
    tpfp = []
    subset = []
    # for each dataset, grab the labels (TP/FP - need to be separated from the
    # main data) and filter (e.g. snp_het), then append to the relevant
    # concatenated lists initialised above
    for plot in plot_list:
        plot_name = plot[-1]
        list1 = plot[0]
        list2 = plot[1]
        label1 = list1.pop(0)
        label2 = list2.pop(0)
        values = values + (list1 + list2)
        centiles = centiles + (
            list(calculate_centiles(list1)) + list(calculate_centiles(list2))
            )
        tpfp = tpfp + (([label1] * len(list1)) + ([label2] * len(list2)))
        subset = subset + ([plot_name] * len(list1 + list2))
    # make dataframe using lists above for columns
    df = pd.DataFrame(
        {'Metric Value': values, 'TPFP': tpfp, 'Centile': centiles,
         'Variant Type': subset}
        )
    # make figure (facet_col tiles the datasets based on filter subset)
    fig = px.histogram(
        df, x='Metric Value', color='TPFP', facet_col='Variant Type',
        hover_data=[df.columns[2]], marginal='rug', barmode='overlay'
        )
    # format the hovertext to display centiles only on the rug plot and counts
    # only on the histogram (rug plots have odd index in fig['data'])
    for i, trace in enumerate(fig['data']):
        group = trace['legendgroup']
        if i % 2 == 0:
            trace['hovertemplate'] = (
                f'True/False Positive={group}<br>'
                'Bin=%{x}<br>Count=%{y}<extra></extra>'
                )
        else:
            trace['hovertemplate'] = (
                '<br>Metric value=%{x}<br>'
                'Centile=%{customdata[0]}<br><extra></extra>'
                )
    # add vertical line to hover action
    fig.update_layout(hovermode='x unified')
    # specify figure dimensions and titles
    fig.update_layout(
        height=700, width=2000, title_text=metric, showlegend=False
        )
    return fig
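
One caveat about the concatenation loop above: pop(0) removes the labels from the caller's lists, so running the function a second time on the same data would treat the first number as the label. A minimal sketch of the same column-building step without that side effect is below. The names build_facet_columns and centile_func are hypothetical, the data is made up, and the identity function is used in place of calculate_centiles() purely to keep the example self-contained.

```python
def build_facet_columns(plot_list, centile_func):
    """Flatten datasets into equal-length columns for a long-format dataframe.

    Each entry of plot_list is [list1, list2, subset_name], where list1 and
    list2 start with a string label (e.g. 'TP' or 'FP') followed by numbers.
    """
    columns = {
        'Metric Value': [], 'TPFP': [], 'Centile': [], 'Variant Type': []
    }
    for list1, list2, subset_name in plot_list:
        for labelled in (list1, list2):
            # slice rather than pop(0) so the caller's lists are not modified
            label, numbers = labelled[0], labelled[1:]
            columns['Metric Value'].extend(numbers)
            columns['TPFP'].extend([label] * len(numbers))
            columns['Centile'].extend(centile_func(numbers))
            columns['Variant Type'].extend([subset_name] * len(numbers))
    return columns


# Toy data; the identity function stands in for calculate_centiles()
plot_list = [
    [['TP', 27, 284, 74], ['FP', 12, 10], 'SNP_HET'],
    [['TP', 483, 374], ['FP', 902], 'SNP_HOM'],
]
columns = build_facet_columns(plot_list, centile_func=list)
```

The resulting dict has the same keys as the dataframe in the function above, so it could be passed straight to pd.DataFrame(columns) before calling px.histogram.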

Fixed output

Chris