I'm trying to think of a way to most effectively display the following analysis. I'm using Python and Plotly for other analysis and would like to stick with that.

Say I have a number of newspapers. Each newspaper has a different amount of circulation worldwide. Within that, some percentage of the circulation is from the US. And within that, some percentage is from a given state, say, California.

I'd like to have a bar graph that shows, for one newspaper:

  • Total circulation (say, 1M)
  • Of that, how much is US circulation (say, 500k)
  • Of that, how much is from California (say, 100k)

So I want a compact way to show

  1. What percentage of total is from US (50%)
  2. What percentage of US is from CA (20%)
  3. What percentage of total is from CA (10%)
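To make the three ratios concrete, here is a minimal sketch of how they could be computed with pandas. The column names are assumptions chosen to match the MWE further down, and the `pct` frame is only an illustration, not part of any existing code.

```python
import pandas as pd

# Example figures from the list above; column names are illustrative.
df = pd.DataFrame({
    'Name': ['Paper A'],
    'Total circ': [1_000_000],
    'US circ': [500_000],
    'CA': [100_000],
})

# Each percentage uses a different denominator, which is what makes
# the chart hard to label clearly.
pct = pd.DataFrame({
    'Name': df['Name'],
    'US % of total': 100 * df['US circ'] / df['Total circ'],
    'CA % of US': 100 * df['CA'] / df['US circ'],
    'CA % of total': 100 * df['CA'] / df['Total circ'],
})
print(pct)
```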

Then repeat for each newspaper and look for trends.

Plotly has a stacked bar chart which looks close, but I want to customize to specifically call out the three percentages. Each newspaper has a different total number, so a stacked bar chart normalized to 100% won't tell me the magnitude of each different newspaper.

I was thinking total %'s on the left of the bar, and US-specific %'s on the right of the bar. Or different colors?

Any advice is appreciated.

---Edit to add MWE---

import pandas as pd
import plotly.express as px

data = {'Name': ['Paper A', 'Paper B'],
        'Total circ': [1000000, 800000],
        'US circ': [500000, 200000],
        'CA': [100000, 100000]}
df = pd.DataFrame.from_dict(data)
# Non-overlapping segments, so the stacked bar adds up to the total
df['not CA'] = df['US circ'] - df['CA']
df['not US'] = df['Total circ'] - df['US circ']

fig = px.bar(df, x='Name', y=['not US', 'not CA', 'CA'], text_auto=True)
fig.show()

[enter image description here](https://ibb.co/D5123xk)

eschares
  • What have you tried so far? Where is your code? What step specifically are you stuck with? Can you provide sample data? – BeRT2me Oct 10 '22 at 16:08
  • I'm stuck with a) how to get percentages to appear, not counts and b) how to organize the graph so it's easy to read since different percentages will be referring to different totals – eschares Oct 10 '22 at 16:28
  • Does this answer your question? [Adding percentage of count to a stacked bar chart in plotly](https://stackoverflow.com/questions/65233123/adding-percentage-of-count-to-a-stacked-bar-chart-in-plotly) – m13op22 Oct 10 '22 at 17:13

1 Answer

The code below adds the composition ratios as labels to the stacked bar chart you already have. First, convert the data frame from wide to long format and compute each region's share of that newspaper's total circulation. Then keep only the regions that make up the stack (CA, not CA, not US) and build one label per row from the number of copies and its percentage. Finally, use graph objects to add one bar trace per region in a loop, passing the matching labels as the bar text.

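import plotly.graph_objects as go

# Reshape from wide to long: one row per (newspaper, region) pair.
df = df.melt(id_vars='Name', value_vars=df.columns[1:], var_name='Region', value_name='Volume')

# Share of each newspaper's total circulation. 'Total circ' is the first
# value column, so it is the first row of every 'Name' group after melting.
df['percentage'] = 100 * df['Volume'] / df.groupby('Name')['Volume'].transform('first')

# Keep only the non-overlapping segments that make up the stack.
df = df[~df['Region'].isin(['Total circ', 'US circ'])].copy()

# Label text such as '100k(10.0%)', stored per row so it stays aligned with its bar.
df['label'] = ['{}k({}%)'.format(int(v / 1000), p) for v, p in zip(df['Volume'], df['percentage'])]

fig = go.Figure()

# One trace per region; stacking the traces gives one bar per newspaper.
for r in df['Region'].unique():
    dff = df[df['Region'] == r]
    fig.add_trace(go.Bar(x=dff['Name'], y=dff['Volume'], text=dff['label'], name=r))

fig.update_layout(barmode='stack', autosize=True, height=450)

fig.show()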
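The labels above show each segment's share of the total only. If you also want the share of US circulation on the CA segment, one option is to build the label text yourself. This is a rough sketch, not a definitive approach: `two_share_label` is a hypothetical helper name, and the data frame is the original wide-format one from the question, rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Wide-format data from the question, repeated so this runs standalone.
wide = pd.DataFrame({
    'Name': ['Paper A', 'Paper B'],
    'Total circ': [1_000_000, 800_000],
    'US circ': [500_000, 200_000],
    'CA': [100_000, 100_000],
})

def two_share_label(volume, total, us):
    """Hypothetical helper: copies in thousands plus both percentages."""
    return '{:.0f}k ({:.1f}% of total, {:.1f}% of US)'.format(
        volume / 1000, 100 * volume / total, 100 * volume / us)

# One label per newspaper, in the same row order as the 'Name' column.
ca_labels = [
    two_share_label(v, t, u)
    for v, t, u in zip(wide['CA'], wide['Total circ'], wide['US circ'])
]
print(ca_labels)
```

The resulting list can be passed as the `text` argument of the `go.Bar` trace for the CA region, provided the bars in that trace are in the same newspaper order as `wide['Name']`.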

r-beginners