I want to show percentages on bar plots using plotnine with facet_wrap and stat = 'count'.
(Of course I could do it by precomputing the values and using stat = 'identity', but I want to avoid that.)
When I map the facet_wrap variable to an aesthetic, I can refer to it in after_stat.
But then I have to neutralize that aesthetic manually, which seems ridiculous.
Is there a better way to do it?

Any help would be greatly appreciated. Below is an example:

from plotnine import *
from plotnine.data import mtcars
import pandas as pd

def prop_per_xcc(x, color, count):
    df = pd.DataFrame({'x': x, 'color': color, 'count': count})
    prop = df['count']/df.groupby(['x', 'color'])['count'].transform('sum')
    return prop

facet_num = mtcars.vs.nunique()

print(
    ggplot(mtcars, aes('factor(cyl)', fill='factor(am)')) + 
    geom_bar(position='fill') + 
    geom_text(aes(color = "factor(vs)",  # map the facet variable to an aesthetic so after_stat can refer to it
                  label = after_stat('prop_per_xcc(x, color, count) * 100')),
              stat = 'count',
              position = position_fill(vjust = 0.5),
              format_string = '{:.1f}%',
              show_legend = False) +
    scale_color_manual(values = ["black"] * facet_num) +  # neutralize the color mapping manually
    facet_wrap("vs")
)

cuttlefish44

1 Answer

Your approach is right; you just need the right syntax. Use staged aesthetic evaluation with stage.

from plotnine import *
from plotnine.data import mtcars
import pandas as pd

def prop_per_xcc(x, label, count):
    df = pd.DataFrame({'x': x, 'label': label, 'count': count})
    prop = df['count']/df.groupby(['x', 'label'])['count'].transform('sum')
    return prop

print(
    ggplot(mtcars, aes('factor(cyl)', fill='factor(am)')) + 
    geom_bar(position='fill') + 
    geom_text(aes(label = stage(start="vs", after_stat='prop_per_xcc(x, label, count) * 100')),
              stat = 'count',
              position = position_fill(vjust = 0.5),
              format_string = '{:.1f}%',
              show_legend = False) +
    facet_wrap("vs")
)

In words: the label is first mapped to the same variable as the facets, and is then replaced by the output of the custom function after the statistics are calculated.
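
To check what the helper computes without drawing anything, it can be called on a small hand-made table. This is only a sketch: the x, label and count values below are made up to resemble the rows a count stat might produce (two bars in each of two facets, each bar split into two fill groups), not actual plotnine output.

```python
import pandas as pd

def prop_per_xcc(x, label, count):
    # Same helper as above: each count divided by the total of its (x, label) group.
    df = pd.DataFrame({'x': x, 'label': label, 'count': count})
    return df['count'] / df.groupby(['x', 'label'])['count'].transform('sum')

# Hypothetical post-stat rows; label plays the role of the facet variable (vs).
x     = [1, 1, 2, 2, 1, 1, 2, 2]
label = [0, 0, 0, 0, 1, 1, 1, 1]
count = [3, 1, 2, 2, 1, 4, 1, 3]

props = (prop_per_xcc(x, label, count) * 100).round(1).tolist()
print(props)  # [75.0, 25.0, 50.0, 50.0, 20.0, 80.0, 25.0, 75.0]
```

Each pair of rows sharing the same x and label sums to 100, which is why the facet variable has to be available inside after_stat: without it, bars at the same x in different facets would be pooled into one group.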

has2k1