I have a DataFrame with a column (say 'Col') whose values all come from this list: ['PO101','NI101','NE101'].

The counts are:

  • PO101 = 30000
  • NI101 = 5000
  • NE101 = 3000

I am trying to show how many rows have each value on a stacked bar chart.

I created the stacked chart using the following code:

df.assign(dummy=1).groupby(['dummy','Col']).size().to_frame().unstack().plot(
    kind='bar',
    stacked=True,
    legend=True)

This creates the chart, but the legend shows odd tuple labels that include dummy, as below:

[image: chart with tuple labels in the legend]

So I set legend=False in the call above and built a new legend as follows:

current_handles, _ = plt.gca().get_legend_handles_labels()
reversed_handles = reversed(current_handles)
labels = reversed(df['Col'].unique())
plt.legend(reversed_handles,labels,loc='lower right')

This produces a legend with the proper names, but the colors no longer match the labels, as seen below:

[image: chart with mismatched legend colors]

Green (the largest portion of the chart) should be PO101, but the legend shows it as NI101.

Can someone explain why?

I suspect that the order in which the chart segments are drawn differs from the order in which unique() lists the values.
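That suspicion can be checked without plotting anything. The sketch below uses a small made-up sample (the counts 5/3/2 are an assumption standing in for the real 30000/5000/3000) and compares the order returned by unique() with the column order that groupby() and unstack() hand to the plot:

```python
import pandas as pd

# Hypothetical sample data; small counts stand in for the real ones.
df = pd.DataFrame({"Col": ["PO101"] * 5 + ["NI101"] * 3 + ["NE101"] * 2})

# unique() keeps the order of first appearance in the column.
appearance_order = list(df["Col"].unique())

# groupby() sorts its keys by default, so the unstacked columns
# (and therefore the bar segments and legend handles) are sorted.
# Note: to_frame() is left out here, which avoids the extra column
# level that produces tuple labels in the legend.
counts = df.assign(dummy=1).groupby(["dummy", "Col"]).size().unstack()
plot_order = list(counts.columns)

print(appearance_order)  # ['PO101', 'NI101', 'NE101']
print(plot_order)        # ['NE101', 'NI101', 'PO101']
```

With this sample the two orders differ, so pairing the plot's handles with labels taken from unique() mislabels the segments. Taking the labels from counts.columns instead keeps handles and labels in the same order.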

I would appreciate some guidance here.

EDIT: I have attached screenshots of the chart output for reference.

Meet
  • This happens even when I changed the data PO101,NI101,NE101 to 1,0,-1. But it was solved if I changed the values to A,B,O respectively. – Meet Mar 05 '21 at 16:12
  • *Dataframe has a column (say 'Col') with values either from this list ['PO101','NI101','NE101'] and count is...* I think you can/should just create a small sample with counts like `3,4,5` so that **your code can run**, and put that into the question. – Quang Hoang Mar 05 '21 at 16:20
  • Can I add the output chart in the question? As the code only generates the output chart. I couldn't find a way to upload an image here. – Meet Mar 05 '21 at 16:27
  • see [this guide](https://meta.stackoverflow.com/questions/344851/how-do-you-add-a-screenshot-image-to-your-stack-overflow-post). – Quang Hoang Mar 05 '21 at 16:27
  • Not my DV, but asking someone to create a dummy dataset instead of providing one yourself might be the reason why it was downvoted. – BigBen Mar 05 '21 at 16:53

1 Answer

Try:

(df['Col'].value_counts()
  .to_frame().T
  .plot.bar(stacked=True)
)

You would get something similar to this:

[image: resulting stacked bar chart]
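For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The sample data is made up, and the Agg backend is only an assumption so that the script runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import pandas as pd

# Hypothetical sample data standing in for the real column.
df = pd.DataFrame({"Col": ["PO101"] * 5 + ["NI101"] * 3 + ["NE101"] * 2})

# value_counts() gives one count per value; transposing turns the
# values into columns, so each value becomes one segment of a single bar.
counts = df["Col"].value_counts().to_frame().T

ax = counts.plot.bar(stacked=True)

# The legend labels come straight from the column names,
# so no manual relabelling is needed.
handles, labels = ax.get_legend_handles_labels()
print(labels)
```

Because the legend is generated from the same columns that are plotted, the colors and names cannot drift apart the way they can with a hand-built legend.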

Quang Hoang
  • Thanks Quang. If you can, could you please also tell me how to add another column to this same plot to create a two-bar stacked chart? In my earlier code, I could add multiple columns in the groupby instead of dummy to get dual stacked bars. How do I do that in this version? – Meet Mar 05 '21 at 17:02
  • @Meet Does that other column have the same three values? Again, you **should** include a sample data representing your **complete** problem. You should not ask a partial question, then extend it later like this. – Quang Hoang Mar 05 '21 at 17:04
  • Ok. I can see other questions where related queries are discussed in the comment section so that one doesn't need a new question for another point. Though I will keep this in mind the next time I ask the question. – Meet Mar 05 '21 at 17:07
  • I could get the solution. Thanks though. For others, groupby() and then unstack() will help get this done with Quang's version. – Meet Mar 05 '21 at 17:08
  • @Meet Yes, that's the solution I would go for. Still, my previous comment is valid as others wouldn't be likely looking at your comment above :-) – Quang Hoang Mar 05 '21 at 17:09
  • True. I do go through the whole post to understand the question and the responses given. I even read multiple answers under the same question, not just the one marked as solved, to learn different techniques for doing the same thing. Others might not do that. – Meet Mar 05 '21 at 17:10
  • Going with this solution has created xticks in the format (0,label). Now I am trying to create a new question to get it in format of label and remove this (0,). Though I will have to wait 90 minutes to ask another question. If you'd be kind to help with this. – Meet Mar 05 '21 at 17:26
  • Ok. Adding a .droplevel after .T will remove that. Sorry for asking the query. – Meet Mar 05 '21 at 17:31
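The comments above mention groupby() with unstack() for a two-bar chart and droplevel() for removing the extra level from the labels, but the exact code was not posted. The following is one possible sketch, not necessarily what was used. The 'Group' column and all data are hypothetical:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import pandas as pd

# Hypothetical data: 'Group' is an assumed second column, not from the question.
df = pd.DataFrame({
    "Group": ["A"] * 6 + ["B"] * 4,
    "Col": ["PO101"] * 3 + ["NI101"] * 2 + ["NE101"]
           + ["PO101"] * 2 + ["NI101", "NE101"],
})

# One row per group (one bar each), one column per value (one segment each).
counts = df.groupby(["Group", "Col"]).size().unstack(fill_value=0)
ax = counts.plot.bar(stacked=True)
_, labels = ax.get_legend_handles_labels()

# If an intermediate to_frame() is kept, the columns gain an extra
# level (tuples such as (0, 'NE101')); droplevel() removes it again.
framed = df.groupby(["Group", "Col"]).size().to_frame().unstack()
flat = framed.droplevel(0, axis=1)

print(labels)
print(list(flat.columns))
```

Here the extra level sits on the columns, hence axis=1. When it ends up on the index instead, as after a transpose, the same call without axis=1 drops it from the index.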