
I am new to Stata and would be grateful if someone could help me figure out how to add additional labels to my bar chart.

I have frequencies for 5 categories (let's say Apple, Orange, Banana, Grape, Lemon) and would like to make a bar chart comparing the percentages of these categories (the scale on the Y axis is percentage). I would also like to add the frequencies as labels outside the bars.

However, I would like to incorporate more data associated with each category on the chart:

I have two indexes (CS and DS) calculated in Excel, with values between 0 and 1 and between -1 and 0 respectively. Each category therefore has one CS value and one DS value, and I need to show both inside the bar for that category.

So:

var1 (categories): Apple, Orange, Banana, Grape, Lemon
var2 (frequencies): 65, 20, 1, 0, 39
var3 (CS index): 0.25, 0.12, 0, 0.42, 0.09
var4 (DS index): -0.15, -0.46, 0, -0.12, -0.2

It seems that I need to use a twoway command, but my attempt failed.

I have used the code below to get roughly what I want for comparing the categories, but I don't know how to add the other data to it:

 graph hbar (sum) var2, over(var1) blabel(bar, format(%9.3g)) asyvars ///
     percentages showyvars bar(1, color(gs6)) bar(2, color(gs6)) ///
     bar(3, color(gs6)) bar(4, color(gs6)) bar(5, color(gs6)) bar(6, color(gs6)) ///
     legend(off) bargap(100) ytitle("Percentage", size(3.5)) ///
     graphregion(fcolor(white)) plotregion(margin(zero))
Nick Cox
ashkan
  • I forgot to mention that the bar should be horizontal and percentage means share of total fruits. – ashkan Apr 22 '15 at 08:20

1 Answer


Key point: This problem is made easiest by switching to twoway bar and showing extra text using string variables as marker labels.

We can't comment on what was wrong with your twoway code, as you don't show it.

In detail, your example and your design impose impossible demands. One bar must be of zero length and another is very small, so you can't put extra text inside them. But this code segment shows some technique:

clear 
set scheme s1color 

input str6 fruit frequency CS DS 
Apple    65    0.25   -0.15 
Orange   20    0.12   -0.46
Banana   1     0      0
Grape    0     0.42   -0.12
Lemon    39    0.09   -0.2 
end 

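* encode assigns integer codes 1 to 5 (alphabetical order) with the fruit names as value labels
encode fruit, gen(Fruit)
* summarize leaves the total frequency behind in r(sum)
su frequency
gen percent = 100 * frequency/r(sum)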

twoway bar percent Fruit, horizontal barw(0.8) yla(1/5, valuelabel ang(h) noticks) bfcolor(none)

* text for the CS and DS labels, placed at x = 3 slightly above and below each bar's centre
gen x = 3
gen text1 = "CS:" + string(CS, "%4.2f") 
gen text2 = "DS:" + string(DS, "%4.2f")
gen y1 = Fruit + 0.2
gen y2 = Fruit - 0.2 

twoway bar percent Fruit, horizontal barw(0.8) yla(1/5, valuelabel ang(h) noticks) bfcolor(none) ///
|| scatter y1 x , ms(none) mlabpos(3) mlab(text1) ///
|| scatter y2 x , ms(none) mlabpos(3) mlab(text2) legend(off) xtitle(percent) ///
|| scatter Fruit percent, ms(none) mlabpos(3) mlab(percent) xsc(r(0, 56)) 

This is the resulting graph:

[Image: the resulting graph]

Nick Cox
  • Thanks, Nick, it works perfectly. One more quick question: as I mentioned above, I would like to have frequencies as my bar labels (instead of percentages), while the Y axis and bar sizes still show the percentages. In other words, I want to replace the percentage bar labels with frequency bar labels. – ashkan Apr 22 '15 at 11:10
  • On this graph, it is the x axis that shows percents. But all you need to do is change which variable is shown as marker label. – Nick Cox Apr 22 '15 at 11:49
  • I need to reverse the order of the categories (e.g. Apple at the top). I have tried the reverse and xalternate options, but they are not allowed. Can you please help me with the proper option? Thanks. – ashkan Apr 23 '15 at 09:07
  • You need the option `ysc(reverse)`. – Nick Cox Apr 23 '15 at 09:13
  • Thanks again. I think I have found how to accept the answer; please let me know if I missed anything else. By the way, I could not find an option to change the gap width between bars in twoway bar plots. Is there any way to do so? – ashkan Apr 23 '15 at 10:03
  • That's fine; thanks. To change the gap between bars, change the `barwidth()` setting. In the example above it is 0.8 and that is geared to the fact that the variable concerned has values 1(1)5. – Nick Cox Apr 23 '15 at 10:37
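
Putting the three follow-up suggestions from the comments together, the final graph command might look roughly like the sketch below. It assumes the variables created in the answer (`percent`, `Fruit`, `frequency`, `x`, `y1`, `y2`, `text1`, `text2`) are still in memory. The bar width of 0.6 is only an illustrative value, not one given in the thread.

```stata
* Sketch only: assumes the answer's data and generated variables are in memory.
* barw(0.6) is an illustrative width; smaller values widen the gaps between bars.
twoway bar percent Fruit, horizontal barw(0.6) bfcolor(none)      ///
    yla(1/5, valuelabel ang(h) noticks) ysc(reverse)              ///
|| scatter y1 x, ms(none) mlabpos(3) mlab(text1)                  ///
|| scatter y2 x, ms(none) mlabpos(3) mlab(text2)                  ///
|| scatter Fruit percent, ms(none) mlabpos(3) mlab(frequency)     ///
    xsc(r(0 56)) legend(off) xtitle(percent)
```

Here `mlab(frequency)` swaps the label at the end of each bar from percent to frequency, and `ysc(reverse)` puts the first category at the top. Note that with the axis reversed, `text1` at `Fruit + 0.2` is drawn below `text2`; swap the two offsets if the CS label should stay on top.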