
I'm trying to add the sample size of each group to the group labels on the x-axis using PROC SGPLOT in SAS. Here is the data and the graph I'd like to have. I want the sample size of each bar shown on the x-axis (the handwritten parts). Your help is very much appreciated!

Data have ;
input type  sex $  n n_total percent ;
datalines;
0  F  6    29  20.7 
1  F  387  496 78.2  
0  M  4    15  26.6
1  M  264  305 86.5
;
Run; 

proc sgplot data=have ;
vbarparm category= type  response=percent /group=sex groupdisplay=cluster datalabel;
run; 

Graph that I want to create: [image: the bar chart with the sample size handwritten under each bar on the x-axis]

Reeza
kiabar
  • Thanks for the simple to use datalines - made this trivial to help with! – Joe Feb 10 '21 at 21:49
  • This is a very confusing graph. Why is the bar with N=29 smaller than the bar with N=15? – Tom Feb 11 '21 at 04:23

2 Answers


You can compute a bar data label that shows the percent value followed by the text n=<N>.

Example:

Data have ;
input type  sex $  n n_total percent ;
datalines;
0  F  6    29  20.7 
1  F  387  496 78.2  
0  M  4    15  26.6
1  M  264  305 86.5
;
Run; 

data plot;
  set have;
  barlabel = cats(percent) || ' %N/(n=' || cats(n_total, ')');
run;

proc sgplot data=plot;
  vbarparm 
    category=type  
    response=percent 
  / group=sex 
    groupdisplay=cluster 
    datalabel=barlabel datalabelfitpolicy=splitalways splitchar='/'
  ;
  label percent = 'Percent N having some attribute';
run; 


Richard
  • Thanks Richard. Your suggestion works. The only thing I am thinking of is that placing (n=XX) right beside the % on the bars might be misleading, and readers may interpret this n as the frequency behind that percent, not the total n. So I think if we put the N on the x-axis (at the bottom of the bars) the data would be clearer. Thanks again. – kiabar Feb 10 '21 at 15:11
  • You might perceive the separation as cleaner; however, placing the text on the graph makes the viewer perceive the text as something important, perhaps more important than the bar height itself. Once a viewer sees %, the next question in their mind is "% of what?" By separating the "what" (n=##) from the %, the viewer actually has more mental work to do to find it. If you have n= directly adjacent to (underneath) the percent, the domain is readily apparent. – Richard Feb 10 '21 at 15:49
  • If you haven't read Edward Tufte ("noted for his writings on information design and as a pioneer in the field of data visualization" - Wikipedia), check out his publications at www.edwardtufte.com – Richard Feb 10 '21 at 15:49

While I agree with Richard that you might be better off putting this in the bar label, it's easy to put it in an axis table as well.

Data have ;
input type  sex $  n n_total percent ;
datalines;
0  F  6    29  20.7 
1  F  387  496 78.2  
0  M  4    15  26.6
1  M  264  305 86.5
;
Run; 

proc sgplot data=have ;
vbarparm category= type  response=percent /group=sex groupdisplay=cluster datalabel;
xaxistable n_total/class=sex classdisplay=cluster position=bottom location=inside colorgroup=sex;
run; 

classdisplay=cluster makes the values spread out like the bars do, location=inside puts it right below the bar, and colorgroup=sex makes it colored like the bars (instead of black). position=bottom is the default, just highlighting that option.

You could further customize what shows up, in the same manner Richard did, by creating a text variable that contains exactly what you want to display for each bar and using that as your variable (the first argument to xaxistable). That variable can be numeric or character.
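As a minimal sketch of that customization (not part of the original answer), the step below builds a character variable holding the text "n=<total>" and passes it to xaxistable in place of n_total. The variable name nlabel and the dataset name plot are assumptions chosen for illustration; the options are the same ones used above.

```sas
/* Sketch only: nlabel is a hypothetical variable name for illustration. */
data plot;
  set have;
  length nlabel $12;
  /* CATS strips blanks and joins the pieces, e.g. n=29 */
  nlabel = cats('n=', n_total);
run;

proc sgplot data=plot;
  vbarparm category=type response=percent / group=sex groupdisplay=cluster datalabel;
  /* Show the prepared text under each bar instead of the raw number */
  xaxistable nlabel / class=sex classdisplay=cluster position=bottom location=inside colorgroup=sex;
run;
```

Because each type and sex combination has exactly one row in this data, each cell of the axis table shows a single label.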

Joe