I have two plots that I would like to merge into one. Each plot shows the proportion of present / not-present observations by their corresponding cumulative test results for the year.

On the merged plot I would like to see side-by-side bars for each group of test scores, comparing the present and not-present observations.

To illustrate the problem, this is what I currently have:

data test_scores;
    do i = 1 to 200;
        score = ranuni(200);
        output;
    end;
    drop i;
run;

data test_scores_2;
    set test_scores;
    if _n_ le 100 then flag = 0;
    else flag = 1;
run;

data test_scores_2_0 test_scores_2_1;
    set test_scores_2;
    if flag = 0 then output test_scores_2_0;
    else if flag = 1 then output test_scores_2_1;
run;

PROC GCHART DATA=test_scores_2_0;
    VBAR score /
        CLIPREF
        FRAME
        LEVELS=20
        TYPE=PCT
        COUTLINE=BLACK
        RAXIS=AXIS1
        MAXIS=AXIS2;
RUN;
QUIT;

PROC GCHART DATA=test_scores_2_1;
    VBAR score /
        CLIPREF
        FRAME
        LEVELS=20
        TYPE=PCT
        COUTLINE=BLACK
        RAXIS=AXIS1
        MAXIS=AXIS2;
RUN;
QUIT;

The bars should sum to 100% for the present group, and likewise to 100% for the not-present group.

TIA

Stu Sztukowski
78282219
  • Your example data has some errors and does not appear to output correctly. There are also some unresolved macro variables. Can you take a look at your example data again? – Stu Sztukowski Mar 26 '19 at 15:02
  • Fixed, didn't realise I had those macro variables active – 78282219 Mar 26 '19 at 16:35

2 Answers

PROC SGPLOT to the rescue. Use the GROUP= option to plot the two groups separately, and set the transparency to 50% so that one histogram does not hide the other.

proc sgplot data=test_scores_2;
    histogram score / group=flag transparency=0.5 binwidth=.05;
run;
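
If overlaid histograms are hard to read, clustered bars are another option. The following is only a sketch of that idea, not part of the answer above: it bins the scores by hand, uses PROC FREQ with OUTPCT to get the percentage of each bin within its flag group, and plots the result with VBAR and GROUPDISPLAY=CLUSTER. The dataset names binned and pcts, the variable bin, and the 0.05 bin width are assumptions chosen for illustration.

```sas
/* Sketch only: binned, pcts, bin and the 0.05 width are illustrative choices */
data binned;
    set test_scores_2;
    /* Assign each score to the lower edge of a 0.05-wide bin */
    bin = round(floor(score / 0.05) * 0.05, 0.01);
    format bin 4.2;
run;

/* OUTPCT adds PCT_ROW: the percent of each bin within its flag group (row) */
proc freq data=binned noprint;
    tables flag * bin / outpct out=pcts;
run;

/* One cluster per bin, one bar per flag value */
proc sgplot data=pcts;
    vbar bin / response=pct_row group=flag groupdisplay=cluster;
    xaxis label="score (lower bin edge)";
    yaxis label="percent within group";
run;
```

Because PCT_ROW is computed within each value of flag, the bars for each group should add up to 100%. A bin that is empty for one group simply has no bar for that group.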

Stu Sztukowski

With PROC GCHART you can use the VBAR options GROUP= and G100 to get bars that represent the percent within each group. This is useful when the groups have different counts.

The SUBGROUP= option splits each vertical bar according to the values of the subgroup variable, and automatically produces colors and a legend for the subgroups.

When the SUBGROUP= variable corresponds 1:1 to the GROUP= variable, the result is a chart with a different color for each group and a legend identifying the groups.

For example, modify your data so that one group (flag=0) has 50 observations and the other (flag=1) has 150:

data test_scores;
    do _n_ = 1 to 200;
        score = ranuni(200);
        flag = _n_ > 50;
        output;
    end;
run;

axis1 label=("score");
axis2 ;
axis3 label=none value=none;

PROC GCHART data=test_scores;
  VBAR score /
    levels=10
    GROUP=flag G100
    SUBGROUP=flag
    SPACE=0 TYPE=PERCENT freq gaxis=axis3 maxis=axis1;
run;

A similar chart shows the effect of a subgroup variable whose values differ from the group values:

data test_scores;
    do _n_ = 1 to 200;
        subgroup = ceil(5 * ranuni(123));   /* random integer from 1 to 5 */
        score = ranuni(200);
        flag = _n_ > 50;
        output;
    end;
run;

axis1 label=("score");
axis2 ;
axis3 label=none value=none;

PROC GCHART data=test_scores;
  VBAR score /
    levels=10
    GROUP=flag G100
    SUBGROUP=subgroup   /* has integer values in [1,5] */
    SPACE=0 TYPE=PERCENT freq gaxis=axis3 maxis=axis1;
run;

Richard