I created the following graph using sgplot

proc sgplot data=Colordistricts;
hbar distrct/response=Percent 
group= population;
run;  

However, the individual population groups appear in alphabetical order in the graph (Asian, followed by Black, Color, and White).

How do I create the same plot with the population groups in descending order by percent?

These are the districts where the Color population is highest, so I want each bar to begin with the Color population.

Reeza
Nathan123
  • Try CategoryOrder or GROUPORDER? What's your version of SAS? The options differ based on your version. – Reeza Apr 22 '18 at 02:11
  • How should the other population segments be ordered after placing "Color" in the first population position? Sorting by a specific percent can disorder the population items (colored bars at different places), and sorting by population item can disorder the percents (jagged same-colored bars). – Richard Apr 22 '18 at 14:20
  • @Reeza I have SAS 9.4 – Nathan123 Apr 22 '18 at 22:58
  • @Richard I want the order to be color, white, black and asian – Nathan123 Apr 22 '18 at 23:00
  • @Nathan123 you'd have to provide the exact version, ie 9.4 TS1M5 or M4. There are significant improvements after M3 and in M5. – Reeza Apr 23 '18 at 03:40
  • @Reeza I have 9.04.01M3P062415 – Nathan123 Apr 23 '18 at 14:45

1 Answer

To force a specific group value to the first position you can map the desired group to a new value that will collate first. Sometimes this is easily done by placing a space character in front of an existing value.
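The leading-space trick can be sketched as follows. This is a minimal illustration, not the asker's actual data: the small Colordistricts stand-in, the variable name population_first, and the assumption that population is a character variable holding 'Color' are all hypothetical. groupOrder=ascending is set explicitly so the result does not depend on the default group order; check the legend and segment order on your own SAS release.

```sas
/* Hypothetical stand-in for the asker's data set */
data Colordistricts;
  length district $ 1 population $ 8;
  input district $ population $ Percent;
  datalines;
A Asian 5
A Black 20
A Color 45
A White 30
B Asian 10
B Black 15
B Color 50
B White 25
;
run;

/* Prefix the group to be forced first with a space, which collates before letters */
data Colordistricts2;
  set Colordistricts;
  length population_first $ 9;
  if population = 'Color' then population_first = ' Color';
  else population_first = population;
run;

proc sgplot data=Colordistricts2;
  hbar district / response=Percent
                  group=population_first
                  groupOrder=ascending;
run;
```

Only the one forced value changes; the remaining groups keep their usual alphabetical order after it.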

If the group variable is a numeric ID with a custom format that displays the associated group label, you can create a new version of the custom format that includes a 0 ID corresponding to the forced group. The forced group is then mapped to the 0 ID.

You would then sort the data in the particular way you need and use the SGPLOT statement yaxis type=discrete discreteOrder=data; to force the hbar categories to appear in that order.

Here is some sample code to explore. The final SGPLOT uses the mapping technique to force a particular population segment to appear first.

ods html close;

%let path = %sysfunc(pathname(work));
ods html file="&path.\sgplot_hbar.html" gpath="&path.";

proc format;
  value popId
  0 = 'Color'
  1 = 'Asian'
  2 = 'Black'
  3 = 'Color'
  4 = 'White'
;

data have;
  do _n_ = rank('A') to rank('P');
    district = byte (_n_);
    x = 0;
    populationID = 2; percent = ceil(40*ranuni(123)); output;
    x + percent;
    populationID = 3; percent = ceil(40*ranuni(123)); output;
    x + percent;
    if (ranuni(123) < 0.10) then do;
    populationID = 1; percent = ceil(10*ranuni(123)); output;
    x + percent;
    end;
    percent = 100 - x;
    populationID = 4;
    output;
  end;
  keep district populationID percent;
  label
    percent = 'Percent of Total Frequency'
  ;
  format
    populationID popId.
  ;
run;

proc sgplot data=have;
  hbar district
  / group = populationID
    response = percent
  ;
  title j=L 'default group order by populationID value';
  title2 j=L 'districts (yaxis) also implicitly sorted by formatted value';
run;

proc sgplot data=have;
  hbar district
  / group = populationID
    response = percent
    categoryOrder = respAsc
  ;
  title j=L 'categoryOrder: ascending response';
  title2 j=L 'districts (yaxis) also implicitly sorted by min(response)';
run;

proc sgplot data=have;
  hbar district
  / group = populationID
    response = percent
    categoryOrder = respDesc
  ;
  title j=L 'categoryOrder: descending response';
  title2 j=L 'districts (yaxis) also implicitly sorted by descending max(response)';
run;

proc sql;
  create table have2 as
  select 
    case 
      when populationID = 3 then 0 else populationID
    end as hbar_populationID format=popId.
  , *
  from have
  order by 
    hbar_populationID, percent
  ;
quit;

proc sgplot data=have2;
  yaxis type=discrete discreteOrder=data;

  hbar district
  / group = hbar_populationID
    response = percent
  ;

  title j=L 'population segment ordering is partially forced by tweaking populationID values';
  title2 j=L 'districts in data order per yaxis statement';
run;

Forced groupOrder

SQL can sort the data in a particular order by using a CASE expression in the ORDER BY clause. You would then use groupOrder=data in SGPLOT.

proc sql;
  create table have3 as
  select *
  from have
  order by 
    district
  , case 
      when populationID = 3 then 0
      when populationID = 2 then 1
      when populationID = 4 then 2
      when populationID = 1 then 3
      else 99
    end
  ;
quit;

proc sgplot data=have3;
  hbar district
  / group = populationID
    groupOrder = data
    response = percent
  ;

  title j=L 'population segment ordering is forced by sorted data and groupOrder=data';
  title2 j=L 'districts (yaxis) in default order';
run;
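The asker's comment requests the order Color, White, Black, Asian. A sketch of the same CASE technique adapted to that order is shown below. It reuses the have data set and popId format from the sample code above; the table name have_req is made up for illustration. The rank is placed first in the ORDER BY so that the first appearance of each group in the data follows the requested order, even if some district lacks a segment.

```sas
proc sql;
  create table have_req as
  select *
  from have
  order by
    case populationID
      when 3 then 0   /* Color */
      when 4 then 1   /* White */
      when 2 then 2   /* Black */
      when 1 then 3   /* Asian */
      else 99
    end
  , district
  ;
quit;

proc sgplot data=have_req;
  hbar district
  / group = populationID
    groupOrder = data
    response = percent
  ;
  title j=L 'segments in requested order: Color, White, Black, Asian';
run;

title;
```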

Forcing one segment to be first and then the other segments relying on response values

After mapping the forced group (populationID 3, 'Color') to the first position, you could order the remaining population segments in a manner similar to respAsc or respDesc. That requires additional code to compute new mappings for the other populationID values. The following example uses the overall response sum of each segment to force a descending order on the remaining population segments within a district.

proc sql;
  create table way as 
  select populationID, sum(percent) as allPct
  from have
  where populationID ne 3
  group by populationID
  order by allPct descending
  ;

data waySeq;
  set way;
  seq + 1;
run;

proc sql;
  create table have3 as
  select
    have.*
  , case 
      when have.populationID = 3 then 1000 else 1000+seq
    end as hbar_populationID
  from have
  left join waySeq on have.populationID = waySeq.populationID
  order by 
    hbar_populationID, percent
  ;

  create table fmtdata as
  select distinct 
    hbar_populationID as start
  , put(populationID, popId.) as label
  , 'mappedPopId' as fmtname
  from have3;
quit;

proc format cntlin = fmtdata;
run;

%let syslast = have3;

proc sgplot data=have3;
  yaxis type=discrete discreteOrder=data;

  hbar district
  / group = hbar_populationID
    response = percent
    groupOrder = data
  ;

  format hbar_populationID mappedPopId.;

  title j=L 'population segment ordering is partially forced by tweaking populationID values';
  title2 j=L 'districts in data order per yaxis statement';
run;

title;
Richard