2

I have this dataset (an made-up example, but in the same structure):

data have;
    infile datalines delimiter=',';
    length country city measure $50.;
    input country $ city $ level measure $ mdate total;
    informat mdate date9.;
    format mdate date9.;
    datalines;
England,London,1,Red doors opened,24MAR2014,4
England,London,1,Green doors opened,24MAR2014,6
England,London,2,Doors closed,24MAR2014,7
England,London,1,Red doors opened,25MAR2014,5
England,London,1,Blue doors opened,25MAR2014,4
England,London,1,Green doors opened,25MAR2014,3
England,London,2,Doors closed,25MAR2014,6
England,Manchester,1,Red doors opened,24MAR2014,3
England,Manchester,2,Doors closed,24MAR2014,1
England,Manchester,2,Doors closed,25MAR2014,4
Scotland,Glasgow,1,Red doors opened,24MAR2014,4
Scotland,Glasgow,1,Red doors opened,25MAR2014,3
Scotland,Glasgow,1,Green doors opened,25MAR2014,2
Scotland,Glasgow,2,Doors closed,25MAR2014,4
;;;;
run;

I want to output the 'doors opened' per country/city per day, then subtotal the doors opened, then output the doors closed, then subtract the doors opened from the doors closed to give a 'balance' (per country/city). At the end of each country, I want one line summing the balance (per day) for each country.

So the above would give something like:

Country  + City       + Measure            + 24MAR2014 + 25MAR2014
---------+------------+--------------------+-----------+----------
England  + London     + Red doors opened   +      4    +    5
         +            + Green doors opened +      6    +    3
         +            + Blue doors opened  +      .    +    4
         +            + TOTAL DOORS OPENED +     10    +   12
         +            + Doors closed       +      7    +    6
         +            + BALANCE            +     -3    +   -6
         + Manchester + Red doors opened   +      3    +    .
         +            + TOTAL DOORS OPENED +      3    +    .
         +            + Doors closed       +      1    +    4
         +            + BALANCE            +     -2    +    4
         + ALL        + BALANCE            +     -5    +   -2
Scotland + Glasgow    + Red doors opened   +      4    +    3
         +            + Green doors opened +      .    +    2
         +            + TOTAL DOORS OPENED +      4    +    5
         +            + Doors closed       +      .    +    4
         +            + BALANCE            +     -4    +   -1
         + ALL        + BALANCE            +     -4    +   -1

I've deliberately left it so not every measure appears for each instance and the Doors Closed total is sometimes missing. The rows in CAPS are those I want to add with PROC REPORT, i.e. not in the original data.

I've got the basic layout using PROC REPORT, but don't really have an idea where to go to start inserting subtotals on demand. I've added a 'level' variable, to try and give me something to order/group on.

I need one country per output page and the rows kept in that order per grouping, i.e. XXX Doors Opened, TOTAL DOORS OPENED, Doors Closed, BALANCE, so I think maybe the extra columns are needed.

So far, this is what I have done:

proc report data=have out=proc;
    by country;
    columns city level measure mdate,total;
    define city / group;
    define level / group noprint;
    define measure / group;
    define mdate / across;
    define total / analysis sum;

    compute before level;
        endcomp;

    compute after level;
        if level = 2 and break = '_level_' then do;
            measure = 'TOTAL DOORS OPENED';
            end;
        endcomp;
run;

I know I should be able to do something using the level variable, so I've added some compute blocks before and after it and examined the output dataset. I've tried to add a value of 'TOTAL DOORS OPENED', but this isn't working.

To be honest, I've only just started using PROC REPORT, so this is a bit out of my comfort zone.

Thanks for any help. Please let me know if the question isn't clear.

seestevecode
  • 111
  • 2
  • 11
  • 1
    You'll find that showing your current effort would make your question better received by the community. Generally questions that come across as - "here's a spec - write me some code" are not normally well received. – Jon Clements Apr 28 '14 at 10:20
  • Thanks @JonClements. I can see how that came across that way; that wasn't my intention. The reason for the lack of detail was because I am really at a bit of a loss as to where to go. I have made very little progress, but will amend my original question to reflect that which I have made. – seestevecode Apr 28 '14 at 10:52
  • thanks.. I'm afraid my SAS is about 10 years rusty, so I'm really not au fait enough these days to answer, but certainly have a +1 from me for including your attempt - at the very least it means someone can copy/paste and change rather than having to suggest something from scratch as to what you're doing/going wrong with kind of thing... – Jon Clements Apr 28 '14 at 13:38
  • I think this is at least theoretically possible in `PROC REPORT`, but it's probably easier to do with preprocessing in a data step. Are you amenable to that? – Joe Apr 28 '14 at 15:36
  • Hi Joe. Yes absolutely amenable to a DATA step. I have, in fact, managed to get pretty much what I want with a number (too many, IMHO) of PROC SQLs and DATA steps. But I think it can be done better and am always keen to learn more if you have a solution you'd be happy to share. When I'm back at work tomorrow, I'll post my solution, such as it is. Thanks for your time. – seestevecode Apr 28 '14 at 21:29

1 Answers1

1

Sometimes (often for my field of work) it is better to regard PROC REPORT as a fancy PROC PRINT and make your calculations in the dataset.

I would added a variable like TYPE denoting if the entry tells us about the open or closed doors then calculated the sums by contry/city/level/type/day; also I would duplicated all observations with level= 3 (meaning BALANCE in your table) and negated the measures where TYPE=closed then calculated the sums by country/city/day, they stacked the all results together in one dataset with proper keys and transposed with ID=day. PROC REPORT can take it from there. Do not trust COMPUTE blocks too much, they are often useful but hell to debug. Just make a dataset what appears as your desired table and throw it to REPORT.

Anton
  • 164
  • 1
  • 4
  • Hi Anton. Thanks for your answer. This is pretty much what I've done - see my comment to Joe in the question. I'll post my answer tomorrow. Thanks for your time. – seestevecode Apr 28 '14 at 21:31