I have this dataset (an made-up example, but in the same structure):
data have;
infile datalines delimiter=',';
length country city measure $50.;
input country $ city $ level measure $ mdate total;
informat mdate date9.;
format mdate date9.;
datalines;
England,London,1,Red doors opened,24MAR2014,4
England,London,1,Green doors opened,24MAR2014,6
England,London,2,Doors closed,24MAR2014,7
England,London,1,Red doors opened,25MAR2014,5
England,London,1,Blue doors opened,25MAR2014,4
England,London,1,Green doors opened,25MAR2014,3
England,London,2,Doors closed,25MAR2014,6
England,Manchester,1,Red doors opened,24MAR2014,3
England,Manchester,2,Doors closed,24MAR2014,1
England,Manchester,2,Doors closed,25MAR2014,4
Scotland,Glasgow,1,Red doors opened,24MAR2014,4
Scotland,Glasgow,1,Red doors opened,25MAR2014,3
Scotland,Glasgow,1,Green doors opened,25MAR2014,2
Scotland,Glasgow,2,Doors closed,25MAR2014,4
;;;;
run;
I want to output the 'doors opened' per country/city per day, then subtotal the doors opened, then output the doors closed, then subtract the doors opened from the doors closed to give a 'balance' (per country/city). At the end of each country, I want one line summing the balance (per day) for each country.
So the above would give something like:
Country + City + Measure + 24MAR2014 + 25MAR2014
---------+------------+--------------------+-----------+----------
England + London + Red doors opened + 4 + 5
+ + Green doors opened + 6 + 3
+ + Blue doors opened + . + 4
+ + TOTAL DOORS OPENED + 10 + 12
+ + Doors closed + 7 + 6
+ + BALANCE + -3 + -6
+ Manchester + Red doors opened + 3 + .
+ + TOTAL DOORS OPENED + 3 + .
+ + Doors closed + 1 + 4
+ + BALANCE + -2 + 4
+ ALL + BALANCE + -5 + -2
Scotland + Glasgow + Red doors opened + 4 + 3
+ + Green doors opened + . + 2
+ + TOTAL DOORS OPENED + 4 + 5
+ + Doors closed + . + 4
+ + BALANCE + -4 + -1
+ ALL + BALANCE + -4 + -1
I've deliberately left it so not every measure appears for each instance and the Doors Closed total is sometimes missing. The rows in CAPS are those I want to add with PROC REPORT
, i.e. not in the original data.
I've got the basic layout using PROC REPORT
, but don't really have an idea where to go to start inserting subtotals on demand. I've added a 'level' variable, to try and give me something to order/group on.
I need one country per output page and the rows kept in that order per grouping, i.e. XXX Doors Opened, TOTAL DOORS OPENED, Doors Closed, BALANCE, so I think maybe the extra columns are needed.
So far, this is what I have done:
proc report data=have out=proc;
by country;
columns city level measure mdate,total;
define city / group;
define level / group noprint;
define measure / group;
define mdate / across;
define total / analysis sum;
compute before level;
endcomp;
compute after level;
if level = 2 and break = '_level_' then do;
measure = 'TOTAL DOORS OPENED';
end;
endcomp;
run;
I know I should be able to do something using the level variable, so I've added some compute blocks before and after it and examined the output dataset. I've tried to add a value of 'TOTAL DOORS OPENED', but this isn't working.
To be honest, I've only just started using PROC REPORT
, so this is a bit out of my comfort zone.
Thanks for any help. Please let me know if the question isn't clear.