Grouping plots and tables using a BY statement in SAS

Question

I am using a BY statement with both proc boxplot and proc report to create a plot and a table for each level of the BY variable. As is, the code prints all the plots and then prints all of the tables. I would like it to print the plot and then the table for each level of the By variable (so the ouput would alternate between a plot and a table). Is there a way to do this?

This is the code I currently have for the plots and tables-

proc boxplot data=study;
plot Lead_Time*Study_ID/ horizontal;
by Project_Name;
format Lead_Time dum.;
run;

proc report data=study nowd;
column ID Title Contact Status Message Audience Priority;
by Project_Name;
run;

Thank You!!

score 1 · Answer 1 · edited May 23 '17 at 12:11

1

Unfortunately, I don't think the ODS (Output Delivery System) can interleave outputs from procedures. You will need to use a macro to loop over all the by variables and call BOXPLOT and REPORT for each one.

Something like this:

%macro myreport();
%let byvars = A B C D;
%let n=4;

%do i=1 %to &n;
   %let var = %scan(&byvars,&i);
   proc something data=have(where=(byvar="&var"));
   ...;
   run;

   proc report data=have(where=(byvar="&var"));
   ....
   run;
%end;
%mend;

%myreport();

Obviously you need to change this to fit your needs. There are plenty of examples on Stackoverflow of it. Here is one: looping over character values in SAS

edited May 23 '17 at 12:11

Community

1
1

answered Aug 07 '14 at 18:44

DomPazz

12,415
17
23

Gah, never never never loop like this. Pull the macro calls from the data. – Joe Aug 07 '14 at 18:56
Or at least pass the list of byvars as a parameter to the macro. Then you can decide later whether you want to type the list or generate it. – Quentin Aug 07 '14 at 19:33
No no no. The macro should take [1] byvar as a parameter, and then generate calls to macro. No looping off macro variables. Makes my head hurt. – Joe Aug 07 '14 at 19:42
Guess different things hurt different heads. CALL EXECUTE can get ugly too, especially giving the timing issues. Or perhaps you like a different way to generate the calls? Then again, plenty of folks say to just avoid the macro language due to head hurt. sigh. – Quentin Aug 07 '14 at 19:50
`proc sql; select cats('%macro_name(',byval,')') into ... ` is my preferred way, to avoid timing issues like you say. – Joe Aug 07 '14 at 20:01
if I want to generate a list for the byvars and then pass it as a parameter to the macro what would that look like? Thank you so much for the input! – Ellie Aug 07 '14 at 20:49
Different problems have different solutions. I think for a handful of known fixed parameters (ie. by-vars) like we have above, the best way is to call the macro 4 times, each time passing in a single by-var as Joe mentioned. However, if it was an unknown number of by-vars, then looping is the best approach. To simplify the looping I recommend developing your own set of utility functions to simplify the calls and reduce any 'headaches'. – Robert Penridge Aug 07 '14 at 20:55
Looping is no better - in fact worse - when the number aren't known. You can do the same thing with a data extract. I'll explain in a separate answer. – Joe Aug 07 '14 at 21:16

score 1 · Answer 2 · answered Aug 07 '14 at 19:33

This is in principle possible using PROC DOCUMENT and the ODS DOCUMENT output type. It's not exactly easy, per se, but it's possible, and has some advantages over the macro option, although I'm not sure sufficient to recommend its use. However, it's worth exploring nonetheless.

First off, this is largely guided (including, coincidentally, using the same dataset!) by Cynthia Zender's excellent tutorial, Have It Your Way: Rearrange and Replay Your Output with ODS DOCUMENT, presented during the 2009 SAS Global Forum. She initially describes a GUI method of doing this, but then later explains it in code, which would clearly be superior for this sort of thing. Kevin Smith covers similar ground in ODS DOCUMENT From Scratch, from 2012's SGF, though Cynthia's paper is a bit more applicable here (as she covers the exact topic).

First, you need to generate all of your results. Order here doesn't matter too much. I generate a sample of SASHELP.PRDSALE that is sorted appropriately by country.

proc sort data=sashelp.prdsale out=prdsale;
by country;
run;

Then, we generate some tables; a proc means and a sgplot. Note the title uses #BYVAL1 to make sure the title is included - otherwise we lose the useful labels on the procs!

title "#BYVAL1 Report";


ods _all_ close;
ods document name=work.mydoc(write);
proc means data=prdsale sum;
 by country;
 class quarter year;
 var predict;
run;

proc sgplot data=prdsale;
 by country;
 vbar quarter/response=predict group=year groupdisplay=cluster;
run;

ods document close;
ods preferences;

Now, we have something that is wrong, but is usable for what you actually want. You can use the techniques in Cynthia or Kevin's papers to look into this in detail; for now I'll just go into what you need for this purpose.

It's now organized like this, imagining a folder tree:

\REPORT\MEANS\COUNTRY\

What we need is:

\REPORT\COUNTRY\MEANS

That's easy enough to do. The code to do so is below. Obviously, for a production process this would be better automated; given the input dataset it should be trivial to generate this code. Note that the BYVALs increment for each by value, so CANADA is 1 and 4, GERMANY is 2 and 5, and USA is 3 and 6.

proc document name=work.mydoc_new(write);
 make CANADA, GERMANY, USA;   *make the lower level folders;
 run;

 dir ^^;  *Go to the bottom level, think "cd .." in unix/windows;
 dir CANADA;  *go to Canada folder;
 dir;         *Notes to the Listing destination where we are, not that important;
 copy \work.mydoc\Means#1\ByGroup1#1\Summary#1 to ^;  *copy that folder from orig doc to here;
 copy \work.mydoc\SGPlot#1\ByGroup4#1\SGPlot#1 to ^; *^ being current directory, like '.' in unix/windows;

*You could also copy \ByGroup1#1 and \Bygroup4#1 without the last level of the tree. That would give a slightly different result (a bit more of the text around the table would be included), so do whichever matches your expectations.;

**Same for Germany and USA here. Note that this is the part that would be easy to automate!; dir ^^; dir GERMANY; dir; copy \work.mydoc\Means#1\ByGroup2#1\Summary#1 to ^; copy \work.mydoc\SGPlot#1\ByGroup5#1\SGPlot#1 to ^;

 dir ^^;
 dir USA;
 dir;
 copy \work.mydoc\Means#1\ByGroup3#1\Summary#1 to ^;
 copy \work.mydoc\SGPlot#1\ByGroup6#1\SGPlot#1 to ^;



run;
quit;  *this is one of those run group procedures, need a quit;

Now, you only have to replay the document to get it out the right way.

proc document name=mydoc_new;
 replay;
 run;
quit;

Tada, you have what you want.

score 1 · Answer 3 · answered Aug 07 '14 at 21:18

1

If you're going to run the procs once per by value, that's pretty easy. Create a macro to run just one instance, then use proc sql to create a call for each instance. That is entirely dynamic, and could be easily adjusted to allow for other options such as multiple by variables, levels, etc.

Given a single by value:

*Macro that runs it once;
%macro run_reports(project_name=);
  title "Report for &project_name.";
  proc boxplot data=study;
   plot Lead_Time*Study_ID/ horizontal;
   where Project_Name="&project_name.";
   format Lead_Time dum.;
  run;

  proc report data=study nowd;
   column ID Title Contact Status Message Audience Priority;
   where Project_Name="&project_name.";
  run;
%mend run_Reports;

*SQL pull to create a list of macro calls;
proc sql;
select distinct cats('%run_Reports(project_name=',project_name,')')
  into :runlist separated by ' '
  from study;
quit;

&runlist.;

Turn options symbolgen; on to see what the runlist looks like, or look at your output window (or results window in 9.3+). When you're running this in production, add noprint to proc sql to avoid generating that table.

answered Aug 07 '14 at 21:18

Joe

62,789
6
49
67

I'm familiar with this approach, but I'm still curious as to why you believe looping is worse as you didn't elaborate on that part of your comment in this answer. Perhaps you could edit it to clarify? Looping seems like the natural solution to me when dealing with an unknown number. It has the benefits of simple debugging as well as flexibility while keeping readability (if you want to do more than just call a macro). – Robert Penridge Aug 07 '14 at 23:48
On the contrary; this method is much easier to debug. It's basic modular programming: write a module that executes 1 iteration/step/whatever, then write a module to call that however many times you need. A macro loop is going to be much harder to debug, because you can't as easily run just one iteration (without changing the code, definitely). Here you have a single iteration in a single macro, so you can run that to test the functionality; and you call the macro directly from the data. It removes an unneeded step, basically, which is the macro loop. – Joe Aug 08 '14 at 07:22
Further, this is more readable, as long as you behave with names, since you can separate the macro from the data pull that defines the call. – Joe Aug 08 '14 at 07:23
Just think about it: the same macro loop is the same code, PLUS a macro loop, right? You need the `proc sql` to create the single macro parameter to pass into the macro, you need the above macro for the inner loop of the macro loop, PLUS you add some additional code to run the loop itself, to parse the macro variable, and to properly parametrize the inner loop. It's five or six extra lines of code, plus needing to understand the macro language fairly well. This solution requires very little understanding of macros, less code, and is more modular. – Joe Aug 08 '14 at 07:26
Ok, makes sense and I see where you are coming from now. I'm not convinced on the point you make about not being able to run a single iteration. I believe this is entirely dependent on implementation - for example, if I keep your `%run_reports` macro, and just call it from inside of a loop, I can also call a single execution of it by testing it outside of the loop - exactly the same way you would do it. – Robert Penridge Aug 08 '14 at 14:36
I also don't see having an additional SQL step as a bad thing. Personally I would use the dataset that it creates as an additional debugging resource that shows you what will be looped over (or run). One downside of the above is that you can't see what is going to be run without changing the code. I guess you could use a %put but that could get hard to read... I do agree that unless utility macros are created then yes, looping will be more difficult, but unfortunately SAS did not have the foresight to supply these kinds of useful tools with their software... – Robert Penridge Aug 08 '14 at 14:44
1

You can see what's going to be run by leaving the `noprint` off the sql statement. I usually leave it off until we hit production, as it means I can grab any one iteration's macro call and run it. – Joe Aug 08 '14 at 14:58

Grouping plots and tables using a BY statement in SAS

3 Answers3