0

I am working in SAS Enterprise guide and am running a proc sql query as follows:

proc sql; 
    CREATE TABLE average_apples AS
    SELECT farm, size, type, mean(apples) as average_apples
    FROM input_table
    GROUP BY farm, size, type
    ;
quit; 

For some of the data sets I am running this query on there are groups which have no observations assigned to them, so there is no entry for them in the query output.

How can I force this query to return a row for each of my groups (for example with a value of 0in the apples column?

Thanks up front for the help!

Martin Reindl
  • 989
  • 2
  • 15
  • 33
  • It will be easier to answer if you provide some example source data. It looks like you have a data set that stores the value of fruit (e.g. apples) in separate columns rather than having a column that identifies the fruit and a separate value column (which is a more normal structure). Are you saying that for some data sets the column referenced in the query doesn't exist? – Longfish Mar 10 '17 at 09:10
  • 1
    Do you have a dataset that has all groups (i.e. All combinations of farm-size-type) that you would want in the output? If so, you can join that to your output table. – Quentin Mar 10 '17 at 09:53

2 Answers2

0

I'd do this:

/* sample input table */
data input_table;
length farm size type $3 apples 8;
stop; /* try also with this statement commented out 
         to check the result for non-empty input table */
run;

proc sql; 
    CREATE TABLE average_apples AS
    SELECT farm, size, type, mean(apples) as average_apples
    FROM input_table
    GROUP BY farm, size, type
    ;
quit;

%let group_rows = &SQLOBS;
%put &group_rows;

data average_apples_blank;
if &group_rows ne 0 then set average_apples(obs=0);
else do;
   array zeros {*} _numeric_ /* or your list of variables */;
   do i=1 to dim(zeros);
      zeros[i] = 0;
   end;
   output; /* empty row */
end;
drop i;
run;


proc append base=average_apples data=average_apples_blank force;
run;
vasja
  • 4,732
  • 13
  • 15
0

Try this

proc sql;
select f.farm, s.size, t.type, coalesce(mean(apples), 0) as average_apples
from (select distinct farm from input_table) as f
   , (select distinct size from input_table) as s
   , (select distinct type from input_table) as t
left join input_table as i
  on i.farm = f.farm and i.size = s.size and i.type t.type;
quit;

I did not test it, though. It it does not work, put this in a comment and I will debug it.

Dirk Horsten
  • 3,753
  • 4
  • 20
  • 37
  • Hi Dirk. When I run this, I get the following error: `ERROR: correlated reference to column farm is not contained within the subquery`. – Martin Reindl Mar 20 '17 at 16:25