16

Is there a way to check how many observations are in a SAS data set at runtime OR to detect when you've reached the last observation in a DATA step?

I can't seem to find anything on the web for this seemingly simple problem. Thanks!

chucknelson
  • 2,328
  • 3
  • 24
  • 31

4 Answers4

20

The nobs= option to a set statement can give you the number of observations. When the data step is compiled, the header portion of the input datasets are scanned, so you don't even have to execute the set statement in order to get the number of observations. For instance, the following reports 2 as expected:

/* a test data set with two observations and no vars */
data two;
  output;
  output;
run;

data _null_;
  if 0 then set two nobs=nobs;
  put nobs=;
run;
/* on log
nobs=2
*/

The end= option sets a flag when the last observation (for the set statement) is read in.

A SAS data set, however, can be a SAS data file or a SAS view. In the case of the latter, the number of observations may not be known either at compile time or at execution time.

data subclass/view=subclass;
  set sashelp.class;
  where sex = symget("sex");
run;

%let sex=F;
data girls;
  set subclass end=end nobs=nobs;
  put name= nobs= end=;
run;
/* on log
Name=Alice nobs=9.0071993E15 end=0
Name=Barbara nobs=9.0071993E15 end=0
Name=Carol nobs=9.0071993E15 end=0
Name=Jane nobs=9.0071993E15 end=0
Name=Janet nobs=9.0071993E15 end=0
Name=Joyce nobs=9.0071993E15 end=0
Name=Judy nobs=9.0071993E15 end=0
Name=Louise nobs=9.0071993E15 end=0
Name=Mary nobs=9.0071993E15 end=1
*/
Chang Chung
  • 2,307
  • 1
  • 17
  • 16
  • Thanks for the multitude of options in here, I ended up going w/ the END option for now. Great to see a whole variety of ways to do this :) – chucknelson Sep 29 '09 at 12:42
10

You can also use %sysfunc(attrn( dataset, nlobs)) though it is limited to SAS data sets (i.e. not data views). Credit for the macro to this SUGI paper, which also give great information regarding good macro design.

You can get all sorts of other character and numeric information on a SAS data set.

See the documentation on attrn and attrc.

%macro numobs (data=&syslast ) ;
/* --------------------------------------------
Return number of obs as a function
--------------------------------------------
*/
%local dsid nobs rc;
%let data = &data ; /* force evaluation of &SYSLAST */
%let dsid=%sysfunc(open(&data));
%if &dsid > 0 %then
%do ;
   %let nobs=%sysfunc(attrn(&dsid,nlobs));
   %let rc=%sysfunc(close(&dsid));
%end ;
%else
   %let nobs = -1 ;
&nobs
%mend numobs;
cmjohns
  • 4,465
  • 17
  • 21
7

Find the number of observations in a SAS data set:

proc sql noprint;
  select count(*) into: nobs
  from sashelp.class
  ;
quit;

data _null_;
  put "&nobs";
run;

The SQL portion counts the number of observaions, and stores the number in a macro variable called "nobs". The data step puts the number for display, but you can use the macro variable like any other.

Performing a certain action when the last observation is processed:

data _null_;
  set sashelp.class end=eof;
  if eof then do;
     put name= _n_=;
  end;
run;

The "end" option to the "set" statement defines a variable (here "eof" for end-of-file) that is set to 1 when the last observation is processed. You can then test the value of the variable, and perform actions when its value is 1. For more info, see the documentation for the "set" statement.

Martin Bøgelund
  • 1,681
  • 1
  • 17
  • 26
  • Thanks for this - having that macro variable option is definitely handy to know :) – chucknelson Sep 29 '09 at 14:10
  • The benefit of these approaches is that they are simple. The downside is that they are sloooow when run on datasets with a large number of rows as SAS actually needs to iterate through the dataset to get a result. A better technique is to query the metadata using a technique like cmjohns displays below. – Robert Penridge Jan 05 '12 at 09:40
  • @RobPenridge Querying Metadata is sure a good way to go about it. Be aware that not all data engines support it, so the most general approach is to count. Detecting last observation is often useful, and here a metadata query-approach would be less elegant, given the automatic eof variable. – Martin Bøgelund Jan 07 '12 at 18:54
  • @MartinBøgelund Good points, but provided your SAS environment is setup effectively I think a metadata approach is the most elegant. This could be in the form of a macro to retrieve the number of observations which can be written once and used anywhere... Depending on the user's situation, the macro could automatically check the engine and also select the best approach to query it. For example, a `select count(*) from table` in MYSQL automatically retrieves the result from MYSQL metadata rather than counting rows. – Robert Penridge Jan 08 '12 at 02:28
  • Also, does the `eof` approach work with datasets that contain 0 rows? I don't have SAS in front of me now so can't test it. – Robert Penridge Jan 08 '12 at 02:29
5
data hold;   
  set input_data end=last;    
    .    
    .   
    .   
  if last then do;   
    .   
    .   
    .   
  end;  
run;
Triad sou.
  • 2,969
  • 3
  • 23
  • 27