1

Looking to automate some checks and print some warnings to a log file. I think I've gotten the general idea but I'm having problems generalising the checks.

For example, I have two datasets my_data1 and my_data2. I wish to print a warning if nobs_my_data2 < nobs_my_data1. Additionally, I wish to print a warning if the number of distinct values of the variable n in my_data2 is less than 11.

Some dummy data and an attempt of the first check:

%LET N = 1000;
DATA my_data1(keep = i u x n);
a = -1; 
b = 1;
max = 10;
do i = 1 to &N - 100;
  u = rand("Uniform");                     /* decimal values in (0,1)    */
  x = a + (b-a) * u;                       /* decimal values in (a,b)    */
  n = floor((1 + max) * u);                /* integer values in 0..max   */
  OUTPUT;
END;
RUN;

DATA my_data2(keep = i u x n);
a = -1; 
b = 1;
max = 10;
do i = 1 to &N;
  u = rand("Uniform");                     /* decimal values in (0,1)    */
  x = a + (b-a) * u;                       /* decimal values in (a,b)    */
  n = floor((1 + max) * u);                /* integer values in 0..max   */
  OUTPUT;
END;
RUN;

DATA _NULL_;
FILE "\\filepath\log.txt" MOD;
SET my_data1 NOBS = NOBS1 my_data2 NOBS = NOBS2 END = END;
IF END = 1 THEN DO;
PUT "HERE'S A HEADER LINE";
END;
  IF NOBS1 > NOBS2 AND END = 1 THEN DO;
  PUT "WARNING!";
  END;
IF END = 1 THEN DO;
PUT "HERE'S A FOOTER LINE";
END;
RUN;

How can I set up the check for the number of distinct values of n in my_data2?

Ross
  • 521
  • 1
  • 4
  • 16
  • 2
    Tip. If you use `put "WARNING: Whatever you like here";` then it will actually display in the SAS log in green like a regular warning message. Likewise if you swap out warning for error, it will show in red. – Robert Penridge Apr 24 '18 at 14:55

2 Answers2

2

A proc sql way to do it -

%macro nobsprint(tab1,tab2);
options nonotes; *suppresses all notes;

proc sql;
select count(*) into:nobs&tab1. from &tab1.;
select count(*) into:nobs&tab2. from &tab2.;
select count(distinct n) into:distn&tab2. from &tab2.;
quit;

%if &&nobs&tab2. < &&nobs&tab1. %then %put |WARNING! &tab2. has less recs than &tab1.|;

%if &&distn&tab2. < 11 %then %put |WARNING! distinct VAR n count in &tab2. less than 11|;

options notes; *overrides the previous option;    
%mend nobsprint;

%nobsprint(my_data1,my_data2);

This would break if you have to specify libnames with the datasets due to the .. And, you can use proc printto log to print it to a file.

For your other part as to just print the %put use the above as a call -

filename mylog temp;
proc printto log=mylog; run;

options nomprint nomlogic;
%nobsprint(my_data1,my_data2);

proc printto; run;

This won't print any erroneous text to SAS log other than your custom warnings.

samkart
  • 6,007
  • 2
  • 14
  • 29
  • Could you clarify the printing please? I don't want to print the whole SAS log to a file, just the things following a %PUT. PROC PRINTTO LOG = "myfilepath\log.txt"; does not print the warnings to the file. Additionally, I want to append to the log file, similar to how specifying the MOD option with a FILE statement would. – Ross Apr 24 '18 at 12:57
  • with `proc printto log` you can append the SAS log to your file. Unless you specify the option `new`, it won't create a new file. `PROC PRINTTO LOG = "myfilepath\log.txt"; run;` appends to the existing file, whereas `PROC PRINTTO LOG = "myfilepath\log.txt" new; run;` overwrites the existing file. – samkart Apr 24 '18 at 13:34
  • Additionally you can use `options nonotes` at the beginning of the macro to suppress all notes in your SAS log. So, things to read -- `options nonotes`, `proc printto log `. After the macro just include `options notes` to default all notes and warnings to SAS log – samkart Apr 24 '18 at 13:35
  • kindly refer [this](https://stackoverflow.com/questions/47567424/sas-ods-query-statement-print-along-with-its-output) for `proc printto log` info – samkart Apr 24 '18 at 13:40
0

@samkart provided perhaps the most direct, easily understood way to compare the obs counts. Another consideration is performance. You can get them without reading the entire data set if your data set has millions of obs.

One method is to use nobs= option in the set statement like you did in your code, but you unnecessarily read the data sets. The following will get the counts and compare them without reading all of the observations.

62         data _null_;
63         if nobs1 ne nobs2 then putlog 'WARNING: Obs counts do not match.';
64         stop;
65         set sashelp.cars nobs=nobs1;
66         set sashelp.class nobs=nobs2;
67         run;

 WARNING: Obs counts do not match.

Another option is to get the counts from sashelp.vtable or dictionary.tables. Note that you can only query dictionary.tables with proc sql.

floydn
  • 1,083
  • 7
  • 14
  • Hi @floydn, thanks for this! Interestingly this does stem from a load of work involving the dictionary.tables table - however, whenever I try to select nobs it takes an eternity to run! I try to only select libname, memname and nobs from dictionary.tables, and filter to the libraries I'm interested in (so roughly ~3000 tables total) and it takes too long. Any ideas? – Ross Apr 25 '18 at 12:37
  • 1
    Use SQL and query DICTIONARY.TABLES directly instead of using SASHELP.VTABLE view. SAS does not do a good job of pushing the where clause through when using the view. Also use constants in where clause. LIBNAME and MEMNAME are uppercase. `from dictionary.tables where libname='SASHELP' and memname in ('CARS','CLASS')` – Tom Apr 25 '18 at 12:50