
What is the difference between IF and IF-THEN?

For example, compare the following statement

if type='H' then output;

vs

if type='H';
output;
Elvis
  • The Comparisons section of the SAS documentation on [Subsetting IF Statement](http://support.sas.com/documentation/cdl/en/lestmtsref/67175/HTML/default/viewer.htm#p1cxl8ifdt8u0gn12wqbji8o5fq1.htm) gives details on how the subsetting 'IF' compares to the 'IF-THEN' and 'WHERE' statements. – Amir May 27 '14 at 17:21

2 Answers


An if-then statement conditionally executes code. If the condition is met for a given observation, the statement between the 'then' and the ; is executed; otherwise it is skipped. In your example, since that statement is output, only observations with type 'H' are written to the data set(s) being built by the data step. You can also use an if-then-do statement, as in the following code:

if type = 'H' then do;
i=1;
output;
end;

If-then-do statements conditionally execute code between the do; and the end;. Thus the above code executes i=1; and output; only if type equals 'H'.

An if without a then is a "subsetting if". According to SAS documentation:

A subsetting IF statement tests the condition after an observation is read into the Program Data Vector (PDV). If the condition is true, SAS continues processing the current observation. Otherwise, the observation is discarded, and processing continues with the next observation.

Thus, if the condition of a subsetting if (e.g. type='H') is not met, SAS stops processing that observation, so it never reaches the output at the end of the data step. In your example, only observations where type is 'H' will be output.

In summary, both of your examples produce the same result, but by different means. if type='H' then output; writes only the observations where type is 'H', while if type='H'; output; discards the observations where type is not 'H' before the output; is reached. Note that in the latter you don't need the output; because there is an implicit output at the end of the SAS data step, which is only overridden if the step contains an explicit output; statement.
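To check the claim that both forms give the same result, here is a minimal sketch. The data set foo, its variables type and x, and the output names via_then and via_subset are made up for illustration; they are not from the question.

```sas
/* Hypothetical input: three observations, two of them with type 'H' */
data foo;
  input type $ x;
  datalines;
H 1
A 2
H 3
;
run;

/* IF-THEN: the explicit OUTPUT runs only when the condition is true.
   Because the step contains an explicit OUTPUT, there is no implicit one. */
data via_then;
  set foo;
  if type = 'H' then output;
run;

/* Subsetting IF: observations that fail the condition are dropped,
   and the implicit OUTPUT at the end of the step writes the rest. */
data via_subset;
  set foo;
  if type = 'H';
run;

/* Compare the two results; both should hold the same two 'H' observations */
proc compare base=via_then compare=via_subset;
run;
```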

catquas
    This is mostly correct. Subsetting `if` does not, technically, prevent output, however; and it does do more than simply prevent output in most situations. Specifically, subsetting `if` terminates the current iteration of the data step loop and returns to the top of the data step; similar to `if (...) then return`. This is important because anything that is after the subsetting `if` that is failed does not execute in that row. Subsetting `if` does prevent the automatic output (as it would prevent the non-automatic `output;` above) but doesn't affect earlier `output` statements. – Joe May 18 '14 at 19:48
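A small sketch of the point made in this comment, under assumed names (foo, early_output, type and x are hypothetical). It places one output before the subsetting if and one after it, so the effect of returning to the top of the data step becomes visible.

```sas
/* Hypothetical input: one 'H' observation and one 'A' observation */
data foo;
  input type $ x;
  datalines;
H 1
A 2
;
run;

data early_output;
  set foo;
  output;          /* runs for every observation, before the test */
  if type = 'H';   /* a failing observation returns to the top of the step here */
  x = x + 100;     /* reached only when type is 'H' */
  output;          /* second copy, written only when type is 'H' */
run;

proc print data=early_output;
run;
```

With this input, early_output should contain three observations: the 'H' row twice (x=1 and x=101) and the 'A' row once (x=2). The 'A' row is still written by the first output, even though it fails the subsetting if.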

They're similar but not identical. In a data step, an if without a then is a subsetting statement, and all records not satisfying the condition are dropped. From the documentation:

"Continues processing only those observations that meet the condition of the specified expression."

if-then functions more like the if statement in other languages: it conditionally executes the statement that follows the then. A somewhat contrived example:

data baz;
set foo;
if type = 'H';
x = x + 1;
run;

data baz;
set foo;
if type='H' then x = x + 1;
run;

In both examples x will be incremented by 1 if type = 'H', but in the first data step baz will not contain any observations with type not equal to 'H'.

Nowadays it seems like most subsetting that used to be accomplished with if is done using where.
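For comparison, a where version of the first data step might look like the sketch below. The sample contents of foo and the name baz_where are assumptions added for illustration. Unlike a subsetting if, where selects observations as they are read from the input data set, so it can only test variables that exist in foo, not ones created inside the step.

```sas
/* Hypothetical input data set */
data foo;
  input type $ x;
  datalines;
H 1
A 2
H 3
;
run;

/* WHERE filters observations before they enter the data step */
data baz_where;
  set foo;
  where type = 'H';
  x = x + 1;
run;

/* The same filter written as a data set option on the SET statement */
data baz_where_opt;
  set foo(where=(type = 'H'));
  x = x + 1;
run;
```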

James King