In writing a very simple DATA step to start a new project, I ran across some strange behavior.
The only difference between set1 and set2 is the use of the variable lagscore in the equation in set1 vs. dummy in the equation in set2.
set1 produces output that appears to indicate that including lagscore causes the score and lagscore variables to be undefined in half of the iterations.
Note that I was careful to NOT call lag() more than once and I include the call in set2 just to make sure that the lag() function call is not the source of the problem.
I appreciate any explanations. I've been away from SAS for quite awhile and I sense that I am missing something fundamental in how the processing occurs.
(Sorry for the difficult to read output. I could not figure out how to paste it and retain spacing)
data set1;
obs=1;
score=500;
a_dist = -5.0;
b_dist = 0.1;
dummy = 0;
output;
do obs = 2 to 10;
lagscore = lag(score);
score = lagscore + 1 /(b_dist * lagscore + a_dist);
output;
end;
run;
data set2;
obs=1;
score=500;
a_dist = -5.0;
b_dist = 0.1;
dummy = 0;
output;
do obs = 2 to 10;
lagscore = lag(score);
/* score = lagscore + 1 /(b_dist * lagscore + a_dist);*/
score = dummy + 1 /(b_dist * dummy + a_dist);
output;
end;
run;`
Set1 results
obs score a_dist b_dist dummy lagscore
1 500 -5 0.1 0 .
2 . -5 0.1 0 .
3 500.02 -5 0.1 0 500
4 . -5 0.1 0 .
5 500.04 -5 0.1 0 500.02
6 . -5 0.1 0 .
7 500.06 -5 0.1 0 500.04
8 . -5 0.1 0 .
9 500.08 -5 0.1 0 500.06
10 . -5 0.1 0 .
Set2 results
obs score a_dist b_dist dummy lagscore
1 500 -5 0.1 0 .
2 -0.2 -5 0.1 0 .
3 -0.2 -5 0.1 0 500
4 -0.2 -5 0.1 0 -0.2
5 -0.2 -5 0.1 0 -0.2
6 -0.2 -5 0.1 0 -0.2
7 -0.2 -5 0.1 0 -0.2
8 -0.2 -5 0.1 0 -0.2
9 -0.2 -5 0.1 0 -0.2
10 -0.2 -5 0.1 0 -0.2