-1

I have a table that is being created, we will say that column 1 is YearMonth and column2 as Flag.

YearMonth Flag
200101    1
200102    1        
200103    0
200104    1
200105    1
200106    0
200107    1
200108    0

Note: First entry of flag column will always be 1.

I want to add 1 to flag if current flag doesn't match with previous row value along with one major condition (explained after output for a better explanation).

The output should be:

YearMonth Flag Stage
200101    1    1 
200102    1    1    
200103    0    2
200104    1    3 
200105    1    3
200106    0    4
200107    1    3 
200108    0    4

Please note there are only 4 stages. Hence if a flag is repeated after stage 4, then it should not increment and should give output as either stage=3 if flag=1 or stage=4 if flag=0.

I am trying something like this:


 data one;
 set Query;
 Stage=lag(flag);
 if first.flag then Stage=1;
 if first.flag then Stage=1;
 if flag ne Stage+1 then Stage=Stage+1;
 run;

An explanation of why this code isnt working would be really helpful. Thank you!

Also, I am aware that I am not doing something once it reaches stage 4.

2 Answers2

1

This is essentially counting groups of observations. So use BY group processing. Add the NOTSORTED keyword in the BY statement so SAS doesn't complain that the values aren't sorted. Increment the counter when starting a new group.

data want;
  set have;
  by flag notsorted;
  stage + first.flag;
run;

To add your second criteria you could just add this line

  stage = min(stage,4 - flag);
Tom
  • 47,574
  • 2
  • 16
  • 29
0

The stage value needs to be retained between iterations of the implicit loop. An implicit retain will accompany the use of the sum statement whose syntax is <variable>+<expression>;

Example:

data have; input 
YearMonth Flag; datalines;
200101    1
200102    1
200103    0
200104    1
200105    1
200106    0
200107    1
200108    0
;

data want(drop=maxstage_reached);
  set have;
  
  if flag ne lag(flag) then stage+1;                * increment per rules;

  if stage = 4 then maxstage_reached=1;             * tracking flag (retained);

  if maxstage_reached then stage = 4 - flag;        * special rule;

  retain maxstage_reached;
run;
Richard
  • 25,390
  • 3
  • 25
  • 38
  • Thanks! What if the first entry of flag column not equals to 1? Is there a way I can set something like: if first entry is not 1 (i.e. it will be 0) then stage will be assigned 2 and the same calculation follows it? –  Nov 03 '20 at 21:35
  • 1
    You can add a first if such as `if _n_=1 then stage = 1 - flag;` to set the initial value – Richard Nov 04 '20 at 01:32
  • Is there any reason for the code to not work if the same calculation is repeated for many different userids even if I am using "by" ? I am asking because I just realized for different userids different conditions exists. Sorry for not making this clear while asking the question. –  Nov 04 '20 at 17:22
  • Switch the best answer to @Tom, and use `by id flag notsorted; if first.id then call missing(stage);` to reset stage value at start of each id. – Richard Nov 05 '20 at 01:00