Is there a way in SAS to flag records that have values after a 'certain value' across columns?

Question

data have;
    length col1-col10 $2.;
    input col1-col10;
    datalines;
PM MM JM MM PM MM MM PB . .
MM JM PM MB . . . . . .
MM MM PM JM PB MM . . . .
PM PM PM MM MB MM JM . . .
;
run;

Goal: Hi all,

My goal here is that I want to flag all records which have values other than 'PB' or 'MB' after them in sequence across the columns.

Any help is appreciated. Thanks!

Expected output:

                           Flag
PM MM JM MM PM MM MM PB . .  0
MM JM PM MB . . . . . .      0
MM MM PM JM PB MM . . . .    1
PM PM PM MM MB MM JM . . .   1

Can you post your expected output? It's difficult to understand what you're looking for. — Stu Sztukowski, Mar 16 '23 at 17:22
Hi Stu, I have added my expected output. Let me know if you need more details. — Sai Paritala, Mar 16 '23 at 17:32

Richard · Answer 1 · 2023-03-17T15:35:21.663

After scratching the memory banks I remembered that naming syntax can list variables in reverse order!

I read the manuals way...

data work.want;
  set work.have;
  flag = cats(of col10-col1) not in: ('PB', 'MB');
run;

Seems like an obvious way...

If all rows have at least one PB or MB you can loop from end of array to 1 and flag if first value is not PB or MB.

If a row can be without either PB or MB you would have to check for that as well.

Example:

Leave checking loop as soon as first value encountered is evaluated for flag condition.

data want;
  set have;
  array col col1-col10;
  do i = dim(col) to 1 by -1;
    if not missing(col(i)) then do;
      flag = col(i) not in ('PB', 'MB');
      leave;
    end;
  end;

score 1 · Accepted Answer · answered Mar 16 '23 at 18:24

You can use one array and whichc to find the index of where MB or PB is in the columns and then you can find the first non missing column index and then make a flag off these two variables. This will work if all the missing is at the end (as coded in your example).

data want1;
set have;
array s(*)  col1-col10 ;
max=max(whichc('PB', of s[*]),whichc('MB', of s[*])) ;
max2=whichc(' ', of s[*])-1 ;
if max2 > max then fl = 1;
else fl =0;
run;

Is there a way in SAS to flag records that have values after a 'certain value' across columns?

2 Answers2