0

I work with big tables with hundreds of columns. And I keep on running into a problem where multiple columns used as flags (having either a value of 1 or missing) need to be checked for the same condition. For a simplified example,

proc sql;
    create table Applications as
    select t1.ApplicantID
         , t1.ApplicationDate
         , t2.Segment
    from Applicants as t1 inner join
         Segmentation as t2 ON t1.ApplicantID = t2.ApplicantID
    where t1.ApplicantID ^= ''
          AND (t1.TR_flag = '') AND (t1.TL_flag = '') 
          AND (t1.RU_flag = '') AND (t1.RP_flag = '') 
          AND (t2.OP_flag = '') AND (t2.OU_flag = '')
          AND (t2.GG_flag = '') AND (t2.GK_flag = '')
    order by t1.ApplicantID
;
quit;

Hence, I was wondering if there is any clean and short way to check for this condition for all the flag variables in one step. I do realize I could go for a loop with an array option but it seems too complicated for such a simple problem. I could also concatenate all the variables into one string and use the condition on the new string but this solution is not universal. Basically what I am looking for is something like the IN operator but going into an opposite direction, for imaginary example,

    where t1.ApplicantID ^= ''
          AND (t1.TR_flag, t1.TL_flag, t1.RU_flag, t1.RP_flag,
               t2.OP_flag, t2.OU_flag, t2.GG_flag, t2.GK_flag) = ''

which would be easier on the eye and in best case scenario faster to run.

Krans
  • 3
  • 4

1 Answers1

0

You can use CMISS, along with the number of arguments you require to be missing:

CMISS (t1.TR_flag, t1.TL_flag, t1.RU_flag, t1.RP_flag,
               t2.OP_flag, t2.OU_flag, t2.GG_flag, t2.GK_flag) = 8

For specific values, and not interfering with the join:

(t1.TR_flag = t1.TL_flag = t1.RU_flag = t1.RP_flag = '1'
AND
t2.OP_flag = t2.OU_flag = t2.GG_flag = t2.GK_flag = '1')

Still, I can't see this being any faster to run.

mjsqu
  • 5,151
  • 1
  • 17
  • 21
  • Great solution for my specific problem, thanks. However, it wouldn't work in situations where one would be dealing with flags of 1s and 0s or any other non missing value. Have any ideas for that? – Krans Nov 10 '15 at 10:37
  • For 0s and 1s you can use MAX or MIN depending on your needs. – vasja Nov 10 '15 at 10:54
  • Only if they are 0 and 1 (numeric), not '0' and '1' (character), or you'll see: `ERROR: Function MIN requires a numeric expression as argument` – mjsqu Nov 10 '15 at 11:02