
I am trying to search for a keyword in a description field (descr) and, if one is found, flag that record as a match (which keyword it matches is not important). My problem is that the do loop goes through all entries of the array. I am not sure whether this is because my do loop is incorrect or because my index call is incorrect.

data JE.KeywordMatchTemp1;
  set JE.JEMasterTemp;
  if _n_ = 1 then do;
    do i = 1 by 1 until (eof);
      set JE.KeyWords end=eof;
      array keywords[100] $30 _temporary_;
      keywords[i] = Key_Words;
    end;
  end;
  match = 0;
  do i = 1 to 100 until(match=1);
    if index(descr, keywords[i]) then match = 1;
  end;
  drop i;
run;
  • Sure, it's going through all entries of the array, because that's what you told it to do. What do you want it to do? Are you looking to exit the loop prematurely if a match is found? – Joe Nov 14 '16 at 17:45
  • Oops, edited the code, must have had an old version in my clipboard. I have a do until rather than a do. – Tom Anderson Nov 14 '16 at 20:14

2 Answers


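Add another condition to your DO loop so that it terminates as soon as a match is found. You may also want to remember how many entries were loaded into the array, so the loop does not scan unused elements. Also make sure to use the INDEX() function properly: trim the keyword so that its trailing blanks are not part of the search string.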

data JE.KeywordMatchTemp1;
  if _n_ = 1 then do;
    do i = 1 by 1 until (eof);
      set JE.KeyWords end=eof;
      array keywords[100] $30 _temporary_;
      keywords[i] = Key_Words;
    end;
    last_i = i ;
    retain last_i ;
  end;
  set JE.JEMasterTemp;
  match = 0;
  do i = 1 to last_i while (match=0) ;
    if index(descr, trim(keywords[i]) ) then match = 1;
  end;
  drop i last_i;
run;
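
To see why the TRIM() call matters, here is a minimal sketch. The variable names and the sample description are made up for illustration; only the $30 keyword length is taken from the code above. A $30 character variable is padded with trailing blanks, so without TRIM() the search string is 'auto' followed by 26 blanks.

```sas
/* Illustrative values only: descr and keyword are hypothetical test data. */
data _null_;
  length descr $50 keyword $30;
  descr   = 'State Farm auto insurance';
  keyword = 'auto';                        /* stored as 'auto' plus 26 blanks */
  padded  = index(descr, keyword);         /* 0: the padded keyword is not in descr */
  trimmed = index(descr, trim(keyword));   /* 12: 'auto' starts at position 12 */
  put padded= trimmed=;
run;
```

The log should show padded=0 and trimmed=12, which is the same behaviour that made the original loop never set match.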
Tom
  • So this partially works; however, it still never matches (I have checked the fields and they definitely contain keywords). When I don't drop i and last_i, every row has 46 and 47 as values. – Tom Anderson Nov 14 '16 at 20:25
  • So your question is really how to use the INDEX() function? Most likely your KEYWORDS are shorter than 30 characters, so they are not matching because the extra trailing spaces are not in the DESCR variable value. Add TRIM() or use another function like FINDW() instead. – Tom Nov 14 '16 at 20:44
  • Trim did it!! Thank you very much. – Tom Anderson Nov 14 '16 at 20:57

You have two problems, both of which would be easy to see in a small, compact example (suggestion: include an example like this in your question in the future).

data partials;
  input keyword $;
  datalines;
home
auto
car
life
whole
renter
;;;;
run;

data master;
  input @1 description $50.;
  datalines;
Mutual Fund
State Farm Automobile Insurance
Checking Account
Life Insurance with Geico
Renter's Insurance
;;;;
run;

data want;
  set master;
  array keywords[100] $ _temporary_;
  if _n_=1 then do;
    do _i = 1 by 1 until (eof);
      set partials end=eof;
      keywords[_i] = keyword;
    end;
  end;
  match=0;
  do _m = 1 to dim(keywords) while (match=0 and keywords[_m] ne ' ');
    if find(lowcase(description),lowcase(keywords[_m]),1,'t') then match=1;
  end;
run;

Two things to look at here. First, notice the addition to the while condition: it guarantees we never try to match " " (which will always match if you have any spaces in your strings). Second, the t modifier in find trims trailing blanks from both arguments; otherwise it looks for "auto    " (padded to the variable's length) instead of "auto". Note that you have to add the 1 for the start position, as for some reason the alternate version without it doesn't work, at least for me.
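
As a rough sketch of what the t modifier changes, the step below runs FIND() on a single row from the example data. The result variable names are made up for illustration, and the 'it' combination (i to ignore case, t to trim) is an optional variation I am assuming you might prefer over the two lowcase() calls; it is not part of the code above.

```sas
/* Illustrative only: no_trim, with_trim and both_mods are hypothetical names. */
data _null_;
  length description $50 keyword $8;
  description = 'State Farm Automobile Insurance';
  keyword     = 'auto';                    /* stored as 'auto' plus 4 blanks */
  no_trim   = find(lowcase(description), lowcase(keyword));          /* 0  */
  with_trim = find(lowcase(description), lowcase(keyword), 1, 't');  /* 12 */
  both_mods = find(description, keyword, 1, 'it');                   /* 12 */
  put no_trim= with_trim= both_mods=;
run;
```

Without the modifier the padded keyword is followed by blanks that do not appear after "Auto" in the description, so FIND() returns 0; with trimming it returns the starting position of the match.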

Joe