in SAS how to to use a contain (or alternative) operator when you have more than one set of letters to choose. eg where have_variable= abd, afg, afd, acc and want_variable=abd, afg, afd (containing ab or af only)
Asked
Active
Viewed 119 times
3 Answers
2
I've Split your have and want list into two tables with multiple records then left joined on Have list to find the matching ones.
The final table will look like this
/* Create your input String */
data Have;
have="abd , afg , afd , acc";
run;
data Want ;
want="abd , afg , afd";
run;
/* Splint Input strings into Multiple Rows */
data Have_List;
set Have;
do i=1 by 0;
source=lowcase(scan(have,i,','));
if missing(source) then leave;
output;
i+1;
end;
keep source ;
run;
data Want_List;
set Want;
do i=1 by 0;
lookup=lowcase(scan(want,i,','));
if missing(lookup) then leave;
match='match';
output;
i+1;
end;
keep lookup match;
run;
/* Create a SQL left join to lookup the matching values */
proc sql;
create table match as
select h.source as have , COALESCE(w.match,"no-match") as match
from have_list h left join want_list w on h.source=w.lookup;
quit;

momo1644
- 1,769
- 9
- 25
0
You can use a list in your select statement.
Like that :
proc sql;
select * from my_table where have_variable in ('abd','afg','afd','acc') and want_variable in ('abd','afg','afd');
run;
quit;
You can even use the in operator in a dataset statement like this :
data want;
set mydate;
if have_variable in ('abd','afg','afd','acc') and
want_variable in ('abd','afg','afd');
run;
If you want to obtain the variable only containing 2 letters you can use the LIKE :
proc sql;
select * from my_table where have_variable like '%ab%' or have_variable like '%af%';
run;
in a dataset :
data want;
set mydate;
where have_variable like '%ab%' or
have_variable like '%af%';
run;
Regards

Thogerar
- 339
- 1
- 7
0
If you only want records that begin with ab or af (rather than contains them anywhere in the string), then you can you in
followed by :
. With this usage, the colon instructs SAS to only search the first n letters in the string, where n is length of the comparison (2 in your example).
Note that this only works in a datastep, not proc sql
.
data have;
input have_var $;
datalines;
abd
afg
afd
acc
;
run;
data _null_;
set have;
where have_var in: ('ab','af');
put _all_;
run;

Longfish
- 7,582
- 13
- 19