I have a list with some fuzzy variables and a list with some clean variables. I want to want to (fuzzy) match both of them where the compged distance is lowest and some rules regarding their edit distances are satisfied (f.ex. compged < 100 and spedis < 50). I tried the following code
proc sql noprint;
create table out as
select min(compged(fuzzy.fuzzy_title,clean.cleaned_title,100))
as comp,
fuzzy.fuzzy_title, clean.cleaned_title
from fuzzy inner join clean
on (compged(fuzzy.fuzzy_title,clean.cleaned_title,100) < 100 and
spedis(clean.cleaned_title,fuzzy.fuzzy_title) < 50);
quit;
The datasets fuzzy and clean basically just contain the titles that I want to match. The code I use just gives me the minimum compged score of the whole dataset and then some arbitrary match where my condition regarding the distances are satisfied. Is there a way to choose exactly the clean_title with the minimum compged score for a given fuzzy_title? I might have searched wrong, but I couldn't find the answer to this.