1

I want to sort results based on the number of matching words in two strings (case insensitive). Is there a way given two strings to count the number of matching words that appear in both strings?

Example:

"Red blue black green", "Green Black Blue"
-> 3
"Blue Green", "green blue"
-> 2
"Blue Green", "Red blue"
-> 1
"green blue black", "orange purple"
-> 0

I want to use this in an Order By clause. Given that a table has a column with a string in it, I'll query that table, then order the results based on the rows with the most matching words in the passed in string.

Peter R
  • 3,185
  • 23
  • 43

1 Answers1

1

You can split both phrases to words with regexp_split_to_table(), and then count the matches. A lateral join comes handy for this.

select t.*, x.cnt_matches
from (values 
    ('Red blue black green', 'Green Black Blue'),
    ('Blue Green',           'green blue'),
    ('Blue Green',           'red blue'),
    ('green blue black',     'orange purple')
) as t(str1, str2)
cross join lateral (
    select count(*) cnt_matches
    from regexp_split_to_table(lower(t.str1), ' ') w1(word)
    inner join regexp_split_to_table(lower(t.str2), ' ') w2(word)
        on w1.word = w2.word
) x

Demo on DB Fiddle:

str1                 | str2             | cnt_matches
:------------------- | :--------------- | ----------:
Red blue black green | Green Black Blue |           3
Blue Green           | green blue       |           2
Blue Green           | red blue         |           1
green blue black     | orange purple    |           0
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Is there a way to do this in a order by clause? In my table there's a field that has a string, and when I do my query I want to pass a string in, and then order the results by the number of matching words on the table's column. – Peter R Nov 02 '20 at 11:19
  • @PeterR: that's the same logic. You can use `cnt_matches` in the `order by` clause as well. – GMB Nov 02 '20 at 11:20