I am using Levenshtein distance, a string metric that measures the difference between two sequences, to find the percentage difference between two strings. I would like a better method that decides whether the strings are similar based on the words they contain.
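For context, here is a minimal sketch of how a Levenshtein-based percentage could be computed. The function names `levenshtein` and `percent_difference` are hypothetical, and dividing by the length of the longer string is only one assumed way to turn the distance into a percentage.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    # Make b the shorter string so the row stays as small as possible.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def percent_difference(a: str, b: str) -> float:
    """Edit distance as a percentage of the longer string's length (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # two empty strings are treated as identical
    return 100.0 * levenshtein(a, b) / longest
```

Because the distance counts character edits, a string that is missing a whole paragraph scores as very different even when the remaining text matches exactly.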
For example, let's say the first string contains two paragraphs and the second string contains only the second paragraph of the first string.
I know I could compare the first word of each string, then the second, and so on, but that wouldn't work in a case like the example above, because the matching words would sit at different positions in the two strings.
I was thinking of comparing the first word of the first string with every word of the second string, and so on for each word, but I am afraid this would make the process very slow.
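To make the idea concrete, the word-against-all-words comparison might look like the sketch below. It is only an illustration under assumptions: the names `words` and `word_containment` are hypothetical, words are taken to be lowercase runs of `\w` characters, and the score is the share of the shorter text's words that also occur in the longer one. Counting words in a `Counter` (a hash map) means each word is looked up once instead of being compared against every other word.

```python
import re
from collections import Counter


def words(text: str) -> list[str]:
    """Split text into lowercase word tokens (assumed definition of a word)."""
    return re.findall(r"\w+", text.lower())


def word_containment(a: str, b: str) -> float:
    """Fraction of the shorter text's words that also appear in the other text."""
    count_a, count_b = Counter(words(a)), Counter(words(b))
    # Measure against the text with fewer words.
    if sum(count_a.values()) <= sum(count_b.values()):
        small, large = count_a, count_b
    else:
        small, large = count_b, count_a
    total = sum(small.values())
    if total == 0:
        return 0.0  # no words to compare
    # Counter intersection keeps the minimum count of each shared word.
    shared = sum((small & large).values())
    return shared / total
```

With this measure, a string that consists only of the second paragraph of another string scores 1.0 against it, regardless of where that paragraph sits. Word order is ignored, so it is a rough signal rather than a full similarity test.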