Is it possible for levenshtein search to check all words in a search query against an array?

The code is as follows:

    $input = $query;

    // array of words to check against
    $words  = $somearray;

    // no shortest distance found, yet
    $shortest = -1;

    // loop through words to find the closest
    foreach ($words as $word) {

        // calculate the distance between the input word,
        // and the current word
        $lev = levenshtein($input, $word);

        // check for an exact match
        if ($lev == 0) {

            // closest word is this one (exact match)
            $closest = $word;
            $shortest = 0;

            // break out of the loop; we've found an exact match
            break;
        }

        // if this distance is less than the next found shortest
        // distance, OR if a next shortest word has not yet been found
        if ($lev <= $shortest || $shortest < 0) {
            // set the closest match, and shortest distance
            $closest  = $word;
            $shortest = $lev;
        }
    }

    if ($shortest == 0) {
        echo "Exact match found: $closest\n";
    } else {
        echo "Did you mean: $closest?\n";
    }

This code seems to treat either only the first word or the whole sentence as the string to be matched against the array. How is it possible to check every word and display the whole sentence with the corrected words?

Javier Brooklyn
  • Can you give an example value for $input as it is right now? – Ynhockey Feb 03 '13 at 13:48
  • An example of $input is 'The quick brown fox jumps over lazy dog'. It is a sentence, so with the above code only the first word, or maybe the whole sentence, is used to find the closest match, but I need each of these words to be corrected in the sentence. – Javier Brooklyn Feb 03 '13 at 14:24
  • What are you trying to achieve with this? Correct grammatical mistakes or find missing words? – जलजनक Feb 03 '13 at 15:33
  • This is used to correct misspelled words while searching, like "did you mean - someword". – Javier Brooklyn Feb 03 '13 at 17:53

1 Answer

OK, based on what I now understand from your question, you first need to split the sentence into words, for example as described in the question "How can I convert a sentence to an array of words?"
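A minimal sketch of the splitting step, assuming the words are separated by whitespace. The variable names `$input` and `$words` are illustrative, and `PREG_SPLIT_NO_EMPTY` is used so that repeated spaces do not produce empty entries:

```php
<?php
// Assumption: words are separated by one or more whitespace characters.
$input = 'The quick brown fox jumps over lazy dog';

// Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces.
$words = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
```

Punctuation stays attached to the words with this pattern, so a real search box may need a stricter split.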

After that you can compare each word to your dictionary, by looping through the first array and inside that through the second array, for instance:

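    foreach ($words as $word)
    {
        // start from the word's own length, so only closer matches count
        $min_distance = strlen($word); // use mb_strlen() for non-Latin
        $suggestion = $word; // reset for every word; falls back to the word itself
        foreach ($dictionary as $new_word)
        {
            $dist = levenshtein($word, $new_word);
            // before PHP 8.0, levenshtein() returns -1 for strings over 255 characters
            if (($dist < $min_distance) and ($dist > -1))
            {
                $min_distance = $dist;
                $suggestion = $new_word;
            }
        }
    }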

Then, after the inner loop has finished for a word, if $min_distance is greater than 0, offer $suggestion for that word.

Note that this is actually very inefficient! It runs in Θ(n*m) for n query words and m dictionary entries, even assuming that levenshtein() runs in O(1) (in practice its cost grows with the lengths of the two strings), because you need to loop through the entire dictionary for every single word. You probably want to find out how these things are designed in real life, from a conceptual point of view, or at least offer suggestions only for the longer words and loop through only the more relevant parts of the dictionary.
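One cheap way to skip less relevant dictionary entries, offered as a sketch rather than how real spell checkers work: with the default costs, the Levenshtein distance between two strings is never smaller than the difference of their lengths, so a candidate whose length differs too much cannot beat the current best. The helper name `closest_word()` is hypothetical:

```php
<?php
// Hypothetical helper: returns the closest dictionary entry to $word,
// or $word itself if no entry is closer than strlen($word) edits.
function closest_word(string $word, array $dictionary): string
{
    $best = $word;
    $min_distance = strlen($word);
    foreach ($dictionary as $candidate) {
        // The distance is at least the length difference, so a candidate
        // whose length differs by $min_distance or more cannot improve on it.
        if (abs(strlen($candidate) - strlen($word)) >= $min_distance) {
            continue;
        }
        $dist = levenshtein($word, $candidate);
        if ($dist < $min_distance) {
            $min_distance = $dist;
            $best = $candidate;
            if ($dist === 0) {
                break; // exact match, nothing can be closer
            }
        }
    }
    return $best;
}

echo closest_word('Helllo', ['everyone', 'Hello', 'help']), "\n";
```

This only saves calls to levenshtein(); the loop still visits every dictionary entry, so the overall growth with dictionary size is unchanged.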

Ynhockey
  • Well, the code would work, but when a sentence is wrong, is it possible to display the corrected words in the same sentence? For example, for the incorrect sentence "Helllo evryone" it should display "Hello everyone". In the question above, this refers to the place where echo "Did you mean: $closest?\n"; is written. Would that go in a loop, or how is that done? – Javier Brooklyn Feb 04 '13 at 15:23
  • What you're asking to do is possible: just make a new sentence variable and, after the inner foreach loop, concatenate either the word or the suggestion onto it. – Ynhockey Feb 05 '13 at 10:40
  • Well, it would be good if you could give me an idea of it, because I am not very good at loops. – Javier Brooklyn Feb 05 '13 at 12:50
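The approach described in the comments could be sketched roughly as follows. The `$dictionary` contents and the sample `$input` are assumptions for illustration, and the corrected words are collected in an array and joined with `implode()`, which has the same effect as concatenating them onto a sentence variable:

```php
<?php
// Assumptions: $dictionary is an illustrative word list, and the words
// in the query are separated by whitespace.
$dictionary = ['Hello', 'everyone', 'world'];
$input = 'Helllo evryone';

$corrected = []; // collects one output word per input word

foreach (preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY) as $word) {
    $min_distance = strlen($word);
    $suggestion = $word; // fall back to the original word
    foreach ($dictionary as $new_word) {
        $dist = levenshtein($word, $new_word);
        if ($dist < $min_distance) {
            $min_distance = $dist;
            $suggestion = $new_word;
        }
    }
    // after the inner loop: append either the word or its suggestion
    $corrected[] = $suggestion;
}

$sentence = implode(' ', $corrected);

if ($sentence === $input) {
    echo "Exact match found: $sentence\n";
} else {
    echo "Did you mean: $sentence?\n";
}
```

Note that levenshtein() is case-sensitive, so lowercasing both the query and the dictionary first may give better matches.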