
As we know, we can find the closest word with levenshtein(), for example:

<?php
$subj = "hello world";
$str = array();
$str[] = "hallo";
$str[] = "helo";

$minStr = "";
$minDis = PHP_INT_MAX;
foreach ($str as $curStr) {
    $dis = levenshtein($subj, $curStr);
    if ($dis < $minDis) {
        $minDis = $dis;
        $minStr = $curStr;
    }
}
echo($minStr);

The output is hallo, but I want the reverse: to find the closest correct word for each incorrect word. For example, for hallo and helo I want to find hello, the correct word in $subj (taken from a dictionary, for example), and return it as output. hallo and helo are typed by the end user, and hello is stored on the server as the correct word.

How can I do that?

DolDurma

1 Answer


I think I understand your question.

Here I explode the subject into words and loop over the subject words and the $str words with nested foreach loops.
The return value of levenshtein() is stored in an array keyed first by the subject word, then by the distance, with a subarray of all words that are at that distance from the subject word.

$subj = explode(" ", "hello world");

$str = ["hallo", "helo", "aaahelojjjj", "pizza", "Manhattan"];

$dist = [];
foreach ($str as $curStr) {
    foreach ($subj as $word) {
        // Group each candidate under the subject word and its distance.
        $dis = levenshtein($word, $curStr);
        $dist[$word][$dis][] = $curStr;
    }
}
// Optional: sort each word's subarray by distance (ascending keys).
foreach ($dist as &$arr) {
    ksort($arr);
}
unset($arr);
var_export($dist);

output:

(unsorted)
array (
  'hello' => //word
  array (
    1 =>     // $key is the levenshtein() result (distance from the word)
    array (  // values are the words at distance $key from the word
      0 => 'hallo', // both of these words are at distance 1 from 'hello'
      1 => 'helo',
    ),
    8 => 
    array ( // these words are at distance 8 from 'hello'
      0 => 'aaahelojjjj',
      1 => 'Manhattan',
    ),
    5 => 
    array (
      0 => 'pizza',
    ),
  ),
  'world' =>  // here is how far each word is from 'world'
  array (
    4 =>  
    array (
      0 => 'hallo', // both hallo and helo are at distance 4 from 'world'
      1 => 'helo',
    ),
    10 => 
    array (
      0 => 'aaahelojjjj',
    ),
    5 => 
    array (
      0 => 'pizza',
    ),
    9 => 
    array (
      0 => 'Manhattan',
    ),
  ),
)

https://3v4l.org/OVp7J
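
If the goal is the reverse lookup described in the question (map each typed word to the closest correct word), the same levenshtein() comparison can be run once per typed word against the dictionary. The following is only a minimal sketch: closestWord() is a hypothetical helper name, and the dictionary is assumed to be just the two words from $subj. Ties are resolved in favor of the first dictionary entry, which may not be what you want for a real dictionary.

```php
<?php
// Hypothetical helper (not part of the code above): returns the dictionary
// entry with the smallest Levenshtein distance to $typed, or null when the
// dictionary is empty. On a tie, the entry that appears first wins.
function closestWord(string $typed, array $dictionary): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;
    foreach ($dictionary as $candidate) {
        $distance = levenshtein($typed, $candidate);
        if ($distance < $bestDistance) {
            $bestDistance = $distance;
            $best = $candidate;
        }
    }
    return $best;
}

// Assumption: the correct words stored on the server.
$dictionary = explode(" ", "hello world");

// Words as typed by the end user.
foreach (["hallo", "helo"] as $typed) {
    echo $typed, " => ", closestWord($typed, $dictionary), PHP_EOL;
}
// prints:
// hallo => hello
// helo => hello
```

Both typed words are at distance 1 from hello and distance 4 from world, so hello is returned for each.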

Andreas
  • Thanks, but I think it's not correct. Basically, when we search with misspelled words on Google, it can find the correct words and search for them. Do you understand me? – DolDurma May 23 '18 at 13:55
  • Yes I understand. But you need to get the lowest array value of each word subarray. https://3v4l.org/0aSIG – Andreas May 23 '18 at 14:18
  • That finds the incorrect word closest to the correct word, not the correct word closest to the incorrect words. For example, when the user types `hallo`, this solution must return `hello` as output. Is that right? – DolDurma May 23 '18 at 14:29
  • It's all there you just have to put some effort in it yourself! You have not expressed anything about how you want your output in your question. So how am I supposed to know what you want? Last chance! I think I have spelled it out for you now, if this is not what you want then I give up. Because your question is not giving me any clues to what you want. https://3v4l.org/hB7Ge – Andreas May 23 '18 at 15:48