I have a file of "words" that is about 5.8 MB in size and contains 560,000 words. I'm using it to extract real words from strings in which several words are joined together.
E.g. greenbananatruck could be such a string.
I wrote this function to be called at a very fast pace, but I can't get it to run faster than 0.5 sec. I'm using a server with an 8-core processor and 8 GB of RAM. CPU is actually not the problem; the problem is RAM. I need to be able to run this process quickly and efficiently in multiple instances.
    public function wordSplitReal( $str ){
        // Keep only the dictionary words that occur in $str
        $words = array_filter( $this->dict, function($word) use(&$str) {
            $pos = strpos( $str, $word );
            if ( $pos !== false ){
                // Cut the matched word out so shorter words can't match it again
                $str = substr_replace($str, "", $pos, strlen($word));
                return true;
            }
            return false;
        } );
        return $words;
    }
It's very simple: what I'm actually doing is "filtering" the array dict down to only the words that occur in the given string. (I'm not interested in multiple words.) dict is presorted from the longest word to the shortest, and everything is in lowercase. This function is part of a bigger class that uses the singleton pattern.
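To make the behaviour concrete, here is a minimal standalone sketch of the same logic run against a toy dictionary. The five words and the name wordSplitToy are made up for this example; the real dict has 560,000 entries and lives in the class.

```php
<?php
// Hypothetical toy dictionary, presorted from longest to shortest like the real one.
$dict = ['banana', 'green', 'truck', 'ban', 'an'];

// Standalone version of wordSplitReal() that takes the dictionary as an argument.
function wordSplitToy(array $dict, string $str): array
{
    return array_filter($dict, function ($word) use (&$str) {
        $pos = strpos($str, $word);
        if ($pos === false) {
            return false;
        }
        // Remove the matched word so shorter words can't match the same characters.
        $str = substr_replace($str, '', $pos, strlen($word));
        return true;
    });
}

// array_filter() preserves the original keys, so reindex before printing.
print_r(array_values(wordSplitToy($dict, 'greenbananatruck')));
```

With this toy input the function keeps banana, green and truck; ban and an are rejected because banana has already been cut out of the string by the time they are checked.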
Any help would be appreciated.