2

For example, I have sentences like this:

$text = "word, word w.d. word!..";

I need an array like this:

Array
(
    [0] => word
    [1] => word
    [2] => w.d
    [3] => word
)

I am very new to regular expressions.

Here is what I tried:

function divide_a_sentence_into_words($text){ 
    return preg_split('/(?<=[\s])(?<!f\s)\s+/ix', $text, -1, PREG_SPLIT_NO_EMPTY); 
}

This:

$text = "word word, w.d. word!..";
$split = preg_split("/[^\w]*([\s]+[^\w]*|$)/", $text, -1, PREG_SPLIT_NO_EMPTY);
print_r($split);

works, but I have a second question: I want to add a list of special cases to my regular expression. "w.d" is one such special case; for example, my list contains the words "w.d", "mr.", and "dr.".

If I take this text:

$text = "word, dr. word w.d. word!..";

I need this array:

Array (
  [0] => word
  [1] => dr.
  [2] => word
  [3] => w.d
  [4] => word 
)

Sorry for my bad English.

hippietrail
Guno
  • Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. – user229044 Aug 08 '13 at 18:55
  • What exactly is a "word"? How do you define, in English, what a "word" is? Before you can write a regular expression, you have to be able to describe, in English, the rules that you're trying to implement. – Andy Lester Aug 08 '13 at 18:56
  • **Show us what you've tried so far.** Don't describe it, but edit the question and paste in the actual code. Then tell us what didn't work. What happened when you tried it? Did you get incorrect results? Did you get *no* results? If the results were incorrect, what made them incorrect? What were you expecting instead? Did you get *any* correct results? If so, what were they? Don't make us guess. – Andy Lester Aug 08 '13 at 18:56
  • I tried: function divide_a_sentence_into_words($text){ return preg_split('/(?<=[\s])(?<!f\s)\s+/ix', $text, -1, PREG_SPLIT_NO_EMPTY); } – Guno Aug 08 '13 at 18:59

3 Answers

8

Using preg_split with a regex of /[^\w]*([\s]+[^\w]*|$)/ should work fine:

<?php
    $text = "word word w.d. word!..";
    $split = preg_split("/[^\w]*([\s]+[^\w]*|$)/", $text, -1, PREG_SPLIT_NO_EMPTY);
    print_r($split);
?>


Output:

Array
(
    [0] => word
    [1] => word
    [2] => w.d
    [3] => word
)
h2ooooooo
  • Yes, this works, but I have a second question: I want to add a list of special cases to my regular expression. "w.d" is one such special case; for example, my list contains the words "w.d", "mr.", and "dr.". If I take the text $text = "word, dr. word w.d. word!.."; I need the array: Array ( [0] => word [1] => dr. [2] => word [3] => w.d [4] => word ) – Guno Aug 08 '13 at 19:12
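One possible way to handle the exception list raised in the comment above is to match words instead of splitting on separators, trying the listed abbreviations before a generic word pattern. This is only a sketch: the function name `split_words_keeping` is hypothetical, it assumes PHP 7 or later, and it assumes the abbreviation list is non-empty.

```php
<?php
// Hypothetical helper: extract words from $text, keeping the listed
// abbreviations (dots included) as single tokens.
// Assumes $abbreviations is a non-empty list of strings.
function split_words_keeping(string $text, array $abbreviations): array
{
    // Try longer abbreviations first so a shorter one cannot shadow them.
    usort($abbreviations, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape regex metacharacters (such as ".") in each abbreviation.
    $quoted = array_map(function ($abbr) {
        return preg_quote($abbr, '/');
    }, $abbreviations);

    // Either a listed abbreviation that is not glued to other word
    // characters, or a plain run of word characters.
    $pattern = '/(?<!\w)(?:' . implode('|', $quoted) . ')(?!\w)|\w+/i';

    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

$words = split_words_keeping("word, dr. word w.d. word!..", ["w.d", "mr.", "dr."]);
print_r($words);
```

For the text from the comment, this yields word, dr., word, w.d, word. Because "w.d" is listed without a trailing dot, the final dot of "w.d." is dropped, while "dr." keeps its dot since it is listed that way.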
5

Use the explode function, which splits the string into an array:

$words = explode(" ", $text);
Frank
  • 2
    It looks like he wants to ignore the periods/punctuation at the ends of words. – Thomas Kelley Aug 08 '13 at 18:56
  • I understand it does not have enough content to reproduce, but the question did not have much information either so it's not that complex – Frank Aug 08 '13 at 19:01
  • This gives you the last word as [4] => word!.. and the second word as [1] => word, – Guno Aug 08 '13 at 19:02
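As the comments above point out, a plain explode keeps the punctuation attached to each word. A minimal sketch of one workaround is to trim punctuation from both ends of every piece afterwards; the set of characters passed to trim here is an assumption and would need adjusting for real text.

```php
<?php
$text = "word, word w.d. word!..";

// Split on single spaces, as in the answer above.
$pieces = explode(" ", $text);

// Strip leading and trailing punctuation from each piece.
// The character list is an assumed set, not an exhaustive one.
$trimmed = array_map(function ($piece) {
    return trim($piece, ".,!?;:");
}, $pieces);

// Drop pieces that were only punctuation or empty, then reindex from 0.
$words = array_values(array_filter($trimmed, function ($piece) {
    return $piece !== '';
}));

print_r($words);
```

For the sample sentence this gives word, word, w.d, word. Dots inside a word survive because trim only touches the ends of a string.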
3

Use:

str_word_count ( string $string [, int $format = 0 [, string $charlist ]] )

See http://php.net/manual/en/function.str-word-count.php; it does exactly what you want. So in your case:

$myarray = str_word_count($text, 1);
scraaappy
  • 1
    See the docs; this function can also return each word in an array – scraaappy Aug 08 '13 at 19:01
  • 1
    If '.' is included in the $charlist argument, then it will be treated as part of a word; though a preg_split expression would be better because that could distinguish between `.` between characters and a `.` followed by a space – Mark Baker Aug 08 '13 at 19:54
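The distinction drawn in the last comment, between a `.` sitting between characters and a `.` followed by a space, can be sketched with a regular expression. The example below uses preg_match_all rather than preg_split, which is a swapped-in variation on the comment's suggestion, and the pattern is an illustrative assumption rather than a complete definition of a word.

```php
<?php
$text = "word, word w.d. word!..";

// A word is a run of word characters. A "." is kept only when more
// word characters follow it directly (as in "w.d"), so a "." before
// a space or at the end of the text is left out.
preg_match_all('/\w+(?:\.\w+)*/', $text, $matches);

$words = $matches[0];
print_r($words);
```

For the sample sentence this gives word, word, w.d, word. Note that it also strips the dot from abbreviations such as "dr.", so it does not cover the exception list from the follow-up question.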