2

I have a specific need to remove a word from a string, but I am having a problem when that word is followed by a dot (.) character.

Here is the string and what I have tried so far:

$result = 'Hello Ka Kashish.';
$result = preg_replace('/\bKa\b/i', '', $result);

I get the expected result: 'Hello Kashish.'

But if the string is like the one below, it does not work:

$result = 'Hello Ka. Kashish.';
$result = preg_replace('/\bKa.\b/i', '', $result);

It gives me the result 'Hello Ka. Kashish.' Why is this . (dot) not working? Please give me a solution.

And if I can achieve this word removal in any other way, please let me know. I want to remove only the whole word, not a set of characters: the word 'Ka' should be removed, but 'Ka' should not be removed from 'Kashish'. Please help me.

Thanks in advance.

6 Answers

1

This is because the dot can match any character.

The other problem is that \b matches a word boundary, i.e. a position where a word character is followed by a non-word character, or a non-word character is followed by a word character. Since neither the dot nor the space after it is a word character, there is no boundary between them, so \b won't match there.

Maybe you should try this instead:
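As a rough illustration of where \b does and does not match, here is a small sketch. The third input string ('Hello Ka.Kashish.', with no space after the dot) is an assumption added only for contrast; it is not from the question.

```php
<?php
// A boundary exists between "a" and the following space, so \bKa\b matches.
$first = preg_replace('/\bKa\b/i', '', 'Hello Ka Kashish.');

// "." and " " are both non-word characters, so there is no boundary after
// the dot and the pattern matches nothing: the string comes back unchanged.
$second = preg_replace('/\bKa\.\b/i', '', 'Hello Ka. Kashish.');

// Here the dot is directly followed by the word character "K", so a
// boundary does exist after the dot and "Ka." is removed.
$third = preg_replace('/\bKa\.\b/i', '', 'Hello Ka.Kashish.');

var_dump($first, $second, $third);
```

Note that the first call leaves two consecutive spaces behind, because only the word itself is removed.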

preg_replace('/\bKa(\W|$)/i', '', $result)
fge
  • Yeah, true: `\b` will not match. In essence, you want to remove any punctuation character right after the word? – fge Jan 13 '12 at 13:16
  • So should I try it like this? preg_replace('/\bKa\b\./i', '', $result); –  Jan 13 '12 at 13:18
  • Yes, the updated version is working. Which characters will be matched by (\W|$)? –  Jan 13 '12 at 13:24
  • 1
    `\W` will match any character which is _not_ a word character (it is the exact opposite of `\w`), and `$` will match the end of the input. I had to group it using `(...)`, and `|` is an alternation (`re1|re2` means "match `re1` or `re2`"). – fge Jan 13 '12 at 13:31
1

The reason is that \b represents a word boundary. I.e. a boundary between a word character and a non-word character. See http://www.regular-expressions.info/wordboundaries.html

The boundary between a full stop "." and a space " " is not a word boundary, so the pattern match fails. Neither "." nor a back-slashed "." will work. You need to remove the second "\b".

Separately, "." means "any character", so the purpose of using back-slash "." is to ensure it matches only a full-stop, as others have pointed out. This is important to note when re-designing your pattern to work without the second "\b".
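To make the difference between "." and a back-slashed "." concrete, here is a minimal sketch. The sample sentence is made up for illustration and is not from the question.

```php
<?php
$text = 'Kay said Ka. is short';

// Unescaped dot: "." matches any character, so "Kay" is matched as well
// as "Ka.".
$looseCount = preg_match_all('/\bKa./', $text, $loose);

// Escaped dot: only "Ka" followed by a literal full stop is matched.
$strictCount = preg_match_all('/\bKa\./', $text, $strict);

var_dump($loose[0], $strict[0]);
```

Neither pattern uses a trailing \b, which is why both can match "Ka." before a space.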

Jules
  • Does removing the second \b work? It will not remove 'ka' from inside a word, like the 'ka' in 'kashish', right? –  Jan 13 '12 at 13:20
  • It might do, depending on how you re-design your pattern. As others have said you may need to have a pattern which matches alternates. "Ka." or "Ka\b" or "Ka " or "Ka$" i.e. "Ka(\.|\b| |$)" – Jules Jan 13 '12 at 13:24
  • Others to try might be:- "Ka\.?(\b| |$)" or "Ka(\b|\.(\B|$)|$)" – Jules Jan 13 '12 at 13:32
  • I should just correct my last one, as it will leave in the "." because of the order of the alternates. Try "/(^|\b)Ka(\.(\b|\B|$)|\b|$)/i" – Jules Jan 13 '12 at 13:50
  • Sorry, a simplified version "/(^|\b)Ka(\.|\b|$)/i" where it is implicit that the "." alternate will be picked up first if available. – Jules Jan 13 '12 at 14:04
0

You need to escape the dot, i.e. \. instead of .

preg_replace('/\bKa\.\b/i', '', $result); 
Ed Heal
0

Perhaps this will work the way you want it to?

preg_replace('/\bKa[\.]?(\s|$)/i', '', $result);
davogotland
0

Here is a lookaround-based regex (a lookbehind plus a lookahead) that will work for your case:

$result = 'Ka. Hello Ka. Kashish. Ka.';
$result = preg_replace('/(?<=\b)Ka\.(?=(\W|$))/i', '', $result);

OUTPUT:

string(17) " Hello  Kashish. "
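The same idea could be wrapped in a reusable function. The sketch below is only one possible approach: `remove_word` is a hypothetical helper name, not something from this thread, and it assumes that an optional trailing full stop should be removed together with the word. It uses `(?<!\w)` and `(?!\w)` instead of `\b`, so the checks do not depend on what kind of non-word character surrounds the match.

```php
<?php
// Hypothetical helper (not from the thread): removes a whole word, plus an
// optional trailing full stop, without touching longer words that merely
// start with it.
function remove_word(string $text, string $word): string
{
    // preg_quote escapes regex metacharacters in $word and the "/" delimiter.
    $quoted = preg_quote($word, '/');

    // (?<!\w) and (?!\w) assert that no word character sits directly before
    // or after the match; they also hold at the start and end of the string.
    $pattern = '/(?<!\w)' . $quoted . '\.?(?!\w)/i';

    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '', $text) ?? $text;
}

var_dump(remove_word('Ka. Hello Ka. Kashish. Ka.', 'Ka'));
```

As with the pattern above, the surrounding spaces are left in place, so you may want to collapse repeated whitespace afterwards.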
anubhava
-1

rtrim removes the selected characters from the right side of a string.

Here is an example of how to remove a dot from the end of a sentence:

$result1=rtrim($result, '.');
echo $result1;
michaelbn