Questions tagged [word-boundary]

A word boundary (\b) is a zero-width regular expression assertion that matches at a position with a word character on one side and a non-word character on the other (\w\W or \W\w). A non-word boundary (\B) matches at every position where \b does not.

A word boundary is the regular expression construct \b, which asserts that the current match position lies on a word boundary. It is zero-width: it consumes no characters.

A word boundary is a position between a word character and a non-word character, in either order (\w\W or \W\w). In most engines, the start or end of the string also counts as a boundary when the adjacent character is a word character.

A non-word boundary, \B, is the opposite: a position between two word characters or between two non-word characters (\w\w or \W\W).
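As a rough illustration of the two assertions, here is a minimal sketch using Python's standard `re` module. The sample strings are made up for the example, and other engines may define word characters differently, especially for non-ASCII text.

```python
import re

text = "cat concat cats -12"

# \b matches only where a word character (\w) meets a non-word character (\W)
# or the start/end of the string, so only the standalone "cat" qualifies.
whole_words = re.findall(r"\bcat\b", text)

# \B matches wherever \b does not. Here it requires a word character before
# "cat", so only the "cat" at the end of "concat" qualifies.
inside_words = re.findall(r"\Bcat\b", text)

# Positions of every word boundary in "-12": '-' is a non-word character,
# so the boundaries are just before '1' and just after '2'.
boundaries = [m.start() for m in re.finditer(r"\b", "-12")]

print(whole_words)   # ['cat']
print(inside_words)  # ['cat']
print(boundaries)    # [1, 3]
```

Note that no boundary exists before the minus sign in "-12", because the start of the string is followed by a non-word character.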

175 questions

215

votes

13 answers

What is a word boundary in regex?

I'm trying to use regexes to match space-separated numbers. I can't find a precise definition of \b ("word boundary"). I had assumed that -12 would be an "integer word" (matched by \b\-?\d+\b) but it appears that this does not work. I'd be…

regex word-boundary

asked Aug 24 '09 at 20:46

peter.murray.rust

37,407
44
153
217

143

votes

7 answers

Regex match entire words only

I have a regex expression that I'm using to find all the words in a given block of content, case insensitive, that are contained in a glossary stored in a database. Here's my pattern: /($word)/i The problem is, if I use /(Foo)/i then words like…

regex word-boundary

asked Nov 17 '09 at 19:49

Aaron

1,617
4
13
7

votes

3 answers

PostgreSQL Regex Word Boundaries?

Does PostgreSQL support \b? I'm trying \bAB\b but it doesn't match anything, whereas (\W|^)AB(\W|$) does. These 2 expressions are essentially the same, aren't they?

regex postgresql word-boundary

asked Sep 29 '10 at 20:41

mpen

272,448
266
850
1,236

votes

2 answers

How to use grep()/gsub() to find exact match

string = c("apple", "apples", "applez") grep("apple", string) This would give me the index for all three elements in string. But I want an exact match on the word "apple" (i.e I just want grep() to return index 1).

r regex word-boundary

asked Nov 08 '14 at 04:15

Adrian

9,229
24
74
132

votes

2 answers

What are non-word boundary in regex (\B), compared to word-boundary?

javascript regex word-boundary boundary word-boundaries

asked Dec 27 '10 at 20:25

DarkLightA

14,980
18
49
57

votes

3 answers

Oracle REGEXP_LIKE and word boundaries

I am having a problem with matching word boundaries with REGEXP_LIKE. The following query returns a single row, as expected. select 1 from dual where regexp_like('DOES TEST WORK HERE','TEST'); But I want to match on word boundaries as well. So,…

regex oracle word-boundary

asked Sep 27 '11 at 10:32

Greg Reynolds

9,736
13
49
60

votes

7 answers

How to match the first word after an expression with regex?

For example, in this text: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc eu tellus vel nunc pretium lacinia. Proin sed lorem. Cras sed ipsum. Nunc a libero quis risus sollicitudin imperdiet. I want to match the word after 'ipsum'.

regex lookbehind word-boundary

asked Feb 13 '09 at 14:52

Matthew Taylor

3,911
4
29
33

votes

5 answers

utf-8 word boundary regex in javascript

In JavaScript: "ab abc cab ab ab".replace(/\bab\b/g, "AB"); correctly gives me: "AB abc cab AB AB" When I use utf-8 characters though: "αβ αβγ γαβ αβ αβ".replace(/\bαβ\b/g, "AB"); the word boundary operator doesn't seem to work: "αβ αβγ γαβ αβ…

javascript regex unicode utf-8 word-boundary

asked May 21 '10 at 11:01

cherouvim

31,725
15
104
153

votes

3 answers

Javascript - regex - word boundary (\b) issue

I have a difficulty using \b and greek characters in a regex. At this example [a-zA-ZΆΈ-ώἀ-ῼ]* succeeds to mark all the words I want (both greek and english). Now consider that I want to find words with 2 letters. For the English language I use…

javascript regex word-boundary

asked May 04 '14 at 16:50

tgogos

23,218
20
96
128

votes

4 answers

MySQL REGEXP word boundaries [[:<:]] [[:>:]] and double quotes

I'm trying to match some whole-word-expressions with the MySQL REGEXP function. There is a problem, when there are double quotes involved. The MySQL documentation says: "To use a literal instance of a special character in a regular expression,…

mysql regex word-boundary

asked Sep 19 '13 at 17:52

henk

votes

3 answers

A Viable Solution for Word Splitting Khmer?

I am working on a solution to split long lines of Khmer (the Cambodian language) into individual words (in UTF-8). Khmer does not use spaces between words. There are a few solutions out there, but they are far from adequate (here and here), and…

python nlp word-boundary text-segmentation southeast-asian-languages

asked Feb 01 '11 at 10:48

Nathan

1,483
3
18
41

votes

4 answers

php regex word boundary matching in utf-8

I have the following php code in a utf-8 php file: var_dump(setlocale(LC_CTYPE, 'de_DE.utf8', 'German_Germany.utf-8', 'de_DE',…

php regex utf-8 pcre word-boundary

asked Mar 12 '10 at 13:08

tomsv

7,207
6
55
88

votes

4 answers

How can I find repeated words in a file using grep/egrep?

I need to find repeated words in a file using egrep (or grep -e) in unix (bash) I tried: egrep "(\<[a-zA-Z]+\>) \1" file.txt and egrep "(\b[a-zA-Z]+\b) \1" file.txt but for some reason these consider things to be repeats that aren't! for example,…

regex bash unix grep word-boundary

asked Oct 28 '15 at 16:37

Mouse

votes

1 answer

Regular expression to match boundary between different Unicode scripts

Regular expression engines have a concept of "zero width" matches, some of which are useful for finding edges of words: \b - present in most engines to match any boundary between word and non-word characters \< and \> - present in Vim to match only…

regex unicode character-properties word-boundary word-boundaries

asked May 11 '13 at 01:39

hippietrail

15,848
18
99
158

votes

1 answer

Dollar Sign "\$" in Regular Expressions with word boundaries "\b" (PHP / JavaScript)

I am aware that the issue involving the dollar sign "$" in regex (here: either in PHP and JavaScript) has been discussed numerous times before: Yes, I know that I need to add a backslash "\" in front of it (depending on the string processing even…

javascript php regex dollar-sign word-boundary

asked Sep 30 '15 at 17:00

GerZah
