The "Regexp Backslash" section of the GNU Emacs Manual says that `\<` matches at the beginning of a word, `\>` matches at the end of a word, and `\b` matches a word boundary. `\b` behaves just as it does in other, non-Emacs regular expressions, but `\<` and `\>` seem to be particular to Emacs regular expressions. Are there cases where `\<` and `\>` are needed instead of `\b`? For instance, `\bword\b` would match the same text as `\<word\>`, and the only difference is that the latter is more readable.

- They’re also in GNU Grep and in Vim. – Josh Lee Apr 30 '11 at 19:36
- `\<` and `\>` are from the original *vi*, and remain there to this day. – tchrist Apr 30 '11 at 22:43
2 Answers
You can get unexpected results if you assume they behave the same.
What can `\<` and `\>` do that `\b` can't?
The answer is that `\<` and `\>` are explicit: each one matches at one particular end of a word, and only at that end. `\b` is general: either end of a word will match.
GNU Operators * Word Operators
line="cat dog sky"
echo "$line" |sed -n "s/\(.*\)\b\(.*\)/# |\1|\2|/p"
echo "$line" |sed -n "s/\(.*\)\>\(.*\)/# |\1|\2|/p"
echo "$line" |sed -n "s/\(.*\)\<\(.*\)/# |\1|\2|/p"
echo
line="cat dog sky"
echo "$line" |sed -n "s/\(.*\)\b\(.*\)/# |\1|\2|/p"
echo "$line" |sed -n "s/\(.*\)\>\(.*\)/# |\1|\2|/p"
echo "$line" |sed -n "s/\(.*\)\<\(.*\)/# |\1|\2|/p"
echo
line="cat dog sky "
echo "$line" |sed -n "s/\(.*\)\b\(.*\)/# |\1|\2|/p"
echo "$line" |sed -n "s/\(.*\)\>\(.*\)/# |\1|\2|/p"
echo "$line" |sed -n "s/\(.*\)\<\(.*\)/# |\1|\2|/p"
echo
output
# |cat dog |sky|
# |cat dog| sky|
# |cat dog |sky|
# |cat dog |sky|
# |cat dog| sky|
# |cat dog |sky|
# |cat dog sky| |
# |cat dog sky| |
# |cat dog |sky |
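
To make the "explicit versus general" distinction concrete without depending on a particular sed version, here is a minimal Python sketch that lists the positions where each kind of anchor matches. Python's `re` module does not provide `\<` or `\>`, so they are emulated here with lookarounds; the constant names and the `positions` helper are illustrative assumptions, not part of any tool mentioned above.

```python
import re

# Assumption: Python's `re` has no \< or \> anchors, so emulate them.
START_OF_WORD = r"\b(?=\w)"   # stands in for \<  (boundary followed by a word char)
END_OF_WORD = r"\b(?<=\w)"    # stands in for \>  (boundary preceded by a word char)
ANY_BOUNDARY = r"\b"          # either end of a word


def positions(pattern, text):
    """Return the offsets at which a zero-width pattern matches in text."""
    return [m.start() for m in re.finditer(pattern, text)]


line = "cat dog sky"
starts = positions(START_OF_WORD, line)
ends = positions(END_OF_WORD, line)
boundaries = positions(ANY_BOUNDARY, line)

print(starts)      # [0, 4, 8]
print(ends)        # [3, 7, 11]
print(boundaries)  # [0, 3, 4, 7, 8, 11]
```

Every `\b` position is either a word start or a word end, which is why a pattern that needs one specific end can behave differently once `\b` is substituted for it.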

It looks to me like `\<.*?\>` would match only a series of word characters, while `\b.*?\b` would match either a series of word characters or a series of non-word characters, since it can also accept the end of a word followed by the beginning of the next one. If you force the expression between the two anchors to be a word, they do indeed act the same.
Of course, you could replicate the behavior of `\<` and `\>` with `\b\w` and `\w\b`. So I guess the answer is that yes, it's mostly for readability. Then again, isn't that what most escape characters in regular expressions are for?
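
As a rough check of the two claims above, here is a small Python sketch. It again assumes lookaround stand-ins for `\<` and `\>`, since Python's `re` lacks them; `BOW` and `EOW` are hypothetical names. It uses `.+?` rather than `.*?`, because in Python `\b.*?\b` can succeed with an empty match at a single boundary.

```python
import re

# Assumption: emulate the Emacs/GNU anchors, which Python's `re` does not provide.
BOW = r"\b(?=\w)"    # stands in for \<
EOW = r"\b(?<=\w)"   # stands in for \>

line = "cat dog sky"

# Non-greedy runs between explicit word anchors: only the words.
word_runs = re.findall(BOW + r".+?" + EOW, line)
# Non-greedy runs between generic boundaries: words and the gaps between them.
any_runs = re.findall(r"\b.+?\b", line)

# A zero-width anchor inserts text without eating anything...
zero_width = re.sub(BOW, "[", line)
# ...whereas \b\w also consumes the first letter of each word.
consuming = re.sub(r"\b\w", "[", line)

print(word_runs)   # ['cat', 'dog', 'sky']
print(any_runs)    # ['cat', ' ', 'dog', ' ', 'sky']
print(zero_width)  # [cat [dog [sky
print(consuming)   # [at [og [ky
```

The last two lines suggest one caveat: `\b\w` finds the same places as a start-of-word anchor but is not zero-width, so it is only a drop-in replacement when consuming that character does not matter.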

- The escape char `\\` is never for readability. It is used to differentiate a *regex operator* from a *literal* character of the same glyph. – Peter.O Apr 30 '11 at 23:35
- @fred - What I meant was that the escaped characters such as `\w` and `\d` (not `\ ` itself) can usually be replaced with other characters or a character class, like `[0-9]`. – dlras2 May 01 '11 at 03:13
- Daniel: `\<.*\>` will match any string bounded by word characters. The `.*` is greedy, so matches as many arbitrary characters as possible. To match only individual words, you could use a non-greedy variant: `\<.*?\>` – phils May 01 '11 at 09:14