I expect the code below to echo "yes", but it does not. For some reason it won't match the single quote. Why?

str="{templateUrl: '}"
regexp="templateUrl:[\s]*'"

if [[ $str =~ $regexp ]]; then
  echo "yes"
else
  echo "no"
fi

4 Answers

Replace:

regexp="templateUrl:[\s]*'"

With:

regexp="templateUrl:[[:space:]]*'"

According to man bash, the =~ operator supports "extended regular expressions" as defined in man 3 regex. man 3 regex says it supports the POSIX standard and refers the reader to man 7 regex. The POSIX standard supports [:space:] as the character class for whitespace.

The GNU bash manual documents the supported character classes as follows:

Within ‘[’ and ‘]’, character classes can be specified using the syntax [:class:], where class is one of the following classes defined in the POSIX standard:

alnum alpha ascii blank cntrl digit graph lower print
punct space upper word xdigit

The only mention of \s that I found in the GNU bash documentation was for an unrelated use in prompts, such as PS1, not in regular expressions.

The Meaning of *

[[:space:]] will match exactly one white space character. [[:space:]]* will match zero or more white space characters.
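A minimal sketch of that difference, using a variant of the question's string with three spaces instead of one (the variable names `one`, `many`, `r_one` and `r_many` are illustrative, not from the question):

```bash
#!/bin/bash
# Three spaces between the colon and the quote (illustrative input).
str="{templateUrl:   '}"

# Exactly one whitespace character between ':' and the quote.
one="templateUrl:[[:space:]]'"
# Zero or more whitespace characters between ':' and the quote.
many="templateUrl:[[:space:]]*'"

if [[ $str =~ $one ]]; then r_one="yes"; else r_one="no"; fi
if [[ $str =~ $many ]]; then r_many="yes"; else r_many="no"; fi

echo "one: $r_one, many: $r_many"   # prints "one: no, many: yes"
```

Without the `*`, the pattern only matches when there is exactly one whitespace character before the quote.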

The Difference Between space and blank

POSIX regular expressions offer two classes of whitespace: [[:space:]] and [[:blank:]]:

  • [[:blank:]] means space and tab. This makes it similar to: [ \t].

  • [[:space:]], in addition to space and tab, includes newline, carriage return, form feed, and vertical tab. This makes it similar to: [ \t\n\r\f\v].

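A key advantage of using character classes is that they are locale-aware: in a Unicode locale they can also match whitespace characters outside ASCII, which a hand-written list of characters cannot.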
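To see the two classes behave differently, here is a small sketch where the only whitespace between the colon and the quote is a newline (the input string and variable names are made up for illustration):

```bash
#!/bin/bash
# ANSI-C quoting: the value is "templateUrl:", a newline, then a single quote.
str=$'templateUrl:\n\''

# [[:blank:]] covers only space and tab, so the newline blocks the match.
blank="templateUrl:[[:blank:]]*'"
# [[:space:]] also covers newline, so this one matches.
space="templateUrl:[[:space:]]*'"

if [[ $str =~ $blank ]]; then r_blank="yes"; else r_blank="no"; fi
if [[ $str =~ $space ]]; then r_space="yes"; else r_space="no"; fi

echo "blank: $r_blank, space: $r_space"   # prints "blank: no, space: yes"
```

Use [[:blank:]] when the match should stay on one line, and [[:space:]] when line breaks are acceptable.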

John1024
  • Note that `[:space:]` means all whitespace, including carriage returns and newlines; while `[:blank:]` means "horizontal" whitespace (spaces and tabs) -- http://www.regular-expressions.info/posixbrackets.html – glenn jackman Jan 31 '15 at 22:52
  • For just matching a literal space, you can also escape it with a backslash, i.e.: `regexp="templateUrl:\ *'"` – Christoph Thiede Oct 06 '21 at 22:53
  • @ChristophThiede Yes, that's true. Actually, though, you don't need the backslash. `regexp="templateUrl: *'"` also works. In either case, of course, this limits the regular expression to matching an actual ASCII space. The other whitespace characters that may be recognized by `[[:blank:]]` or `[[:space:]]` are not matched. – John1024 Oct 07 '21 at 23:57

Get rid of the square brackets in the regular expression:

regexp="templateUrl:\s*'"

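With the square brackets present, \s is read literally, as a bracket expression matching either a backslash or the letter s. Your intent is clearly to match the whitespace character class for which \s is shorthand, and that shorthand is written without square brackets. Note that \s is not part of POSIX extended regular expressions; it works only where the system's regex library accepts it as an extension, as glibc does on the Linux system shown here.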

$ uname -a
Linux noname 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ bash --version
GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it. 
There is NO WARRANTY, to the extent permitted by law.
$ cat test.sh
str="{templateUrl: '}"
regexp="templateUrl:\s*'"

if [[ $str =~ $regexp ]]; then
  echo "yes"
else
  echo "no"
fi
$ bash test.sh
yes 
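Since support for \s depends on the regex library that bash is linked against, a script can probe for it at run time and fall back to the POSIX class. A rough sketch, where the helper name `supports_backslash_s` and the variable `ws` are hypothetical and not part of bash:

```bash
#!/bin/bash
# Hypothetical helper: succeeds only if this bash's regex engine
# treats \s as a whitespace class.
supports_backslash_s() {
  local re='^a\sb$'
  [[ "a b" =~ $re ]]
}

# Pick the shorthand where it works, the POSIX class everywhere else.
if supports_backslash_s; then
  ws='\s'
else
  ws='[[:space:]]'
fi

str="{templateUrl: '}"
regexp="templateUrl:${ws}*'"

if [[ $str =~ $regexp ]]; then result="yes"; else result="no"; fi
echo "$result"
```

In practice it is simpler to write [[:space:]] unconditionally, since that form works on both kinds of system.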
rchang
  • Did you test it? Using `regexp="templateUrl:\s*'"` still echoes "no" for me. –  Jan 31 '15 at 20:42
  • I ran your script verbatim - and it echoed `yes` for me. I'm running on a Linux Mint 17 box. I'll update the answer to reflect as such. – rchang Jan 31 '15 at 20:43
  • You are right, I switched to a Mac and got different results from my Linux box. It appears that bash on OS X (at least the flavors that you and I have) defaults to strict POSIX notation - you should go with the answers from @John1024 or heemayl – rchang Jan 31 '15 at 21:03

This should work:

#!/bin/bash
str="{templateUrl: '}"
regexp="templateUrl:[[:space:]]*'"

if [[ $str =~ $regexp ]]; then
  echo "yes"
else
  echo "no"
fi

If you want to match zero or more whitespace characters, the * needs to be added after [[:space:]].

heemayl

This is another way that works, if you want to match only the literal space character rather than the whole space character class.

#!/bin/bash
str="{templateUrl: '}"
if [[ $str =~ templateUrl:" "*"'" ]]; then
  echo "yes"
else
  echo "no"
fi
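This works because, in bash 3.2 and later, quoted parts of the pattern on the right-hand side of =~ are matched as literal strings, while the unquoted * keeps its regex meaning. A small sketch of that quoting rule, with made-up strings and variable names:

```bash
#!/bin/bash
str="a.c"
other="abc"

# Unquoted dot: a regex metacharacter that matches any character.
if [[ $other =~ ^a.c$ ]]; then unquoted="yes"; else unquoted="no"; fi
# Quoted dot: a literal '.', so "abc" no longer matches.
if [[ $other =~ ^a"."c$ ]]; then quoted="yes"; else quoted="no"; fi
# The quoted form still matches a string that contains a real dot.
if [[ $str =~ ^a"."c$ ]]; then literal="yes"; else literal="no"; fi

echo "unquoted: $unquoted, quoted: $quoted, literal: $literal"
```

The same rule is why the space and the single quote can be written inline above as `" "` and `"'"` without a separate regexp variable.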

Credit to Malak Younes.