1

Why does this 'awk' command produce nothing?

   echo 'hello world' | awk '/hello\s/ {print $0}'

I suppose the pattern /hello\s/ should match any line that has 'hello' followed by a whitespace character, right?

For info, I am using awk on Mac OS. The awk version is 20070501.

mklement0
zell
  • It doesn't produce nothing here (GNU awk). Are you using an awk that doesn't recognize `\s`, such as mawk? – Benjamin W. Feb 19 '16 at 14:29
  • I use `gawk` and it works... as Benjamin says, `\s` is not available in the 'plain' awk. – Matt Hall Feb 19 '16 at 14:33
  • @Benjamin By mawk, do you mean the awk in Mac OS? – zell Feb 19 '16 at 14:33
  • @kwinkunks But the awk manual claims it accepts standard regular expressions. – zell Feb 19 '16 at 14:35
  • `\s` was (I think) invented by *perl*. Your awk uses regular expressions as defined here: https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man7/re_format.7.html#//apple_ref/doc/man/7/re_format – glenn jackman Feb 19 '16 at 14:35
  • Mawk is "Mike's awk". Not sure about BSD awk, but it looks like `\s` is a GNUism. – Benjamin W. Feb 19 '16 at 14:38
  • @glennjackman I learned regex from http://regexone.com/. Do you mean the regex on that page is different from the re_format you refer to? – zell Feb 19 '16 at 15:03
  • @zell, there are many different implementations of regular expression engines. Just about every programming language has its own. They are *mostly* the same, but there are differences. If you're using regex in tool X, you really need to consult X's documentation. – glenn jackman Feb 19 '16 at 16:25
  • @BenjaminW., I see in that re_format man page that there are "enhanced" regular expressions that include the usual shortcut escapes, but the tool would have to be built with that as a compile-time option, and you'd expect the tool's documentation to state that it uses enhanced regexes. – glenn jackman Feb 19 '16 at 16:27
  • @glennjackman: Indeed. The only OSX utility (as of OSX 10.11.3) that I've found to support the _enhanced_ flavor is `grep`. (Also note that _enhanced_ is independent of _basic_ vs. _extended_, and that there are _enhanced extended_ as well as _enhanced basic_ regexes; see [this answer](http://stackoverflow.com/a/23146221/45375) of mine.) – mklement0 Feb 19 '16 at 17:50
  • It's really frustrating how regex is a multiverse. – glenn jackman Feb 19 '16 at 20:28

2 Answers

3

This works on OS X:

   echo 'hello world' | awk '/hello[[:space:]]/ {print $0}'

As mentioned in the gawk docs (paraphrasing):

Think of \s like shorthand for [[:space:]]

You can also use [[:blank:]] to limit to space and tab only.
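A minimal sketch of the difference between the two classes, assuming an awk that supports POSIX bracket expressions (the sample input and the variable names `input`, `space_count`, and `blank_count` are made up for illustration). The third input line has a carriage return after "hello", which belongs to `[[:space:]]` but not to `[[:blank:]]`:

```shell
# Illustrative input: "hello" followed by a space, a tab, and a carriage return.
input=$(printf 'hello world\nhello\tworld\nhello\rworld\n')

# [[:space:]] covers space, tab, and the other whitespace characters.
space_count=$(printf '%s\n' "$input" | awk '/hello[[:space:]]/ { n++ } END { print n + 0 }')

# [[:blank:]] covers only space and tab.
blank_count=$(printf '%s\n' "$input" | awk '/hello[[:blank:]]/ { n++ } END { print n + 0 }')

echo "lines matched by [[:space:]]: $space_count"
echo "lines matched by [[:blank:]]: $blank_count"
```

With a POSIX-conforming awk, the first count should be 3 and the second 2.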

Having trouble finding some 'plain' awk docs. This seems legit, despite the name of the page.

Matt Hall
  • Nicely done; note, however, that you're linking to a _GNU_ Awk documentation page. The BSD `awk`'s man page on OSX refers to `man re_format` for the supported regex features, but, unfortunately, `awk` does _not_ support all the features described there, notably not `[[:<:]]` and `[[:>:]]` for word-boundary assertions. – mklement0 Feb 19 '16 at 14:50
  • @glennjackman has found the online version of `man re_format`: https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man7/re_format.7.html#//apple_ref/doc/man/7/re_format – mklement0 Feb 19 '16 at 14:52
  • This is an old answer, but I thought I'd point out that even `gawk` didn't recognize `\s` as a shorthand for `[[:space:]]` [until version 4](https://www.gnu.org/software/gawk/manual/html_node/Feature-History.html). I just came across this problem on an old CentOS 6.3 machine, which has version 3.1.7 installed. – user8153 Oct 13 '17 at 08:22
0
   echo 'hello world' | awk '/^hello / {print $0}'

This matches every line that starts with "hello " (that is, "hello" followed by a literal space).
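To see what the `^` anchor changes, here is a small sketch with two made-up input lines (the variable names `anchored` and `unanchored` are hypothetical). Note that a literal space in the pattern, unlike `[[:space:]]`, would not match a tab:

```shell
# Only the first input line starts with "hello "; the second contains it mid-line.
anchored=$(printf 'hello world\nsay hello world\n' | awk '/^hello / {print $0}')
unanchored=$(printf 'hello world\nsay hello world\n' | awk '/hello / {print $0}')

printf 'anchored:\n%s\n' "$anchored"
printf 'unanchored:\n%s\n' "$unanchored"
```

The anchored pattern should print only the first line, while the unanchored one prints both.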

mzakaria