The Beta release of gawk 4.2.0, available in http://www.skeeve.com/gawk/gawk-4.1.65.tar.gz is a major release, with many significant new features.

I previously asked "What is the behaviour of FS = " " in GNU Awk 4.2?", and now I have noticed the brand new typeof() function, which deprecates isarray():

Changes from 4.1.4 to 4.2.0

  1. The new typeof() function can be used to indicate if a variable or array element is an array, regexp, string or number. The isarray() function is deprecated in favor of typeof().

I could cover four cases: string, number, array and unassigned:

$ awk 'BEGIN {print typeof("a")}'
string
$ awk 'BEGIN {print typeof(1)}'
number
$ awk 'BEGIN {print typeof(a[1])}'
unassigned
$ awk 'BEGIN {a[1]=1; print typeof(a)}'
array

However, I am struggling to get "regexp": none of my attempts produce it, and most of them yield "number":

$ awk 'BEGIN {print typeof(/a/)}'
number
$ awk 'BEGIN {print typeof(/a*/)}'
number
$ awk 'BEGIN {print typeof(/a*d/)}'
number
$ awk 'BEGIN {print typeof(!/a*d/)}'
number
$ awk -v var="/a/" 'BEGIN{print typeof(var)}'
string
$ awk -v var=/a/ 'BEGIN{print typeof(var)}'
string

How can I get a variable to be defined as "regexp"?

I noticed the previous bullet:

  1. Gawk now supports strongly typed regexp constants. Such constants look like @/.../. You can assign them to variables, pass them to functions, use them in ~, !~ and the case part of a switch statement. More details are provided in the manual.

And tried a bit, but with no luck:

$ awk -v pat=@/a/ '{print typeof(pat)}' <<< "bla ble"
string
fedorqui

1 Answer

typeof(/a/) is running typeof() on the result of $0 ~ /a/, which is a number. I haven't tried this yet myself but I'd expect this to be what you're looking for:

typeof(@/a/)

and

var = @/a/
typeof(var)

So this works:

$ awk 'BEGIN {print typeof(@/a/)}'
regexp

$ awk 'BEGIN {var=@/a/; print typeof(var)}'
regexp
fedorqui
Ed Morton
  • `awk 'BEGIN{print typeof(@/a/)}'` works, and so does the other one! I also tried `awk -v var=@/a/ 'BEGIN{print typeof(var)}'` and it does not; I wonder how this regexp can be passed properly. – fedorqui Oct 10 '17 at 13:26
  • Did you try quoting it `-v var='@/a/'`? By not quoting it you're exposing it to the shell to do wildcard expansion, etc. When you do `-v var=@/a/` what do you get when you do `print "<" var ">"` in the BEGIN section? It's also possible that you simply can't assign a regexp constant on the command line and to do what you want should be written as `-v foo="a" 'BEGIN{var=@($0 ~ foo)}'` or similar. – Ed Morton Oct 10 '17 at 13:27
  • Both `awk -v var='@/a/' 'BEGIN{print "<" var ">", typeof(var)}'` and `awk -v var=@/a/ 'BEGIN{print "<" var ">", typeof(var)}'` return the same: "<@/a/> string". – fedorqui Oct 10 '17 at 13:32
  • 1
    I am accepting your good answer since it is answering my initial question. What we are commenting here is tangential – fedorqui Oct 10 '17 at 13:36
  • Regarding your suggestion to use `awk -v foo="a" 'BEGIN{var=@($0 ~ foo)}'`, no: `var=@($0 ~ foo)` is not accepted by awk and returns a syntax error pointing to the opening parenthesis. – fedorqui Oct 10 '17 at 13:37
  • 1
    If you want to get the answer straight from the horse's mouth then you could just ask [in the comp.lang.awk NG](https://groups.google.com/d/msg/comp.lang.awk/UnoZTItfiko/zRzXa7YYBAAJ) and one of the gawk guys (Arnold or Andy or Manuel) will almost certainly respond. – Ed Morton Oct 10 '17 at 13:43