This is really a dark corner of awk....
I had the same question about five years ago. I reported it as a bug, discussed it with a gawk developer, and finally got a clear answer: it is a "feature".
Here is the ticket: https://lists.gnu.org/archive/html/bug-gawk/2013-03/msg00009.html
split(str, array, magic)
For magic:
When you use a non-empty string (quoted with ""), such as "...", awk checks the length of the string. If it is a single character, it is used as a literal string (they call it a separator); the one exception is a single space, which splits on runs of whitespace. If it is longer than one character, it is treated as a dynamic regex.
When you use a static regex, that is, the /.../ form, it is always treated as a regex, no matter how long the expression is.
That is:
"." - literal "." (period)
"[" - literal "["
"{" - literal "{"
".*" - regex
/./ - regex
/whatever/ - regex
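The rules above can be checked with a quick sketch. The function name split_demo and the sample strings are made up for illustration; the three calls print the number of pieces split() returns for each separator form.

```sh
# Hypothetical helper: run the three separator forms side by side.
split_demo() {
  awk 'BEGIN {
    # Single-character string: "." is taken as a literal period.
    print split("a.b.c", parts, ".")
    # Regex constant: /./ matches any character, so every character is a separator.
    print split("a.b.c", parts, /./)
    # Multi-character string: "[0-9]" is compiled as a dynamic regex.
    print split("a1b2c", parts, "[0-9]")
  }'
}

split_demo
```

This should print 3, 6 and 3: three pieces around the two literal periods, six empty pieces around five one-character separators, and three pieces around the two digits.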
If you want awk to treat . (period) as a regex metacharacter, you should use split(foo,bar,/./).
But if you split on any character, every character becomes a separator, so you end up with an array of empty strings, if that is what you really want.
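To see what "an array of empty strings" looks like, here is a minimal sketch (the function name any_char_split and the input "abc" are only illustrative). It prints each piece wrapped in brackets.

```sh
# Hypothetical helper: show the pieces produced when /./ is the separator.
any_char_split() {
  awk 'BEGIN {
    n = split("abc", parts, /./)
    # Each character was consumed as a separator, so every piece is empty.
    for (i = 1; i <= n; i++) printf "[%s]", parts[i]
    print ""
  }'
}

any_char_split
```

The output is [][][][]: three separators yield four pieces, and none of them holds any text.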