I am reading someone's awk script. It starts with the header #!/usr/bin/env awk -f. The env command does not have a -f option, so they must be passing the -f option to the awk command. I looked at the man page for awk. It says: "Awk scans each input file for lines that match any of a set of patterns specified literally in prog or in one or more files specified as -f progfile. With each pattern there can be an associated action that will be performed when a line of a file matches the pattern."

As per my understanding, this means that awk processes the input file(s) by searching for lines with patterns specified in progfile/prog depending on whether or not you use -f option with awk. And based on the patterns used, an associated action is performed on the lines found in the input file(s). My question here is... how does this work while running an awk script file? We're not specifying the progfile in the #!/usr/bin/env awk -f line. What patterns will the awk script use? Or does this mean that we have to pass the progfile when we run the awk script? If that is the case, isn't specifying the -f option in the script redundant? If we don't specify the progfile, will the -f option be ignored by default or throw an error?

To understand this better, I wrote a simple awk script and saved it as test.awk

#!/usr/bin/env awk -f

BEGIN { print "START" }

When I run this, the string "START" gets printed on the screen.

prachis-mbp-2:~ pskhadke$ ./test.awk
START

If I remove the -f option from the first line of the awk script and run it, I get the following error:

prachis-mbp-2:~ pskhadke$ ./test.awk
awk: syntax error at source line 1
 context is
     >>> . <<< /test.awk

Similarly,

prachis-mbp-2:~ pskhadke$ awk test.awk
awk: syntax error at source line 1
 context is
     >>> test. <<< awk
awk: bailing out at source line 1

So for some reason, it's failing to parse the arguments correctly without the -f option. But why?

Prachi

4 Answers

5

The name of the file is appended to the end of the command in the shebang line. Hence the resulting command line effectively executed for a file test.awk with the header #!/usr/bin/env awk -f would be awk -f test.awk, treating test.awk as the script file to execute rather than a data input file.

The best illustration: create a file named test whose sole contents are #!/bin/rm, make it executable (e.g. chmod 755 test) and try to execute it by running ./test. Now, where did that file go :)
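A less destructive way to watch the same mechanism is to use echo as the "interpreter". This is only a sketch: it assumes /bin/echo exists on your system, and the file name show_args is made up for the demonstration.

```shell
# Work in a throwaway directory so nothing else is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# A script whose "interpreter" is echo (assumed to live at /bin/echo),
# with one argument on the shebang line.
printf '#!/bin/echo hello\n' > show_args
chmod +x show_args

# The kernel runs: /bin/echo hello ./show_args
shebang_out=$(./show_args)
printf '%s\n' "$shebang_out"
```

The script's own path shows up as the last argument to echo, which is exactly what happens to test.awk after awk -f.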

  • Even though other answers were correct, accepted yours cause you explained the main reason why we need -f in short. I didn't know that the script name gets appended at the end of `awk -f `. Upvoted all the correct answers. – Prachi Dec 01 '15 at 00:08
  • I haven't read the rest of the answers but hopefully someone pointed out that you should never invoke awk via a shebang. – Ed Morton Dec 01 '15 at 05:09
  • No, no one pointed out that awk should never be invoked via a shebang. Can you please explain why? – Prachi Dec 02 '15 at 00:48
  • How does this answer relate to [this one](https://unix.stackexchange.com/questions/438247/usr-bin-env-awk-f-no-such-file-or-directory)? Indeed, for me `#!/usr/bin/env awk -f` results in error. – Enlico Oct 09 '19 at 21:41
  • @EdMorton Sorry to bother you with a half a decade old comment but why _you should never invoke awk via a shebang_? – James Brown Jan 28 '20 at 15:20
  • @JamesBrown Because it has no useful advantages over simply calling awk from the shell script while it has the significant disadvantage that you can't separate the shell script args into shell args, awk args, and awk variable settings. See for example https://unix.stackexchange.com/a/563456/133219. – Ed Morton Jan 28 '20 at 15:30
  • @Prachi sorry I didn't see [your comment](https://stackoverflow.com/questions/34009196/understanding-the-awk-f-option-in-shebang-line/34009641?noredirect=1#comment55821332_34009641) above 4 years ago but [I responded now](https://stackoverflow.com/questions/34009196/understanding-the-awk-f-option-in-shebang-line/34009641?noredirect=1#comment106023530_34009641) FWIW :-). – Ed Morton Jan 28 '20 at 15:51
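The alternative described in the comments above, calling awk from a shell script instead of using an awk shebang, might look roughly like this. It is an illustrative sketch, not code from the linked answer: the wrapper name greet, the variable name, and the data file are all hypothetical.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical wrapper script: the shell consumes the first argument,
# hands it to awk as a variable, and passes the rest through as input files.
cat > greet <<'EOF'
#!/bin/sh
name=$1
shift
awk -v name="$name" '{ print name ": " $0 }' "$@"
EOF
chmod +x greet

printf 'one\ntwo\n' > data.txt
wrapper_out=$(./greet demo data.txt)
printf '%s\n' "$wrapper_out"
```

Here the shell decides which arguments are its own, which become awk variables (via -v), and which are awk's input files, something a bare awk shebang script cannot do.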
3

"So, they must be passing the -f option for the awk command."

Yes, that is correct. The shebang line is interpreted by the kernel at invocation time. If it reads #!/usr/bin/env awk -f, then it means that when this file is invoked as an executable (i.e., when it is passed as the program argument to one of the seven exec functions), the correct way to "execute" it is to run /usr/bin/env awk -f <filename>, which in turn runs awk -f <filename>. In other words: the exec function will invoke the interpreter with the right arguments rather than attempting to execute the file itself (since it's not a binary).

The -f option is necessary because awk(1) reads the program from the arguments by default; if you want it to read it from a file, you need -f.

"As per my understanding, this means that awk processes the input file(s) by searching for lines with patterns specified in progfile/prog depending on whether or not you use -f option with awk."

awk(1) always processes input files to look for a match. The -f option only controls where the awk program is read from. With -f, the argument that follows it names a file containing the awk program, and any remaining operands are input files. Without -f, the first operand is the program text itself, and the remaining operands are the input files. If no input files are specified, awk simply matches against the lines on stdin.
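The three cases can be tried side by side. This is a small sketch; the file names fruits.txt and prog.awk and the /an/ pattern are made up for the example.

```shell
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

printf 'apple\nbanana\ncherry\n' > fruits.txt
printf '/an/ { print "match: " $0 }\n' > prog.awk

# Program text given as the first operand; fruits.txt is the input file.
inline_out=$(awk '/an/ { print "match: " $0 }' fruits.txt)

# Program read from prog.awk via -f; fruits.txt is still the input file.
file_out=$(awk -f prog.awk fruits.txt)

# No input file at all: awk reads standard input.
stdin_out=$(printf 'mango\nkiwi\n' | awk -f prog.awk)

printf '%s\n' "$inline_out" "$file_out" "$stdin_out"
```

The first two commands produce the same output, since only the source of the program differs; the third applies the same program to whatever arrives on stdin.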

"We're not specifying the progfile in the #!/usr/bin/env awk -f line"

The kernel does that for you. Again, the shebang line is saying: when you want to execute this file (call it X), then please do it with awk -f. So it is equivalent to awk -f X.

"If I remove the -f option from the first line of the awk script and run it, I get the following error:"

Because then that would be the same as:

$ awk ./test.awk

Which is nonsense, because without -f, it will try to interpret ./test.awk as the awk program. So you get an error.

Filipe Gonçalves
  • Note to who edited my answer, and to the approvers who agreed to the edit: it's the kernel that reads and interprets the shebang line. Also, please don't unformat my post by removing backticks and replacing them with `'`. – Filipe Gonçalves Apr 23 '16 at 09:09
  • Yeah, sorry - I was sleepy and I biffed it, then I completely forgot to come back and retract it. – mrh May 13 '16 at 20:14
2
#!/usr/bin/env awk -f

The string following the #! is invoked as a command after appending the name of the script.

If you read the documentation for the env command, you'll see that (in the absence of any NAME=VALUE or other options) it invokes its first argument as a command, passing any following arguments to that command. So env will invoke awk -f name-of-script.
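As a quick illustration of that behavior, here is env given one NAME=VALUE assignment followed by a command. The variable name GREETING is arbitrary and chosen only for this sketch.

```shell
# env sets GREETING in the child's environment, then runs awk
# with the remaining arguments.
env_out=$(env GREETING=hi awk 'BEGIN { print ENVIRON["GREETING"] }')
printf '%s\n' "$env_out"
```

Without the assignment, env would simply look up awk in PATH and run it with the arguments that follow, which is all the shebang line relies on.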

The reason you need the -f is simply because that's how awk handles its command-line arguments. If you pass a string on awk's command line without specifying an option name, it will evaluate that string as awk code:

$ awk 'BEGIN {print "hello, world"}'
hello, world

To tell awk to execute the contents of a file, you have to use the -f option:

$ echo 'BEGIN { print "hello, world"}' > hello.awk
$ awk -f hello.awk
hello, world

This is actually a bit unusual compared to most other interpreters. The perl command, for example, treats a command-line argument as a script name by default; to pass Perl code on the command line, you have to use the -e option:

$ perl -e 'print "hello, world\n"'
hello, world

Most shells are the same: sh script runs the file named script, while sh -c 'code' runs code given on the command line.

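Note that #!/usr/bin/env awk -f is not portable. Some systems, Linux among them, pass everything after the interpreter path as a single argument, so env ends up looking for a command literally named "awk -f"; some older systems also limit the number of arguments you can have on a #! line. On such systems #!/usr/bin/env awk -f might not work.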

If you know the exact location of the awk interpreter command, you can use it directly rather than using /usr/bin/env:

#!/usr/bin/awk -f
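If you are not sure where awk lives, one option is to look it up once and write that path into the script. The sketch below assumes command -v awk prints an absolute path (it normally does in a non-interactive shell); the file name direct.awk is made up.

```shell
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

# Look up awk's absolute path once, instead of deferring to env at run time.
awk_path=$(command -v awk)

# Generate a script whose shebang names that path directly, followed by -f.
printf '#!%s -f\nBEGIN { print "START" }\n' "$awk_path" > direct.awk
chmod +x direct.awk

direct_out=$(./direct.awk)
printf '%s\n' "$direct_out"
```

Because the shebang line now carries only one argument (-f), it also works on systems that pass the rest of the line as a single string.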

See this question and my answer for a discussion of the #!/usr/bin/env hack.

Keith Thompson
1

The shebang line is interpreted by the kernel, which invokes the interpreter specified after the #! with the executable file name (your script) as an argument. See man 2 execve, section "Interpreter scripts".

Nemanja