I want to write a simple gawk script that extracts the nth column of a file. Both the file name and the value of n should be given on the command line. I make the script executable with chmod +x.

Thus to extract the third column from the file foo I would enter:

 awkextract foo 3 

My attempt at the script awkextract is:

 #!/opt/local/bin/gawk -v k=$2 -f 
 {print $k}

But the nonsense results show that this isn't working. How do I fix it?

PS. I know I can do this with the cut command; I'm just experimenting...

Benjamin W.
Tim
  • But the script you show is not a shell script, and the ["shebang"](https://en.wikipedia.org/wiki/Shebang_(Unix)) line is not evaluated for environment variables. Perhaps you should *make* it a shell script that then invokes `gawk` with the correct arguments? – Some programmer dude Aug 28 '17 at 12:29
  • You also use the `-f` option of `gawk` wrongly, it's supposed to have an argument, which is the file-name of the `gawk` script to use. See e.g. [this reference for invoking `gawk` for more information](https://www.gnu.org/software/gawk/manual/gawk.html#Invoking-Gawk). – Some programmer dude Aug 28 '17 at 12:31
  • @Someprogrammerdude For a standalone awk executable, `-f` is the right choice on the shebang line, see for example [this question](https://unix.stackexchange.com/questions/239415/how-to-convert-awk-one-liner-to-standalone-script) and [here in the manual](https://www.gnu.org/software/gawk/manual/gawk.html#Executable-Scripts). – Benjamin W. Aug 28 '17 at 13:59

2 Answers

Don't call awk via a shebang, just put this in your shell script:

/opt/local/bin/gawk -v k="$2" '
{print $k}
' "$1"
Ed Morton
  • That can of course all be on one line, assuming it is cosmetically pleasing that way (perhaps you'll have more code in your real work), e.g. `/opt/local/bin/gawk -v k="$2" '{print $k}' "$1"` (it can get even smaller, but I suggest retaining spaces for legibility). – Adam Katz Aug 28 '17 at 14:22
  • This solution is very clear and simple, but I realised afterwards that I was calling awk via a shebang partly to avoid the quotes and ticks that troubled me in more complicated examples. I should have clarified this in the original question. – Tim Aug 28 '17 at 15:36
  • There are no problems with quotes and ticks (except the very minor annoyance that you can't use a `'` in a `'`-delimited script so you have to use `\047` instead), just use them correctly. Calling awk via a shebang **causes** problems, it doesn't solve them. If you have any specific problems in mind that you think using a shebang will solve then post a question about that. – Ed Morton Aug 28 '17 at 16:17
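
To illustrate the `\047` workaround mentioned in the comment above, here is a minimal sketch. The function name `quote_column` is made up for this example, and it calls whatever `awk` is on the PATH rather than `/opt/local/bin/gawk`. It prints column N of a file with each value wrapped in single quotes, even though the awk program itself is delimited by single quotes.

```shell
#!/bin/sh
# Hypothetical helper (name invented for illustration):
#   quote_column FILE N
# Prints column N of FILE, each value wrapped in single quotes.
quote_column() {
    # The awk program sits inside '...' for the shell, so a literal '
    # cannot appear in it. "\047" is the octal escape for ' inside an
    # awk string literal, which sidesteps the problem.
    awk -v k="$2" '{ print "\047" $k "\047" }' "$1"
}
```

Called as `quote_column foo 3`, this should print the third column of `foo` with every value in single quotes.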

As pointed out in Ed Morton's answer, the easiest way to do this is to wrap it in a shell script. It's not impossible to do it in an awk executable, though, albeit very unwieldy:

#!/usr/local/bin/awk -f

BEGIN {
    col = ARGV[2]
    ARGV[2] = ""
}

{ print $col }

/usr/local/bin/awk is just the path to awk on my machine.

In the BEGIN block, we're manipulating the argument list directly: we set col to the second command line argument, then set that second argument to the empty string. The ARGV array contains all the command line arguments and is zero-indexed, with ARGV[0] usually containing awk (but this depends on your system), so for the command ./awkextract foo 3, ARGV[1] is foo and ARGV[2] is 3.

Now the only non-null argument left in ARGV is the name of the file to be processed, and the { print $col } action is run for each line of it.
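
To see what the BEGIN block is working with, the following sketch may help. Both function names (`show_argv`, `extract_column`) are invented for this example, and it assumes an `awk` on the PATH rather than the fixed path used above. The first function lists ARGV as awk sees it; the second applies the same "save the argument, then blank it" trick as the script above, inline instead of via a shebang.

```shell
#!/bin/sh
# Hypothetical helper: print every element of ARGV for the given arguments.
# Only a BEGIN block is used, so awk never tries to open the arguments as files.
show_argv() {
    awk 'BEGIN { for (i = 0; i < ARGC; i++) printf "ARGV[%d] = %s\n", i, ARGV[i] }' "$@"
}

# Hypothetical helper: extract_column FILE N
# Same idea as the executable script: remember ARGV[2] as the column number,
# then set it to "" so awk skips it instead of treating it as a file name.
extract_column() {
    awk 'BEGIN { col = ARGV[2]; ARGV[2] = "" } { print $col }' "$@"
}
```

Running `show_argv foo 3` should report `ARGV[1] = foo` and `ARGV[2] = 3`; the value shown for `ARGV[0]` depends on the awk implementation installed.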

Benjamin W.
  • This allows me to keep the awk executable and avoids the problems with quotes and ticks that would arise in more complicated examples. – Tim Aug 28 '17 at 15:36
  • Again - there are no problems with quotes and ticks, certainly none that'll be solved by using a shebang (except being able to use `'` instead of `\047`) and using a shebang just causes you headaches as it robs you of the ability to separate shell args at the shell level into awk args and file names and otherwise use shell and awk as appropriate for what they're best at. – Ed Morton Aug 28 '17 at 15:56
  • @Tim Putting all your awk commands in a script and calling it with `awk -f script.awk` does the same for you with respect to quotes. This answer might solve your specific problem, but the approach isn't good practice to begin with. – Benjamin W. Aug 28 '17 at 16:02
  • @BenjaminW remember to decrement ARGC too as awk will retain a null string in ARGV[2] despite you trying to delete it. See point "5" at http://cfajohnson.com/shell/cus-faq-2.html#Q24. – Ed Morton Aug 28 '17 at 16:29
  • @EdMorton Wow, that's an exhaustive list... I'll change to explicitly setting to the empty string instead of deleting the element and decrementing `ARGC` – Benjamin W. Aug 28 '17 at 17:23
  • Yeah, I wrote it about 15 years ago IIRC and it could probably stand some updating and general tidying up but the info's there and my enthusiasm isn't plus that FAQ isn't actively maintained any longer, so it may be a while :-). – Ed Morton Aug 28 '17 at 17:59
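
The `awk -f script.awk` approach suggested in the comments above could look roughly like the sketch below. The function names and the script file name are assumptions made for this example, as is the use of `awk` from the PATH. The awk code lives in its own file, so it needs no shell quoting, while the shell still decides which argument is the column number and which is the file name.

```shell
#!/bin/sh
# Hypothetical helper: write a small awk script to the path given as $1.
# The quoted heredoc delimiter keeps the shell from expanding $k.
make_column_script() {
    cat > "$1" <<'EOF'
# k is supplied by the caller with -v k=N
{ print $k }
EOF
}

# Hypothetical wrapper: extract_with_file SCRIPT FILE N
# The shell separates its arguments into an awk variable and a file name.
extract_with_file() {
    awk -v k="$3" -f "$1" "$2"
}
```

For example, `make_column_script column.awk` followed by `extract_with_file column.awk foo 3` should print the third column of `foo`.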