30

Why does

echo foo bar..baz bork | awk 'BEGIN{RS=".."} {gsub(OFS,"\t");}1'

seem to do the same thing as

echo foo bar..baz bork | awk 'BEGIN{RS=".."} {gsub(OFS,"\t");} {print;}'

?

In fact any number that isn't zero (including decimals and negatives) will do the same thing. However, leaving off the digit, using a text character, or using zero prints nothing. I didn't see this documented anywhere, although I could have missed something.

shadowtalker
  • You can shorten the code a little: `awk -v RS=".." '{gsub(OFS,"\t")}1'`, or set the variable after the program: `awk '{gsub(OFS,"\t")}1' RS=".."` – Jotne Jul 09 '14 at 09:12

2 Answers

42

An awk program is a series of <pattern> <action> rules. Each pattern is evaluated for each input record (each line, by default), at least conceptually, and when the pattern matches, the action is executed. Either the pattern or the action can be omitted. An omitted pattern matches every record; an omitted action defaults to {print $0} (aka {print}). The pattern may be a simple regex match or a more general condition; either way, it must evaluate to true for the action to be executed (as noted by Ed Morton in his comment).
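A minimal sketch of the three rule shapes described above, using made-up sample input (`one`, `two`, `three`); it should behave the same in any POSIX awk:

```shell
# Pattern only: the action is omitted, so awk falls back to {print $0}.
printf 'one\ntwo\nthree\n' | awk 'NR == 2'
# prints: two

# Action only: the pattern is omitted, so the action runs for every record.
printf 'one\ntwo\nthree\n' | awk '{ print NR ": " $0 }'
# prints three lines: "1: one", "2: two", "3: three"

# Pattern and action: the action runs only for records matching the regex.
printf 'one\ntwo\nthree\n' | awk '/^t/ { print toupper($0) }'
# prints two lines: "TWO", "THREE"
```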

In your example, the 1 is a pattern; it evaluates to true. The action is not specified, so the default action, {print} (equivalently {print $0}), is invoked. Any value other than zero or an empty string evaluates to true and triggers the print. (Note that an uninitialized variable, for example c, is created on first mention with a value that is both zero and the empty string, so it evaluates to false. Hence awk 'c' <<<"Hi" prints nothing.)
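The truthiness rule can be checked directly with a few bare patterns. This is only an illustrative sketch; the input `Hi` and the variable name `c` are arbitrary:

```shell
echo Hi | awk '1'          # prints Hi: nonzero number
echo Hi | awk '0.5'        # prints Hi: any nonzero number is true
echo Hi | awk '0'          # prints nothing: zero is false
echo Hi | awk '""'         # prints nothing: the empty string is false
echo Hi | awk '"a"'        # prints Hi: a non-empty string constant is true
echo Hi | awk 'c'          # prints nothing: c is uninitialized
echo Hi | awk -v c=1 'c'   # prints Hi: c now holds a nonzero value
```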

The actions associated with the BEGIN and END patterns are handled specially, of course.
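As a rough illustration of that special handling: BEGIN and END actions each run once, before and after the input respectively, rather than once per record, and unlike ordinary patterns they must be given an action. The sample input below is made up:

```shell
printf 'x\ny\n' | awk '
BEGIN { print "start" }          # runs once, before any input is read
1                                # runs per record; default action prints it
END   { print "records: " NR }   # runs once, after the last record
'
# prints four lines: "start", "x", "y", "records: 2"
```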

Jonathan Leffler
  • Makes sense. Somehow I thought AWK was parsing it as part of the previous action statement, kind of like how `d` is used in sed. – shadowtalker Jul 09 '14 at 00:00
  • You might care to note that all the semicolons in your examples are valid but superfluous. – Jonathan Leffler Jul 09 '14 at 00:02
  • I know most text books refer to it as a `pattern`, but IMHO it makes much more sense to call it a `condition` as that's what it really is. I believe having a pattern there was the original intent of 1970s awk but that was a long, long time ago... – Ed Morton Jul 09 '14 at 00:03
  • Any nonzero number will do, such as `5` or `2345`; even a negative number such as `-54` works, but most people use `1` – Jotne Jul 09 '14 at 09:15
  • An omitted action defaults to `{print $0}` (aka `{print}`), but omit `1` not work why? seems `1` is not equal to `{print;}`. Though, the output of `1` and `{print;}` are same. That's really confusing, could you explain this? – Mithril Oct 28 '16 at 03:08
  • @Mithril: I'm sorry, but I'm unable to make head or tail of your question. What do you mean by 'omit `1` not work why'? What I was trying to say is that `awk '1' data` and `awk '{print}' data` will both print the whole content of the file `data`. What scenarios are you trying? – Jonathan Leffler Oct 28 '16 at 03:47
  • @Mithril This type of shortcut is understandably very confusing. `1` is not technically equal to `{print}`. `awk`'s format is `condition {action}`. `1` is the `condition`, which is evaluated to determine whether to perform the following `action`. Nonzero evaluates to `true`. The `action` is missing, so the default action `{print}` is performed. See my answer for more examples: https://stackoverflow.com/a/46248960 – wisbucky Sep 16 '17 at 00:33
  • I guess @wisbucky made a good point in his answer... an important part of the doubt lies in the meaning of the `1` after the closing braces... and as he pointed out, `1` is indeed the *condition* (1 means true) of the next *action* (empty action means print). As @jonathan-leffler's answer was chosen, maybe he could edit it and add this info. – LEo Jun 15 '20 at 17:35
  • @LEo: Since my answer currently says (in part): _"In your example, the `1` is a pattern; it evaluates to true. The action is not specified, so the default action is invoked, which is `{print}` or `{print $0}`"_. What information is missing from that? I'm using the term 'pattern' instead of 'condition' — that probably dates when I learned AWK (hint: it was before GNU existed), but is otherwise indistinguishable from 'condition'. – Jonathan Leffler Jun 15 '20 at 18:18
12

I really dislike these types of shortcuts because they obscure how the program is parsed and can mislead the reader. As you said,

awk 'BEGIN{RS=".."} {gsub(OFS,"\t");}1'

seems to be equivalent to

awk 'BEGIN{RS=".."} {gsub(OFS,"\t");} {print;}'

which would seem to imply that 1 is simply an alias for {print}. But that's not the case at all. The 1 is not attached to the preceding closing brace. It is actually a second rule, a condition with no action, so awk applies the default action {print}. You can think of it like this instead:

awk 'BEGIN{RS=".."} {gsub(OFS,"\t")}; 1!=0 {print}'

Here's an example that I think demonstrates better the condition {action} format that awk uses:

echo 'a b c' | awk '1 {print $1}; 2 {print $2}; 0 {print $3}'

a and b are printed because 1 and 2 are nonzero and evaluate to true. c is not printed because 0 evaluates to false.
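Another way to see that 1 is a rule of its own is to move it. Rules run in the order they are written, so in this small sketch (the input `foo bar` and the `-` replacement are arbitrary choices) placing 1 first prints the record before gsub has modified it:

```shell
# The bare 1 comes first, so the record is printed before gsub changes it.
echo 'foo bar' | awk '1; { gsub(/ /, "-") }'
# prints: foo bar

# The bare 1 comes last, so the already-modified record is printed.
echo 'foo bar' | awk '{ gsub(/ /, "-") } 1'
# prints: foo-bar
```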

wisbucky