1

I am very, very much a beginner with NAWK (or AWK) but I know that you can check for a substring value using:

nawk '{if (substr($0,42,4)=="ABCD") {print $0}}' "${file}"

(This is being run through UNIX, hence the '$0'.)

What if the string could be either ABCD or MNOP? Is there an easy way to code this as a one-liner? I've tried looking but so far only found myself lost...

HugMyster
  • Always quote your shell variables unless you have a very specific reason not to and fully understand the consequences wrt globbing and word splitting. Use `"${file}"`, not `${file}`. – Ed Morton Oct 31 '13 at 13:34
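
As a rough illustration of the quoting advice in the comment above, the sketch below uses a hypothetical file name containing a space (`my data.txt`, created in a temporary directory; neither is from the original post). It uses plain `awk` and `mktemp -d` on the assumption that both are available, and assumes a shell that word-splits unquoted variables, such as sh or bash.

```shell
# Hypothetical file whose name contains a space.
dir=$(mktemp -d)
file="$dir/my data.txt"
printf 'hello this is me\n' > "$file"

# Quoted: awk receives one file name and finds the line.
quoted=$(awk 'substr($0,7,4)=="this"' "${file}")

# Unquoted: the shell splits the name into two words, so awk is handed
# two file names that do not exist and prints nothing.
unquoted=$(awk 'substr($0,7,4)=="this"' ${file} 2>/dev/null) || true

printf 'quoted:   [%s]\n' "$quoted"
printf 'unquoted: [%s]\n' "$unquoted"
rm -rf "$dir"
```

The unquoted call fails because the shell, not awk, splits the file name before awk ever sees it.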

4 Answers

3

You can combine the two comparisons with ||, for example:

nawk 'substr($0,42,4)=="ABCD" || substr($0,42,4)=="MNOP"' "${file}"

Note that your current command has some unnecessary parts that awk handles implicitly:

nawk '{if (substr($0,42,4)=="ABCD") {print $0}}' "${file}"

{print $0} is the default awk action, so it can be omitted, and the if (...) {} wrapper can be replaced by using the condition directly as the pattern. Altogether, the command reduces to

nawk 'substr($0,42,4)=="ABCD"' "${file}"

For more detail, see Idiomatic awk.

Test

$ cat a
hello this is me
hello that is me
hello those is me

$ awk 'substr($0,7,4)=="this"' a
hello this is me

$ awk 'substr($0,7,4)=="this" || substr($0,7,4)=="that"' a
hello this is me
hello that is me
fedorqui
2

If you have a large list of possible valid values, you can declare an array, then check to see if that substring is in the array.

nawk '
    BEGIN { valid["ABCD"] = 1 
            valid["MNOP"] = 1
            # ....
    }
    substr($0,42,4) in valid
' file

One thing to remember: the in operator looks at an associative array's keys, not the values.
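
A small sketch of that point, using made-up sample lines where the code sits at columns 7-10 rather than 42-45, and plain `awk` instead of `nawk` (both are assumptions for the sake of a short example). The second input line is short enough that its substring is just "1", which equals the stored value but is not a key:

```shell
# Hypothetical sample input; the code of interest starts at column 7.
result=$(printf 'hello ABCD x\nhello 1\nhello MNOP x\n' | awk '
    BEGIN { valid["ABCD"] = 1
            valid["MNOP"] = 1
    }
    # "in" tests the array keys (ABCD, MNOP), never the stored value 1
    substr($0,7,4) in valid
')
printf '%s\n' "$result"
```

Only the ABCD and MNOP lines are printed. The `hello 1` line is skipped even though 1 is the value stored under both keys.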

glenn jackman
1

Assuming your values do not contain regex metacharacters, you could say:

nawk 'substr($0,42,4)~/ABCD|MNOP/' "${file}"

If the values contain metacharacters ([, \, ^, $, ., |, ?, *, +, (, )), then you'd need to escape those with a \.
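
To make the escaping concrete, here is a minimal sketch with a hypothetical value `A.CD` that contains a dot. The sample lines and the column position 7 are invented for illustration, and plain `awk` is used on the assumption that it behaves like `nawk` here:

```shell
# Hypothetical sample input; the code of interest starts at column 7.
input='hello A.CD x
hello AXCD x
hello MNOP x'

# Escaped dot: matches only a literal "." in second position.
escaped=$(printf '%s\n' "$input" | awk 'substr($0,7,4) ~ /A\.CD|MNOP/')

# Unescaped dot: "." matches any character, so AXCD slips through too.
unescaped=$(printf '%s\n' "$input" | awk 'substr($0,7,4) ~ /A.CD|MNOP/')

printf 'escaped:\n%s\n\nunescaped:\n%s\n' "$escaped" "$unescaped"
```

With the escape, two lines match (A.CD and MNOP). Without it, all three do.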

devnull
  • 1
    Or even `substr($0,42,4)~"ABCD|MNOP"` – fedorqui Oct 31 '13 at 12:22
  • 1
    I was typing *awk* instead of *nawk* (man, I feel like such a plonker!) As soon as I *nawked* my *awk*, the tilde-bar command worked very nicely. Very many thanks, Peeps!!! – HugMyster Oct 31 '13 at 12:46
  • Use RE constants, not string constants, when applying an RE comparison, i.e. use `/` to delimit your REs, not `"`. As written you'd need to double-escape every metacharacter - `/\./` is better than `"\\."`. – Ed Morton Oct 31 '13 at 13:25
  • @EdMorton That's what I'd written but some [awk experts](http://stackoverflow.com/users/1983854/fedorqui) suggested that I should be using `"` instead. – devnull Oct 31 '13 at 13:29
  • @EdMorton You might want to direct your message in response to [this comment](http://stackoverflow.com/questions/19706227/how-do-i-check-for-a-nawk-substring-being-several-possible-values/19706398?noredirect=1#comment29272639_19706398) instead. – devnull Oct 31 '13 at 13:29
  • I was directing my comment to the OP. Using `"` in this context is just plain wrong as it complicates your code if you do have RE metacharacters and makes it LOOK like you're doing a string comparison when you're not. – Ed Morton Oct 31 '13 at 13:31
  • @EdMorton It was foolish on my part to change it on the basis of somebody's comment. People on SO have a tendency to keep remarking on other answers especially when they have themselves posted one. – devnull Oct 31 '13 at 13:34
  • I didn't suggest you should be using `"` instead, just pointed out that it could be another way to do it... I like doing it and people normally appreciate it. I don't like each one of the people answering to just keep looking at their post and make it a simple competition of who is getting it accepted. – fedorqui Oct 31 '13 at 13:42
  • @fedorqui But the manner in which it was __suggested__ essentially implied that it was currently being done __incorrectly__. – devnull Oct 31 '13 at 13:45
  • Was it? Believe me it was not. I normally check other answers before commenting and yours was (is, in fact) working fine. Just found out another way and preferred to comment instead of getting your idea in my post - to avoid *copying* it. – fedorqui Oct 31 '13 at 13:47
1

You said "string" not "RE" so this is the approach to take for a string comparison against multiple values:

awk -v strs='ABCD MNOP' '
BEGIN {
    split(strs,tmp)
    for (i in tmp)
        strings[tmp[i]]
}
substr($0,42,4) in strings
' file
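
A quick way to try the same idea on made-up data, sketched with the code at columns 7-10 instead of 42-45 and with hypothetical sample lines. The loop here uses the count returned by split() rather than `for (i in tmp)`, which is only a stylistic variation and gives the same result:

```shell
# Hypothetical sample input; the code of interest starts at column 7.
result=$(printf 'hello ABCD x\nhello ABCX x\nhello MNOP x\n' |
awk -v strs='ABCD MNOP' '
BEGIN {
    n = split(strs, tmp)        # tmp[1]="ABCD", tmp[2]="MNOP"
    for (i = 1; i <= n; i++)
        strings[tmp[i]]         # a bare reference creates the key
}
substr($0,7,4) in strings       # exact string match, no regex involved
')
printf '%s\n' "$result"
```

The ABCX line is rejected because the lookup is an exact key match, so no escaping is needed for values that happen to contain regex metacharacters.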
Ed Morton