I have a command that gives the following output:

#sec one
a : same
b : red
c : one
d :
e :
f :

#sec two
a : same
b : blue
c : two
d :
e :

#sec three
a : different
b : green
c : three
d :
e :

#sec four
a : different
b : yellow
c : four

#sec five
a : different
b : pink
c : five

There are many such sections. I need only the sections where a is "same", and from those sections only the values of the b and c fields.

Sample output:

#sec one
a : same
b : red
c : one


#sec two
a : same
b : blue
c : two

This is what I've done so far, using tr -s to squeeze repeated spaces so the fields are evenly spaced:

mycommand | tr -s " " | cut -d ':' -f 2

Does anyone know another way of doing this, or a way to use conditionals with cut?

Paulo Mattos
Arohi Gupta
  • I would suggest doing this sort of structured parsing in a language other than Bash. It's probably doable, but it's going to be a pain. Try Python, if you've never used it before it'll be a fun exercise. – dimo414 Jun 26 '17 at 23:14
  • Can't be done using awk as well? – Arohi Gupta Jun 26 '17 at 23:19
  • Yes, it's absolutely trivial with awk. Do you want the b and c sections printed because they start with the letters b and c or because they are non-empty on the right of the `:`? – Ed Morton Jun 27 '17 at 05:03

4 Answers

Maybe awk can help you here ;) Try this:

mycommand | tr -d " " | awk -F: '/a:/ {a=$2;} /(b:|c:)/ {if (a == "same") print $2;}'

output:

red
one
blue
two

If you need the field names as well, just replace $2 with $0 in the last print:

mycommand | tr -d " " | awk -F: '/a:/ {a=$2;} /(b:|c:)/ {if (a == "same") print $0;}'

output:

b:red
c:one
b:blue
c:two

By the way, tested on macOS 10.12.4 running awk version 20070501.
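One caveat with tr -d " " is that it also removes spaces inside values (for example "light blue" would become "lightblue"). A minimal sketch of a variant that skips tr, assuming every field line looks like "name : value": awk splits on the colon together with any surrounding spaces, and compares the field name exactly instead of using an unanchored regex. The names sample and filter_same are hypothetical stand-ins, with sample playing the role of mycommand.

```shell
# Stand-in for `mycommand`: prints two sections of the sample input.
sample() {
  printf '%s\n' '#sec one' 'a : same' 'b : red' 'c : one' 'd :' '' \
                '#sec three' 'a : different' 'b : green' 'c : three'
}

# Split each line on the colon plus surrounding spaces, remember the most
# recent value of "a", and print b/c lines only while that value is "same".
filter_same() {
  awk -F' *: *' '
    $1 == "a"              { a = $2 }
    $1 == "b" || $1 == "c" { if (a == "same") print $1 ":" $2 }
  '
}

out=$(sample | filter_same)
printf '%s\n' "$out"
```

As in the one-liner above, the value of a carries over from one section to the next, so this relies on every section having an a line.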

Paulo Mattos

awk to the rescue!

$ awk -v RS= -F'\n' '/a : same/{print $1;
                                for(i=2;i<=NF;i++) if($i~/^(a|b|c)/) print $i;
                                print ""}' file

#sec one
a : same
b : red
c : one

#sec two
a : same
b : blue
c : two
karakfa

I find that when your input contains name->value pairs, it's best to first create an array that represents that mapping; then you can get at field values just by using their names, e.g.:

$ cat tst.awk
BEGIN { RS=""; ORS="\n\n"; FS=OFS="\n" }
{
    delete n2v
    for (i=2;i<=NF;i++) {
        name = value = $i
        sub(/[[:space:]]*:.*/,"",name)
        sub(/^[^:]+:[[:space:]]*/,"",value)
        n2v[name] = value
    }
}
n2v["a"] == "same" { print $1, p("a"), p("b"), p("c") }
function p(n) { return (n " : " n2v[n]) }

$ awk -f tst.awk file
#sec one
a : same
b : red
c : one

#sec two
a : same
b : blue
c : two

That way you can trivially and robustly modify your script to print whatever fields you want for whatever reasons you want in whatever order you want by just tweaking the last 2 lines of the script.
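As an illustration of that kind of tweak, here is a sketch that keeps the name->value loop and changes only the final rule, so that it selects the "different" sections and prints c before b. The input variable is a hypothetical stand-in for the real file, and [ \t] is used in place of [[:space:]] only so the sketch also runs on older awk builds that lack POSIX character classes.

```shell
# Hypothetical sample input standing in for the real command output.
input='#sec one
a : same
b : red
c : one
d :

#sec three
a : different
b : green
c : three
d :'

out=$(printf '%s\n' "$input" | awk '
BEGIN { RS=""; ORS="\n\n"; FS=OFS="\n" }
{
    # Rebuild the name->value map for each blank-line-separated section.
    delete n2v
    for (i = 2; i <= NF; i++) {
        name = value = $i
        sub(/[ \t]*:.*/, "", name)
        sub(/^[^:]+:[ \t]*/, "", value)
        n2v[name] = value
    }
}
# Tweaked rule: select "different" sections and print c before b.
n2v["a"] == "different" { print $1, p("c"), p("b") }
function p(n) { return (n " : " n2v[n]) }
')
printf '%s\n' "$out"
```

Only the condition and the argument list of print changed; the parsing block does not need to know which fields end up being printed.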

Ed Morton

Two one-liners:

  1. GNU grep method:

    grep --group-separator= -B1 -A2 '^a : same$' input_file
    

    Output:

    #sec one
    a : same
    b : red
    c : one
    
    #sec two
    a : same
    b : blue
    c : two
    
  2. A little buffer juggling with sed:

    sed -n '/^a : same$/{x;p;x;p;n;p;n;p;z;p};h' input_file
    

    Output:

    #sec one
    a : same
    b : red
    c : one
    
    #sec two
    a : same
    b : blue
    c : two
    

    How it works:

    • /^a : same$/ selects the section to print. That line is never the first line of input (there's always a preceding comment line), so on the first line the only command that runs is h, which overwrites whatever is in the "hold" buffer with the current line.

    • On every later cycle, therefore, the hold buffer contains the previous line and the pattern buffer contains the current line.

    • When /^a : same$/ matches, the commands in curly braces run: x exchanges the pattern and hold buffers and p prints what was in the hold buffer (the comment line); x exchanges them back and p prints the pattern buffer (the matched line); n;p then runs twice, each time fetching the next line and printing it; finally z zaps (empties) the pattern buffer and p prints it, which produces a blank line. Note that z is a GNU sed extension.
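The same steps can be laid out one command per line with comments, which may be easier to follow. This is only a sketch: the input variable is a hypothetical stand-in for input_file, and s/.*// replaces z as an assumption-free way to empty the pattern buffer on sed implementations that don't have z.

```shell
# Hypothetical sample input standing in for input_file.
input='#sec one
a : same
b : red
c : one
d :

#sec three
a : different
b : green
c : three'

out=$(printf '%s\n' "$input" | sed -n '
/^a : same$/ {
    # swap in the previous line (the comment line) and print it
    x
    p
    # swap back and print the matched "a : same" line
    x
    p
    # fetch and print the next two lines (b and c)
    n
    p
    n
    p
    # empty the pattern buffer and print it as a blank separator
    s/.*//
    p
}
# always remember the current line for the next cycle
h
')
printf '%s\n' "$out"
```

Like the one-liner, this assumes b and c are always the two lines directly after the a line.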
agc