Find specific pattern and print complete text block using awk or sed

Question

How can I find a specific number in a text block and print the complete text block, beginning with the keyword "BEGIN" and ending with "END"? Basically, this is what my file looks like:

BEGIN
A: abc
B: 12345
C: def
END

BEGIN
A: xyz
B: 56789
C: abc
END

BEGIN
A: ghi
B: 56712
C: pqr
END

[...]

If I was looking for '^B: 567', I would like to get this output:

BEGIN
A: xyz
B: 56789
C: abc
END

BEGIN
A: ghi
B: 56712
C: pqr
END

I could use grep here (grep -E -B2 -A2 "^B: 567" file), but I would like to get a more general solution. I guess awk or sed might be able to do this!?

Thanks! :)

Ed Morton · Accepted Answer · 2013-10-09T11:11:52.927

9

$ awk -v RS= -v ORS='\n\n' '/\nB: 567/' file
BEGIN
A: xyz
B: 56789
C: abc
END

BEGIN
A: ghi
B: 56712
C: pqr
END

Note the \n before B to ensure it occurs at the start of a line. This takes the place of the ^ start-of-string anchor you had originally, since each line is no longer its own string: with RS set to the empty string, awk reads each blank-line-separated block as a single record. You need to set ORS as above to re-insert the blank line between records.
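One thing to keep in mind: `\nB: 567` relies on at least one line preceding the key line inside the record, which holds here because every block starts with BEGIN. A minimal sketch that lifts that assumption and takes the pattern from a variable, still assuming blank-line-separated blocks. The function name `print_blocks` and the awk variable `pat` are made up for illustration and are not part of the answer:

```shell
# Hypothetical helper: print every blank-line-separated block that contains
# a line starting with the given regex. Name and arguments are illustrative.
print_blocks() {
    # RS= switches awk to paragraph mode; ORS restores the blank line between blocks.
    # (^|\n) anchors the pattern at the start of any line within the record,
    # including the very first one.
    awk -v RS= -v ORS='\n\n' -v pat="$1" '$0 ~ ("(^|\n)" pat)' "$2"
}
```

It would be called as `print_blocks 'B: 567' file`. Since the pattern is passed with `-v`, awk interprets backslash escapes in it, so a regex containing backslashes may need them doubled.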

edited Oct 09 '13 at 11:11

answered Oct 09 '13 at 11:06

Ed Morton


score 5 · Answer 2 · answered Oct 09 '13 at 06:15

5

This might work for you (GNU sed):

sed -n '/^BEGIN/{x;d};H;/^END/{x;s/^B: 567/&/mp}' file

or this:

sed -n '/^BEGIN/!b;:a;$!{N;/\nEND/!ba};/\nB: 567/p' file
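For anyone not used to sed's hold space, here is the first command again, spread over several lines with comments. It is meant to be the same script, not a different approach, and it still assumes GNU sed, since the `m` flag on `s` is a GNU extension. The wrapper name `sed_blocks` is only for illustration:

```shell
# Hypothetical wrapper around the first sed command above (GNU sed assumed).
sed_blocks() {
    sed -n '
        /^BEGIN/ { x; d }   # block start: move BEGIN into the hold space, discard leftovers
        H                   # append every other line to the hold space
        /^END/ {            # block end:
            x               #   swap the collected block into the pattern space
            s/^B: 567/&/mp  #   m lets ^ match at each line start; p prints the block on a match
        }
    ' "$1"
}
```

Called as `sed_blocks file`, it prints the matching blocks without a blank line between them, because the separator lines are discarded along with the leftovers at each BEGIN.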

answered Oct 09 '13 at 06:15

potong


Birei · Answer 3 · 2013-10-09T12:51:44.260

2

You can set RS to the empty string so that awk splits records at blank lines, then check whether the pattern matches anywhere in the block:

awk 'BEGIN { RS = "" } /\nB:[[:space:]]+567/ { print $0 ORS }' infile

It yields:

BEGIN
A: xyz
B: 56789
C: abc
END

BEGIN
A: ghi
B: 56712
C: pqr
END

edited Oct 09 '13 at 12:51

answered Oct 08 '13 at 20:27

Birei


1

You don't need `$0 ~` in `$0 ~ /B:[[:space:]]+567/` and you should get rid of the comma in `print $0, ORS` so you don't add a space character after every `END` in the output. You really need to anchor the `B` too in case `B: 567` shows up as text on an `A:...` line, for example. – Ed Morton Oct 09 '13 at 11:21
1

You really should make it `\nB` not just `B`. – Ed Morton Oct 09 '13 at 12:47

anubhava · Answer 4 · 2013-10-09T11:25:14.493

2

This awk should work:

$ awk -v s='B: 567' '$0~s' RS= file
BEGIN
A: xyz
B: 56789
C: abc
END
BEGIN
A: ghi
B: 56712
C: pqr
END

edited Oct 09 '13 at 11:25

answered Oct 08 '13 at 20:42

anubhava


1

@EdMorton: Any search string can be passed in this command. – anubhava Oct 09 '13 at 11:06
That's true, it wouldn't hurt to use the string the OP needs in this example though. – Ed Morton Oct 09 '13 at 11:14

score 2 · Answer 5 · answered Oct 08 '13 at 20:46

2

A bit lengthy, but the RS trick was already posted :-)

BEGIN {found=0;start=0;i=0}


/BEGIN/ {
    start=1
    delete a
}

/.*567.*/ {found=1}

{
    if (start==1) {
        a[i++]=$0
    }
}

/END/ {
    if (found) {
        for (i in a)
            print a[i]
    }
    found=0
    start=0
    delete a
}

Output:

$ awk -f s.awk input
BEGIN
A: xyz
B: 56789
C: abc
END
BEGIN
A: ghi
B: 56712
C: pqr
END

answered Oct 08 '13 at 20:46

Fredrik Pihl


Will produce false matches if 567 appears anywhere on any line and can re-order the text in the output so END comes before BEGIN, or any other permutation, courtesy of `for (i in a)`. – Ed Morton Oct 09 '13 at 11:26
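Both problems named in this comment can be avoided while keeping the line-by-line approach, which also does not depend on blank lines between blocks. A rough sketch, not the answerer's script: the function name `print_matching_blocks` and the variables `pat`, `buf`, `n` and `inblock` are invented for the example, and it assumes blocks are not nested:

```shell
# Hypothetical helper: buffer each BEGIN..END block in input order and print
# it only if one of its lines starts with the given regex.
print_matching_blocks() {
    awk -v pat="$1" '
        /^BEGIN/ { inblock = 1; n = 0; found = 0 }   # start a fresh buffer
        inblock {
            buf[++n] = $0                            # keep lines in input order
            if ($0 ~ ("^" pat)) found = 1            # anchored, per-line test
        }
        /^END/ && inblock {
            if (found) {
                for (j = 1; j <= n; j++) print buf[j]  # numeric loop preserves order
                print ""                               # blank line after each block
            }
            inblock = 0
        }
    ' "$2"
}
```

Called as `print_matching_blocks 'B: 567' input`. The counted `for` loop replaces `for (i in a)`, whose iteration order awk leaves unspecified, and the `^` anchor keeps a stray `567` elsewhere on a line from triggering a match.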

Vijay · Answer 6 · 2013-10-09T06:52:57.273

0

perl -lne 'if(/^B: 567/){$f=1}       # flag the block once the key line matches
           push @a,$_;                # buffer every line since the last END
           if(/^END/){
              if($f){print join "\n",@a}
              undef @a;$f=0}' your_file

edited Oct 09 '13 at 06:52

answered Oct 09 '13 at 06:46

Vijay

