Get the n-th range by pattern

Question

My input is like this:

start
content A
end
garbage
start
content B
end

I want to extract the second (or first, or third ...) start .. end block. With

sed -ne '/start/,/end/p'

I can filter out the garbage, but how do I get just "start content B end"?

Why `sed`? For efficiency reasons? With `awk`, albeit slower, the script is easier to grok in this case. — Feb 16 '11 at 13:47

score 2 · Accepted Answer · 2011-02-18T08:05:22.410

But anyway, if you want sed - you get sed:)

/^start$/{
  x
  s/^/a/
  /^aaa$/{
    x
    :loop
    p
    /^end$/q
    n
    bloop
  }
  x
}

The number of a's in the inner match (`/^aaa$/`) determines which block is printed: three a's select the third block, so use `/^aa$/` for the second. You could also write it as a regexp repetition, as Dennis noted; that approach lets you put the number directly in the script.

Note: the script must be run with sed's -n option.
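
As a rough sketch of how the repetition form lets the block number be passed in, the script can be wrapped in a small shell function. The function name `nth_block` and the variable `count` are made up for illustration; the sed commands are the same as in the script above, with `/^aaa$/` replaced by `a\{N\}`.

```shell
# nth_block N [FILE...]: print the N-th start..end block.
# (nth_block and count are hypothetical names, not part of the original answer.)
nth_block() {
  count=$1
  shift
  # The hold space collects one "a" per "start" line seen so far.
  # Once it holds exactly $count a's, print lines until "end" and quit.
  sed -n '/^start$/{
    x
    s/^/a/
    /^a\{'"$count"'\}$/{
      x
      :loop
      p
      /^end$/q
      n
      bloop
    }
    x
  }' "$@"
}

# Usage with the sample input from the question:
printf 'start\ncontent A\nend\ngarbage\nstart\ncontent B\nend\n' | nth_block 2
# prints:
# start
# content B
# end
```

The single-quoted script is closed only around `"$count"`, so the rest of the sed program needs no extra escaping.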

edited Feb 18 '11 at 08:05

answered Feb 16 '11 at 13:59

+1 You can use an actual number for *n* instead of a sequence of characters of length *n*: `/^a\{3\}$/{`. Note that your script needs to be run with `sed -n`. – Dennis Williamson Feb 16 '11 at 17:12

kurumi · Answer 2 · 2011-02-16T13:55:13.813

Get all ranges

$ awk 'BEGIN{RS="end";FS="start"}{ print $NF}' file

content A


content B

Get 2nd range

$ awk 'BEGIN{RS="end";FS="start"}{c++; if (c==2) print $NF}' file

content B
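
The commands above rely on a multi-character `RS`, which gawk and mawk treat as a regular expression but which POSIX leaves unspecified. If that is a concern, a line-based variant might look like the sketch below. The function name `nth_block_awk` and the variable `n` are illustrative, and unlike the commands above this version keeps the start and end markers in the output.

```shell
# nth_block_awk N [FILE...]: print the N-th start..end block, markers included.
# (nth_block_awk is a hypothetical name used only for this sketch.)
nth_block_awk() {
  n=$1
  shift
  awk -v n="$n" '
    /^start$/         { c++ }    # count block openers
    c == n            { print }  # print every line of the n-th block
    c == n && /^end$/ { exit }   # stop right after its closing marker
  ' "$@"
}

# Usage with the sample input from the question:
printf 'start\ncontent A\nend\ngarbage\nstart\ncontent B\nend\n' | nth_block_awk 2
# prints:
# start
# content B
# end
```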

Ruby(1.9+), get first range

$ ruby -0777 -ne 'puts $_.scan(/start(.*?)end/m)[0]' file

content A

edited Feb 16 '11 at 13:55

answered Feb 16 '11 at 13:49

kurumi

The other way, for traditionalists :) `awk '/start/{c++}c==n&&/start/,/end/'` (set `n` first, e.g. with `-v n=2`) – Feb 16 '11 at 14:29
You can `print FS $NF RS` if you want the block markers included in the output. – Dennis Williamson Feb 16 '11 at 17:16
