1

Assume we have a file containing the following:

chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
chapter 2 blah blah

We want to grep this file and extract the lines from chapter 1 blah blah through blah num (the line before the next chapter).

The only things we know are

  1. the starting string chapter 1 blah blah
  2. somewhere after that there is another line starting with chapter

A naive way to do this is

grep -A <num> -i "chapter 1" <file>

with a <num> large enough that the whole chapter is included.

Cœur

3 Answers

2
sed -ne '/^chapter 1/,/^chapter/{/^chapter/d;p}' file
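A note on the command above, offered as a sketch: the range `/^chapter 1/,/^chapter/` runs from the chapter 1 heading to the next line starting with chapter, and `/^chapter/d` then drops every heading inside that range, which includes the chapter 1 line itself. If the heading should be kept, as the question asks, one possible variant prints it before the delete. The sample file and the trailing space in `chapter 1 ` (which assumes the heading has text after the number) are assumptions for illustration.

```shell
# Build a small sample file mirroring the question (contents are illustrative).
sample=$(mktemp)
printf '%s\n' \
  'chapter 1 blah blah' \
  'blah num blah num' \
  'num blah num blah' \
  'blah num' \
  'chapter 2 blah blah' > "$sample"

# Command from the answer: deletes every heading in the range,
# so only the body of chapter 1 is printed.
sed_original=$(sed -ne '/^chapter 1/,/^chapter/{/^chapter/d;p}' "$sample")

# Variant (an assumption, not the answer's code): print the chapter 1
# heading first, then delete any heading, then print the body lines.
sed_with_heading=$(sed -n '/^chapter 1 /,/^chapter /{/^chapter 1 /p;/^chapter /d;p;}' "$sample")

printf 'original:\n%s\n\nwith heading:\n%s\n' "$sed_original" "$sed_with_heading"
rm -f "$sample"
```

The extra `;` before the closing brace is there because some non-GNU sed implementations require it.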
Cyrus
1

This is easy to do with awk:

awk '/chapter/ {f=0} /chapter 1/ {f=1} f' file
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num

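It prints a line whenever the flag f is true.
A line matching chapter clears the flag and a line matching chapter 1 sets it again, so printing starts at chapter 1 and stops at the next chapter.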
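One caveat with the flag approach, shown here as a minimal sketch: the unanchored pattern `/chapter 1/` also matches a heading such as chapter 10, and `/chapter/` matches the word anywhere in a line. Assuming headings always start the line and the number is followed by a space, anchoring both patterns avoids that. The sample file, including its chapter 10, is made up for illustration.

```shell
# Sample file with a hypothetical chapter 10 to expose the prefix match.
sample=$(mktemp)
printf '%s\n' \
  'chapter 1 blah blah' \
  'blah num blah num' \
  'num blah num blah' \
  'blah num' \
  'chapter 2 blah blah' \
  'other text' \
  'chapter 10 blah blah' \
  'more text' > "$sample"

# Unanchored patterns from the answer: "chapter 1" also matches "chapter 10",
# so chapter 10 and its body are printed as well.
awk_unanchored=$(awk '/chapter/ {f=0} /chapter 1/ {f=1} f' "$sample")

# Anchored variant (assumes "chapter <n> " at the start of a heading line):
# any heading clears the flag, only the exact chapter 1 heading sets it.
awk_anchored=$(awk '/^chapter / {f=0} /^chapter 1 / {f=1} f' "$sample")

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$awk_unanchored" "$awk_anchored"
rm -f "$sample"
```

The order of the two rules matters: the reset has to run before the set, otherwise the chapter 1 heading would clear its own flag.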


You can also use a range pattern with awk, but it's less flexible if you have other conditions to test.

awk '/chapter 1/,/chapter [^1]/ {if (!/chapter [^1]/) print}' file
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num
Jotne
  • What if the only thing we know is "chapter 1" and that somewhere after it there is a string "chapter" (we don't know its number)? – Giannis Tzagarakis Mar 21 '15 at 09:03
  • @GiannisTzagarakis Updated post to use next chapter of any number to stop the output – Jotne Mar 21 '15 at 09:04
  • just because I want to parse large files, awk seems to be the best solution. It is faster than sed or grep. Thanks! – Giannis Tzagarakis Mar 21 '15 at 09:21
  • btw, do you know how I can pass the chapter number as a bash script variable? $num=1 awk '/chapter/ {f=0} /chapter $num/ {f=1} f' file (awk -v n=$num '/chapter/ {f=0} /chapter n/ {f=1} f' file) is not working – Giannis Tzagarakis Mar 21 '15 at 10:34
  • Here is how `awk -v test="$var" '/chapter/ {f=0} $0~"chapter "test {f=1} f' file`. Then just set `var=2` and it will get chapter `2` – Jotne Mar 21 '15 at 14:29
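The variable-passing exchange in the comments can be sketched as a small script. The shell does not expand `$num` inside single quotes, and a name written inside `/.../` is treated as literal text by awk, so the value is handed over with `-v` and the pattern is built by string concatenation. The names `num` and `n`, the anchoring, and the sample file are assumptions for illustration.

```shell
# Sample file (illustrative contents).
sample=$(mktemp)
printf '%s\n' \
  'chapter 1 blah blah' \
  'blah num' \
  'chapter 2 blah blah' \
  'other text' \
  'chapter 3 blah blah' > "$sample"

# Chapter number held in an ordinary shell variable.
num=2

# -v copies the shell value into the awk variable n; the regex
# "^chapter 2 " is then assembled from strings and matched against $0.
chapter_out=$(awk -v n="$num" '
  /^chapter /                 {f=0}   # any heading stops printing
  $0 ~ ("^chapter " n " ")    {f=1}   # the requested heading starts it
  f                                   # print while the flag is set
' "$sample")

printf '%s\n' "$chapter_out"
rm -f "$sample"
```

Setting `num=1` instead would select the first chapter with the same command.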
1

You can also do this with grep itself, but you need to enable the -P (Perl regexp) and -z options.

$ grep -oPz '^chapter 1[\s\S]*?(?=\nchapter)' file
chapter 1 blah blah
blah num blah num
num blah num blah
...
blah num

[\s\S]*? performs a non-greedy match of zero or more characters, newlines included, and the lookahead (?=\nchapter) stops it just before the next line that starts with the string chapter.

From man grep

-z, --null-data           a data line ends in 0 byte, not newline
-P, --perl-regexp         PATTERN is a Perl regular expression
-o, --only-matching       show only the part of a line matching PATTERN
Avinash Raj