13

Using sed or similar how would you extract lines from a file? If I wanted lines 1, 5, 1010, 20503 from a file, how would I get these 4 lines?

What if I have a fairly large number of lines I need to extract? If I had a file with 100 lines, each representing a line number that I wanted to extract from another file, how would I do that?

Jon Seigel
monkeyking

6 Answers

17

Something like "sed -n '1p;5p;1010p;20503p' file". Run "man sed" for details.

For your second question, I'd transform the input file into a bunch of sed(1) commands to print the lines I wanted.
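
A minimal sketch of that idea, assuming a POSIX shell with `mktemp` available and a line-number file holding one number per line. The function name `extract_lines_sed` and its argument order are made up for illustration. Each number N becomes the sed command `Np`, and the generated script is then run against the data file:

```shell
# extract_lines_sed NUMS DATA  (hypothetical helper)
# Print the lines of DATA whose numbers are listed in NUMS, one number per line.
extract_lines_sed() {
    script=$(mktemp) || return 1
    # Turn each purely numeric line N into the sed command "Np";
    # anything else (blank lines, stray text) is dropped.
    sed -n 's/^[0-9][0-9]*$/&p/p' "$1" > "$script"
    # Run the generated script; -n suppresses sed's default output.
    sed -n -f "$script" "$2"
    rm -f "$script"
}
```

Called as `extract_lines_sed line_num_file data_file`, this reads the data file once and prints the selected lines in the order they occur in the data file, whatever the order of the numbers.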

Steve Emmerson
6

With awk it's as simple as:

awk 'NR==1 || NR==5 || NR==1010 || NR==20503' "file"
Dennis Williamson
ennuikiller
3

@OP, you can do this more easily and more efficiently with awk. For your first question:

awk 'NR~/^(1|5|1010|20503)$/{print}' file

For the second question:

awk 'FNR==NR{a[$1];next}(FNR in a){print}' file_with_linenr file
ghostdog74
  • The second response is a bit obfuscated. To explain: `FNR==NR` will occur only when reading `file_with_linenr`, not `file`. In this case, the text of the line is added to a set `a`, and execution skips to the next line of input. Thus when reading from `file`, only the `(FNR in a)` case applies, and prints the text of the relevant line if its number was put in `a` in parsing `file_with_linenr`. – joeln Jun 08 '14 at 11:32
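The explanation above can be spelled out as a commented version of the same two-file awk idiom. This is only a sketch: the wrapper name `extract_lines_awk` and the array name `wanted` are illustrative, and it assumes the line-number file is non-empty (with an empty first file, `FNR == NR` would also be true while reading the data file).

```shell
# extract_lines_awk NUMS DATA  (hypothetical helper)
# Print the lines of DATA whose numbers appear in the first column of NUMS.
extract_lines_awk() {
    awk '
        # While reading the first file, NR (global record count)
        # equals FNR (per-file record count). Store the number
        # as an array key and skip to the next record.
        FNR == NR { wanted[$1]; next }
        # While reading the second file, print the record if its
        # line number was stored as a key.
        FNR in wanted
    ' "$1" "$2"
}
```

Usage would be `extract_lines_awk file_with_linenr file`. The lookup is a hash test per line, so the cost does not grow with the number of requested lines the way a long chain of `NR==...` tests would.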
1

This ain't pretty and it could exceed command length limits under some circumstances*:

sed -n "$(while read a; do echo "${a}p;"; done < line_num_file)" data_file

Or its much slower but more attractive, and possibly more well-behaved, sibling:

while read a; do echo "${a}p;"; done < line_num_file | xargs -I{} sed -n \{\} data_file

A variation:

xargs -a line_num_file -I{} sed -n \{\}p\; data_file

You can speed up the xargs versions a little bit by adding the -P option with some large argument like, say, 83 or maybe 419 or even 1177, but 10 seems as good as any.

*xargs --show-limits </dev/null can be instructive
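
One behavioural difference between the variants above is worth seeing side by side. This sketch uses the made-up function names `single_pass` and `per_number`, and it uses plain `xargs -I{}` reading standard input instead of the GNU-only `-a` option. The single-sed form scans the data file once and prints lines in data-file order, while the xargs form starts one sed per number and prints lines in the order the numbers are listed:

```shell
# single_pass NUMS DATA  (hypothetical helper)
# One sed process, one pass over DATA: output is in DATA order.
single_pass() {
    sed -n "$(sed -n 's/^[0-9][0-9]*$/&p/p' "$1")" "$2"
}

# per_number NUMS DATA  (hypothetical helper)
# One sed process per requested number: output follows NUMS order.
per_number() {
    xargs -I{} sed -n '{}p' "$2" < "$1"
}
```

If the number file lists 5, 1, 3, then `single_pass` prints lines 1, 3, 5 and `per_number` prints lines 5, 1, 3. Adding -P to the xargs form runs the sed processes in parallel, so the output order is then no longer predictable.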

Dennis Williamson
0

I'd investigate Perl, since it has the regexp facilities of sed plus the programming model surrounding it to allow you to read a file line by line, count the lines and extract according to what you want (including from a file of line numbers).

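my %wanted = map { $_ => 1 } (1, 5, 1010, 20503);  # line numbers to extract
my $row = 1;
while (<STDIN>) {
   # The current line is in $_; print it if $row is one of the wanted numbers.
   print if $wanted{$row};
   $row++;
}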
Brian Agnew
  • and you can use perl -e 'perlcode here' from the command prompt. Perl also has a range operator .. as in 3..12 which will allow you to create a list of numbers where needed. – Christian V Jan 06 '10 at 23:22
  • You should be using `$.`, which automagically contains the current line number – Hasturkun Jan 06 '10 at 23:35
  • Anybody interested in Perl command line techniques might want to look at Minimal Perl, from Manning... http://manning.com/maher/ – Joe Internet Jan 07 '10 at 06:36
0

In Perl:

perl -ne 'print if $. =~ m/^(1|5|1010|20503)$/' file
ire_and_curses