Shell script numbering lines in a file

Question

I need a faster way to number the lines of a file in a specific pattern, using tools like awk and sed. Each line should be prefixed with a number that cycles 1, 2, 3, 1, 2, 3, and so on.

For example, if the input was this:

line 1
line 2
line 3
line 4
line 5
line 6
line 7

The output needs to look like this:

1line 1
2line 2
3line 3
1line 4
2line 5
3line 6
1line 7

Here is a chunk of what I have. $lines is the number of lines in the data file divided by 3. So for a file of 21000 lines I process this loop 7000 times.

export i=0
while [ $i -le $lines ]
do
    export start=`expr $i \* 3 + 1`
    export end=`expr $start + 2`
    awk NR==$start,NR==$end $1 | awk '{printf("%d%s\n", NR,$0)}' >> data.out
    export i=`expr $i + 1`
done

Basically this grabs 3 lines at a time, numbers them, and adds to an output file. It's slow...and then some! I don't know of another, faster, way to do this...any thoughts?

Bill Karwin · Answer 1 · 2017-01-10T16:06:04.200

16

Try the nl command.

See https://linux.die.net/man/1/nl (or another link to the documentation that comes up when you Google for "man nl" or the text version that comes up when you run man nl at a shell prompt).

The nl utility reads lines from the named file, or from standard input if the file argument is omitted, applies a configurable line-numbering filter, and writes the result to standard output.

edit: No, that's wrong, my apologies. The nl command doesn't have an option for restarting the numbering every n lines, it only has an option for restarting the numbering after it finds a pattern. I'll make this answer a community wiki answer because it might help someone to know about nl.

edited Jan 10 '17 at 16:06

answered Dec 08 '08 at 20:12

Bill Karwin

Love the Unix tools attempt at an answer in a scripting question. There is also "cat -n" as a less polished nl. And for the reflective student of sed, the following can be modified to get the exact answer desired: http://www.gnu.org/software/sed/manual/sed.html#cat-_002dn – jaredor Dec 12 '08 at 16:28
@jaredor you should add that as an answer! – Pithikos Jun 02 '16 at 11:01
The rt.com link has become stale. – crw Jan 10 '17 at 13:48

score 10 · Accepted Answer · answered Dec 08 '08 at 20:19

It's slow because you are reading the same lines over and over. Also, you are starting up an awk process only to shut it down and start another one. Better to do the whole thing in one shot:

awk '{print ((NR-1)%3)+1 $0}' "$1" > data.out

If you prefer to have a space after the number:

awk '{print ((NR-1)%3)+1, $0}' "$1" > data.out
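
As a minimal sketch of the single-pass idea above, the awk call can be wrapped in a shell function that takes the cycle length as an argument. The function name number_cyclic and the awk variable n are hypothetical names chosen for this illustration; they are not part of the original answer. It uses printf rather than print, which is a stylistic choice here, not a requirement.

```shell
#!/bin/sh
# number_cyclic (hypothetical helper): read standard input and prefix each
# line with 1..n, restarting after every n lines. n defaults to 3.
number_cyclic() {
    awk -v n="${1:-3}" '{ printf "%d%s\n", (NR - 1) % n + 1, $0 }'
}

# Example: number seven lines in a single pass over the input.
printf 'line %d\n' 1 2 3 4 5 6 7 | number_cyclic 3
```

Because awk reads the input exactly once and only one process is started, the cost grows linearly with the file size instead of re-reading the file for every group of three lines.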

score 2 · Answer 3 · answered Dec 08 '08 at 20:09

Perl comes to mind:

perl -pe '$_ = (($.-1)%3)+1 . $_'

should work. No doubt there is an awk equivalent. Basically, ((line# - 1) MOD 3) + 1.
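
To see why the formula subtracts 1 before taking the remainder and adds 1 afterwards, here is a small sketch in plain shell arithmetic. The function name cycle_index is hypothetical, introduced only for this illustration. A bare line % 3 would produce 1, 2, 0, so the shift is what keeps the result in the range 1 to 3.

```shell
#!/bin/sh
# cycle_index (hypothetical helper): map a 1-based line number to its
# position in a repeating 1..period cycle. period defaults to 3.
cycle_index() {
    line=$1
    period=${2:-3}
    echo $(( (line - 1) % period + 1 ))
}

# Show the mapping for the first seven line numbers.
for n in 1 2 3 4 5 6 7; do
    printf '%d -> %d\n' "$n" "$(cycle_index "$n")"
done
```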

derobert

perl -ne 'printf "%d%s", (($.-1)%3)+1, $_' :-D – Jonathan Leffler Dec 08 '08 at 20:21
Jonathan: Why use printf? Seems like derobert's answer is more straightforward. – Jon 'links in bio' Ericson Dec 08 '08 at 20:25
Because using printf() doesn't modify $_; there might even be a time saving, though it is unlikely to be sufficient to worry about. – Jonathan Leffler Dec 08 '08 at 20:38
If you want optimization, print will probably be faster than printf. – derobert Dec 08 '08 at 21:03

score 2 · Answer 4 · answered Oct 24 '19 at 14:25

Another way is to use grep and match everything. For example, this will enumerate files:

grep -n '.*' <<< "$(ls -1)"

Output will be:

1:file.a
2:file.b
3:file.c

Dmitry

potong · Answer 5 · 2011-12-12T11:18:16.077

2

This might work for you:

 sed 's/^/1/;n;s/^/2/;n;s/^/3/' input

edited Dec 12 '11 at 11:18

answered Nov 21 '11 at 23:55

potong

score 1 · Answer 6 · answered Dec 08 '08 at 20:19

awk '{printf "%d%s\n", ((NR-1) % 3) + 1, $0;}' "$@"

Jonathan Leffler

score 1 · Answer 7 · answered Dec 08 '08 at 20:23

Python

import sys
# Prefix each line from stdin with 1, 2, 3, 1, 2, 3, ...
for count, line in enumerate(sys.stdin):
    sys.stdout.write("%d%s" % (1 + (count % 3), line))

S.Lott

score 1 · Answer 8 · answered Jan 04 '09 at 14:30

You don't need to leave bash for this:

i=0; while read; do echo "$((i++ % 3 + 1)) $REPLY"; done < input

PEZ

This will not work in case the input ever contains leading space or backslashes. – Jens May 17 '12 at 15:41
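
A sketch of a more defensive pure-shell loop, written with the concern in the comment above in mind: IFS= stops read from trimming whitespace, -r stops it from interpreting backslashes, and printf avoids echo's implementation-dependent handling of escapes. The function name number_lines is hypothetical, and the space after the number is dropped here to match the output format requested in the question.

```shell
#!/bin/sh
# number_lines (hypothetical helper): read standard input and prefix each
# line with 1, 2, 3, 1, 2, 3, ... without starting any external process.
number_lines() {
    i=0
    # IFS= keeps leading and trailing whitespace; -r keeps backslashes literal.
    # The "|| [ -n "$line" ]" clause also handles a last line with no newline.
    while IFS= read -r line || [ -n "$line" ]; do
        i=$(( i % 3 + 1 ))          # cycles 1, 2, 3, 1, 2, 3, ...
        printf '%d%s\n' "$i" "$line"
    done
}

# Example input with leading spaces and a backslash.
printf '  indented\nback\\slash\n' | number_lines
```

A shell read loop handles one line per iteration in the shell itself, so for large files the single awk call is still likely to be the faster option.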

score 0 · Answer 9 · answered Dec 10 '08 at 04:17

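This should solve the problem. $0 holds the whole input line. (The Perl-style $_ happens to give the same result in awk, but only because _ is an uninitialized variable, so $_ evaluates to $0.)

# cat input
line 1
line 2
line 3
line 4
line 5
line 6
line 7

# awk '{print ((NR-1)%3+1) $0}' < input
1line 1
2line 2
3line 3
1line 4
2line 5
3line 6
1line 7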

Ganesh M
