0

Starting with the number 1, how can I prepend that number to each of the next x lines, then prepend 2 to each of the following x lines, and so on, stopping after y lines?

1 jeriro ieieie ieiue
1 ieirirp wzwezeg
1 ieueujueu ueuuwiuyh
2 iejejrökx lek
2 kejejhejhe pmys
2 krejrjhrjh hegehe
3 ririrjfjf
3 iririr iezete
3 pgogto

4 Answers

2

Hope it helps! Here `a` is the number of lines per group and `b` is the starting number:

-bash-4.1$ awk -v a=3 -v b=1 'c<a{print b $0; c+=1} c==a{c=0;b+=1}' file
1asdasdasd
1asdas
1asd
2asd
2asd
2asd
3as
3asd
-bash-4.1$ awk -v a=4 -v b=1 'c<a{print b $0; c+=1} c==a{c=0;b+=1}' file
1asdasdasd
1asdas
1asd
1asd
2asd
2asd
2as
2asd

EDIT: After testing both approaches (this one and the one posted by @Ed Morton), I observed a noticeable difference in performance between them:

$ cat lanza.sh 
date
awk -v x=3 '{print (NR%x?c+1:++c), $0}' file.dat > file.dat1
date
awk -v a=3 -v b=1 'c<a{print b"  "$0; c+=1} c==a{c=0;b+=1}' file.dat > file.dat2
date
$ ./lanza.sh
EXEC1 
jueves,  7 de mayo de 2015, 22:01:17 CEST
jueves,  7 de mayo de 2015, 22:02:41 CEST
jueves,  7 de mayo de 2015, 22:04:09 CEST
EXEC2 (REVERSE ORDER FOR AWKS IN lanza.sh) 
jueves,  7 de mayo de 2015, 22:07:56 CEST
jueves,  7 de mayo de 2015, 22:09:24 CEST
jueves,  7 de mayo de 2015, 22:11:01 CEST
EXEC3 (REVERSE ORDER FOR AWKS IN lanza.sh) 
jueves,  7 de mayo de 2015, 22:12:14 CEST
jueves,  7 de mayo de 2015, 22:13:57 CEST
jueves,  7 de mayo de 2015, 22:15:20 CEST

$ wc -l file.dat
 30522352 file.dat
$ wc -l file.dat1
 30522352 file.dat1
$ wc -l file.dat2
 30522352 file.dat2

As can be seen, the version using the modulo (%) operator (@Ed Morton's code) ran about 5%-10% faster than the counter-comparison version. More testing may be needed, but the difference was already visible on the first try.

Counter comparison (c<a, c==a) times -> 1m28s, 1m37s, 1m43s

Modulo (%) operator times -> 1m24s, 1m28s, 1m33s
  • 1
    You might want to use the UNIX `time` utility (e.g. `time awk '...' file`) instead of printing dates before/after each command and run each command 3 times before recording the time to remove any caching issues. – Ed Morton May 07 '15 at 20:44
2
$ awk -v x=3 '{print (NR%x?c+1:++c), $0}' file
1 jeriro ieieie ieiue
1 ieirirp wzwezeg
1 ieueujueu ueuuwiuyh
2 iejejrökx lek
2 kejejhejhe pmys
2 krejrjhrjh hegehe
3 ririrjfjf
3 iririr iezete
3 pgogto

$ awk -v x=4 '{print (NR%x?c+1:++c), $0}' file
1 jeriro ieieie ieiue
1 ieirirp wzwezeg
1 ieueujueu ueuuwiuyh
1 iejejrökx lek
2 kejejhejhe pmys
2 krejrjhrjh hegehe
2 ririrjfjf
2 iririr iezete
3 pgogto
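
The question also asks to stop after y lines, which the command above does not do. A minimal sketch of one way to add that, assuming the limit is passed as a second awk variable named `y` (the variable name, the file name `sample_y.txt` and its contents are made up for illustration):

```shell
# Hypothetical 9-line input file, used only for this illustration.
printf '%s\n' a b c d e f g h i > sample_y.txt

# x = lines per group, y = total number of lines to number.
# NR>y{exit} stops reading as soon as line y+1 is reached.
awk -v x=3 -v y=7 'NR>y{exit} {print (NR%x?c+1:++c), $0}' sample_y.txt
# prints:
# 1 a
# 1 b
# 1 c
# 2 d
# 2 e
# 2 f
# 3 g
```

If the remaining lines should be passed through unnumbered rather than dropped, `NR>y{print; next}` could be used in place of `NR>y{exit}`.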
Ed Morton
  • Just a comment: I've also posted a different solution that does not use the modulo operator. I assumed a very large file might be processed, in which case the NR%x operation will always consume more resources than the c – Alejandro Teixeira Muñoz May 07 '15 at 13:12
  • Whether you use the modulus operator or some other math op will make [close to] no difference to performance - it will run in the blink of an eye either way. – Ed Morton May 07 '15 at 13:50
  • 1
    I've just edited my answer. I tested both versions and yours seems to be faster :) Nice, thanks! – Alejandro Teixeira Muñoz May 07 '15 at 20:38
0
perl -ne 'print 1+int(($.-1)/4) ," $_"'

or

perl -ne 'printf "%d %s", (3+$.)/4 ,$_'
JJoao
0
awk -v Cycle=3 '{print int((NR+Cycle-1)/Cycle), $0}' YourFile
# or
awk -v Cycle=3 '{printf "%d %s\n", (NR+Cycle-1)/Cycle, $0}' YourFile

# simplified for Cycle=3
awk '{print int((NR+2)/3), $0}' YourFile
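
As a quick sanity check, the division form above and the modulo form from the other answer can be compared on the same input. This is only a sketch: `sample_cmp.txt`, `by_modulo.txt` and `by_division.txt` are hypothetical file names chosen for the illustration.

```shell
# Hypothetical 10-line input: the numbers 1 to 10, one per line.
seq 1 10 > sample_cmp.txt

# Group number from a running counter plus the modulo test.
awk -v x=3 '{print (NR%x?c+1:++c), $0}' sample_cmp.txt > by_modulo.txt

# Group number computed directly from NR by integer division.
awk -v Cycle=3 '{print int((NR+Cycle-1)/Cycle), $0}' sample_cmp.txt > by_division.txt

# cmp exits with status 0 only when both files are byte-for-byte identical.
cmp by_modulo.txt by_division.txt && echo "identical"
```

The division form keeps no state between lines, so the group number depends only on NR.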
NeronLeVelu