
https://www.baeldung.com/linux/remove-last-n-lines-of-file

awk -v n=3 'NR==FNR{total=NR;next} FNR==total-n+1{exit} 1' input.txt input.txt 
01 is my line number. Keep me please!
02 is my line number. Keep me please!
03 is my line number. Keep me please!
04 is my line number. Keep me please!
05 is my line number. Keep me please!
06 is my line number. Keep me please!
07 is my line number. Keep me please!

Here is a way to remove the last n lines, but it is not done in place, the file is read twice, and it only handles one file at a time.

How can I remove the last n lines of many files in place, without opening each file more than once, using a single gawk command and no other external commands?

user1424739
  • You could try like: `awk -v n="3" -v total=$(wc -l < Input_file) 'FNR==total-n+1{exit} 1' Input_file` which will help you to get lines with single run itself, cheers. – RavinderSingh13 Dec 30 '22 at 02:28
  • I just want to use gawk without using any other external commands. – user1424739 Dec 30 '22 at 02:29
  • Any particular reason to choose `awk`? Using `head -n -3` would likely be the fastest solution. You'll have to add code for inplace editing, but that would be similar to what `inplace` option does for you. – Sundeep Dec 30 '22 at 06:00
  • @Sundeep : you'll be surprised how little time savings there is with `head` (`N = 1639779`): `in0: 100MiB 0:00:00 [1006MiB/s] [1006MiB/s] [=> ] 10% ETA 0:00:00 out9: 715MiB 0:00:00 [ 748MiB/s] [ 748MiB/s] [ <=> ] in0: 988MiB 0:00:00 [1.07GiB/s] [1.07GiB/s] [======================>] 100% ( pvE 0.1 in0 < "${fn1}" | ghead -n -"$N"; ) 0.43s user 0.73s system 117% cpu 0.982 total 6fa4d6fbf7a4900db216024d220322c9 stdin` …... – RARE Kpop Manifesto Dec 30 '22 at 06:20
  • @Sundeep : ….. `in0: 338MiB 0:00:00 [3.30GiB/s] [3.30GiB/s] [=======> ] 34% ETA 0:00:00 out9: 715MiB 0:00:01 [ 684MiB/s] [ 684MiB/s] [ <=> ] in0: 988MiB 0:00:00 [3.28GiB/s] [3.28GiB/s] [======================>] 100% ( pvE 0.1 in0 < "${fn1}" | mawk2 -v N="$N" ; ) 0.30s user 0.68s system 90% cpu 1.071 total 6fa4d6fbf7a4900db216024d220322c9 stdin` – RARE Kpop Manifesto Dec 30 '22 at 06:20
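Sundeep's `head` route from the comments can be sketched as follows (a minimal sketch, assuming GNU `head`, which accepts a negative line count; `demo.txt` is a made-up file name). Since `head` has no in-place mode, each file goes through a temporary file:

```shell
# Create a 5-line sample file, then drop its last 3 lines "in place"
# by writing head's output to a temp file and renaming it back.
printf '%s\n' 1 2 3 4 5 > demo.txt
for f in demo.txt; do
  head -n -3 -- "$f" > "$f.tmp" && mv -- "$f.tmp" "$f"
done
cat demo.txt    # only lines 1 and 2 remain
```

This still relies on external commands (`head`, `mv`), which is exactly what the question wants to avoid.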

4 Answers


With your shown samples, please try the following awk code. It does not use any external utilities, as requested in the question. We can make use of awk's END block here.

awk -v n="3" '
{
  total=FNR          # number of lines seen so far
  lines[FNR]=$0      # buffer every line in an array
}
END{
  till=total-n       # print everything except the last n lines
  for(i=1;i<=till;i++){
    print lines[i]
  }
}
' Input_file
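For instance, with a hypothetical 7-line `Input_file` and `n=3`, the buffered-array approach above prints only the first 4 lines (note the output goes to stdout, so this is still not in place):

```shell
# Build a 7-line sample, then run the answer's script with n=3.
printf '%s\n' 01 02 03 04 05 06 07 > Input_file
awk -v n=3 '
{ total=FNR; lines[FNR]=$0 }
END { for (i=1; i<=total-n; i++) print lines[i] }
' Input_file    # prints 01 through 04
```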
RavinderSingh13

A single-pass awk solution that requires neither arrays nor gawk (unless your file is over 500 MB, in which case it might be slightly slower):

rm -f file.txt

jot -c 30 51 > file.txt

gcat -n file.txt | rs -t -c$'\n' -C'#' 0 5 | column -s'#' -t

 1  3       7   9      13   ?      19   E      25   K
 2  4       8   :      14   @      20   F      26   L
 3  5       9   ;      15   A      21   G      27   M
 4  6      10   <      16   B      22   H      28   N
 5  7      11   =      17   C      23   I      29   O
 6  8      12   >      18   D      24   J      30   P
mawk -v __='file.txt' -v N='13' 'BEGIN { 

OFS = FS = RS
      RS = "^$"

getline <(__); close(__)
  
print $!(NF -= NF < (N+=_==$NF) ? NF : N) >(__) }'
gcat -n file.txt | rs -t -c$'\n' -C'#' 6 | column -s'#' -t ;


 1  3       7   9      13   ?
 2  4       8   :      14   @
 3  5       9   ;      15   A
 4  6      10   <      16   B
 5  7      11   =      17   C
 6  8      12   >

Speed is hardly a concern:

115K rows 198 MB file took 0.254 secs
rows       = 115567. | UTF8 chars = 133793410. | bytes      = 207390680.

( mawk2 -v __="${fn1}" -v N='13' ; )  
0.04s user 0.20s system 94% cpu 0.254 total
 
rows       = 115554. | UTF8 chars = 133779254. | bytes      = 207370006.
5.98 million rows 988 MB file took 1.44 secs
rows       = 5983333. | UTF8 chars = 969069988. | bytes      = 1036334374.

( mawk2 -v __="${fn1}" -v N='13' ; )
0.33s user 1.07s system 97% cpu 1.435 total
 
rows       = 5983320. | UTF8 chars = 969068062. | bytes      = 1036332426.
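The core trick above is slurping the whole file into a single record with `RS = "^$"` and then shrinking `NF` to chop off the trailing lines. A plainer sketch of the same idea (an assumption-laden illustration: it needs gawk or mawk for the multi-character `RS`, a file that fits in memory, and a trailing newline, which leaves an empty last field to account for):

```shell
# Slurp the file as one record; each line becomes one field.
printf '%s\n' 1 2 3 4 5 > slurp.txt
awk -v n=2 'BEGIN { RS = "^$"; FS = OFS = "\n" }
{ NF -= n + 1; print }' slurp.txt    # prints 1, 2, 3
```

Decrementing `NF` rebuilds `$0` from the remaining fields, which is how the answer avoids an array entirely.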
RARE Kpop Manifesto

Another way to do it, using GAWK's special patterns BEGINFILE and ENDFILE:

# invoked e.g. as: gawk -v n=3 -f thisScript.awk file1 file2 ...
{ lines[++numLines] = $0 }
BEGINFILE { fname=FILENAME }
ENDFILE { prt() }

function prt(   lineNr,maxLines) {
    close(fname)
    printf "" > fname                  # truncate the file just read
    maxLines = numLines - n
    for ( lineNr=1; lineNr<=maxLines; lineNr++ ) {
            print lines[lineNr] > fname
    }
    close(fname)
    numLines = 0                       # reset the buffer for the next file
}
Luuk

I find that this is the most succinct solution to the problem.

$ gawk -i inplace -v n=3 -v ORS= -e '{ lines[FNR]=$0 RT }
ENDFILE {
    for(i=1;i<=FNR-n;++i) {
        print lines[i]
    }
}' -- file{1..3}.txt
user1424739