
Size of log file: 10 GB

Free space in partition: 6 GB

I need to split the log file into smaller pieces and gzip them, but there isn't enough room to run something like split(1), which leaves the original file intact. That would leave us with:

Original log file: 10 GB

Output of split: another 10 GB

Is there a way to split the file inline, or to do something like this:

$ tail -nnn bigfile.txt > piece.txt

$ some-command -nnn bigfile.txt # just truncate last nnn lines

$ gzip piece.txt

(repeat)

Finding a utility like "some-command" would be ok too.

2 Answers


Here's how I would do it -

Decide how many lines of the log you want in each split file; let's say 10000 lines for this example. You also need enough free space to hold at least one compressed split file.

First, copy the log lines into your target file

 head -n 10000 source.log | gzip -c > split001.log.gz

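Next, use sed to delete the lines you just copied. Be aware that sed -i is not truly in-place: it writes the remaining lines to a temporary file and then renames that over the original, so it needs free space for whatever is left of the source. With a 10 GB log and 6 GB free, the first chunk would have to remove roughly 4 GB or more.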

 sed -i '1,10000d' source.log

With a little effort, you could wrap the above in a script that loops until the source file is empty, incrementing the split filename along the way.

--- EDIT ---

Ok, I'll save you the trouble -

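 #!/bin/bash
 # Split a file into gzip-compressed chunks of N lines, deleting each
 # chunk from the source once it has been written.

 if [ $# -ne 3 ]
 then
    echo "usage:  $0 <source file> <target prefix> <N lines per chunk>" >&2
    exit 1
 fi

 filename=$1
 target_stem=$2
 target_lines=$3

 count=1
 # Loop until the source is empty (-s is true for a non-empty file);
 # this avoids re-counting every line of a huge file on each pass.
 while [ -s "$filename" ]
 do
    # Compress the first N lines; stop on failure so nothing is lost.
    head -n "$target_lines" "$filename" | gzip -c > "${target_stem}${count}.txt.gz" || exit 1
    # Delete those N lines from the source; stop if sed cannot do it,
    # otherwise the loop would never end.
    sed -i "1,${target_lines}d" "$filename" || exit 1

    count=$((count + 1))
 done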
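
A possible variation on the loop above, closer to the tail-and-truncate idea in the question: peel chunks off the end of the file and shrink it with truncate, so the source is never rewritten. This is only a sketch. It assumes GNU coreutils (tail, wc, truncate), and split_from_end is a made-up name for illustration.

```shell
#!/bin/bash
# Sketch: peel chunks off the END of a file so the source shrinks in place.
# Assumes GNU coreutils; split_from_end is a hypothetical helper name.
split_from_end() {
    local file=$1 stem=$2 lines=$3
    local count=1 bytes
    while [ -s "$file" ]; do
        # Size in bytes of the last $lines lines.
        bytes=$(tail -n "$lines" "$file" | wc -c)
        # Compress that tail; stop on failure so nothing is lost.
        tail -n "$lines" "$file" | gzip -c > "${stem}${count}.txt.gz" || return 1
        # Cut exactly those bytes off the end of the source.
        truncate -s "-${bytes}" "$file" || return 1
        count=$((count + 1))
    done
}
```

Called as split_from_end source.log piece 10000, this leaves the pieces in reverse order: piece1 holds the last lines of the log and the highest-numbered piece holds the first. Nothing may be appending to the log while it runs.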
Bill B

You could use the split command, which lets you set the piece size in bytes or in lines:

split -db 10k bigfile.txt bigfile_

Note: the output files will be named bigfile_00, bigfile_01, and so on, because we are using the -d switch.

With the -l switch we can set the number of lines per piece, for example:

split -dl 1000 bigfile.txt bigfile_

You could then periodically stop the job (Ctrl+Z), gzip the finished pieces, and use "fg" to resume the stopped job.
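
If your split is the GNU one, the stop-and-gzip routine can probably be skipped by letting split compress each piece as it writes it. A minimal sketch, assuming GNU split with the --filter option (coreutils 8.13 or later); split_gzip is just a hypothetical wrapper name.

```shell
#!/bin/bash
# Sketch: have split pipe every piece through gzip instead of writing it raw.
# Assumes GNU split with --filter; split_gzip is a hypothetical helper name.
split_gzip() {
    local file=$1 prefix=$2 lines=$3
    # split sets $FILE to each output name. The single quotes keep this
    # shell from expanding it, so the filter's own shell sees it instead.
    split -d -l "$lines" --filter='gzip -c > "$FILE.gz"' "$file" "$prefix"
}
```

For example, split_gzip bigfile.txt bigfile_ 1000 would write bigfile_00.gz, bigfile_01.gz, and so on, with no uncompressed pieces on disk. The original file stays intact, so you only need room for the compressed output.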

Prix