I'm new to Logstash and I'm using it to parse 500 MB of log files in a particular directory. Currently, when I start Logstash, it doesn't show any progress bar indicating what percentage of the log files it has parsed. Is there any way to see the progress of the parsing?

Avis

2 Answers


No, Logstash has no built-in progress bar feature. Most of the time one wouldn't make sense, since Logstash is meant to process ever-growing logs continuously, so there isn't really any "done".

What you could do is correlate the contents of the sincedb file with the size of the corresponding log file. The sincedb file is where Logstash stores its current offset in each file it reads. An exact description of the file format is found in the file input's documentation, but you mainly have to care about the first and last columns. The first column is the inode number, which can also be found in the ls -li output for a file, and the last column is the current offset. Example:

393309 0 64773 437

Here, Logstash is at offset 437 for the file with inode 393309.
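As a minimal sketch, those two columns can be pulled out of a sincedb file with awk. This assumes the four-column layout shown above (inode, major, minor, offset); if your version of the file input writes extra columns, the offset is no longer the last one, which is why the sketch reads field 4 explicitly. The function name `sincedb_offsets` is made up for illustration.

```shell
# Print "inode offset" for every entry in a sincedb file.
# Assumption: four-column layout "inode major minor offset", as in the example above.
sincedb_offsets() {
    awk 'NF >= 4 { print $1, $4 }' "$1"
}
```

Call it with the path to your own sincedb file, for example `sincedb_offsets /var/lib/logstash/.sincedb_f5fdf6ea0ea92860c6a6b2b354bfcbbc`.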

The join command can be used to join this file with the ls -li output (where the file's inode number is in the first column):

$ join /var/lib/logstash/.sincedb_f5fdf6ea0ea92860c6a6b2b354bfcbbc <(ls -li /var/log/syslog)
393309 0 64773 437 -rw-r----- 1 root adm 437 Oct 15 12:47 /var/log/syslog

Finally, awk can be used to clean up the output and produce a percent-completed number:

$ join /var/lib/logstash/.sincedb_f5fdf6ea0ea92860c6a6b2b354bfcbbc <(ls -li /var/log/syslog) | awk '{ printf "%-30s%.1f%%\n", $13, 100 * $4 / $9 }'
/var/log/syslog               100.0%
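The percentage arithmetic on its own can be sketched as a small helper, which also makes it easy to guard against empty files (a size of 0 would otherwise cause a division by zero in awk). The name `percent_done` is hypothetical; it takes an offset and a file size in bytes.

```shell
# Print how much of a file has been read, given an offset and a size in bytes.
# Prints "n/a" when the size is zero or missing, to avoid dividing by zero.
percent_done() {
    awk -v off="$1" -v size="$2" 'BEGIN {
        if (size > 0) printf "%.1f%%\n", 100 * off / size
        else print "n/a"
    }'
}

percent_done 437 437   # prints 100.0%
```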
Magnus Bäck
  • You can also use the metrics {} filter to print some output while Logstash is running (e.g. every 1,000 records), or count the records that exist in Elasticsearch. – Alain Collins Oct 15 '15 at 13:47
  • Note: to make Logstash stop and quit after processing the files, set `mode => "read"` and `exit_after_read => true` on the file input. See https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html – ricka Oct 28 '20 at 02:40

I've modified Magnus' script to also list the files that have not yet been parsed as 0.0%:

PATH_TO_SINCEDBS=/var/data/logstash/plugins/inputs/file
FILES_TO_BE_PARSED="/tmp/*.log /log/*.log /log/parsed/*.log"

tmpfile=$(mktemp); tmpfile2=$(mktemp)

sort ${PATH_TO_SINCEDBS}/.sincedb_* | awk '{ print $1" "$4 }' > ${tmpfile}
stat -c "%i %n %s" ${FILES_TO_BE_PARSED} | sort > ${tmpfile2}

join -a 1 ${tmpfile2} ${tmpfile} | awk '{ printf "%-30s %.1f%%\n", $2, ($3 > 0 ? 100 * $4 / $3 : 0) }'

rm -f ${tmpfile} ${tmpfile2}

Unfortunately, when you're using the file input with 'start_position => "beginning"' in Logstash 5, it does not write anything to the sincedb file until it's done, or at least that is the behaviour I'm getting.

Chris