
I'm trying to track IO usage of a few "suspect" processes over time. Ideally I would like to make those stats available via Munin, or push them directly to Graphite. However, as a starting point it would be great to be able to track this usage over time in a reasonably space/resource-efficient way, and then be able to retrieve, say, the top 10 IO consumers and their read/write stats over a period of time.

pidstat -d 2 seems to produce great output of the top IO-consuming processes every few seconds. I understand that sar has some built-in archiving, but I'm not sure how to make it archive the stats I get from pidstat (maybe via SA1_OPTIONS? I'm on Debian/Ubuntu).
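
For reference, since it's only a few suspect processes, here's roughly how I'm invoking it for just those PIDs (a sketch, assuming the processes can be matched by name with pgrep; the pattern "suspectname" is just a placeholder):

# report per-process disk IO every 2 seconds, limited to PIDs whose name matches
pidstat -d 2 -p "$(pgrep -d, suspectname)"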

Other than piping pidstat output to disk and then running some collection/aggregation over it (roughly as sketched below), is there any way to do the same thing more efficiently with sar and then retrieve the stats later?
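
To make that "pipe to disk and aggregate" baseline concrete, this is the kind of thing I had in mind (a sketch only; the field numbers assume a pidstat -d layout with kB_rd/s in column 4, kB_wr/s in column 5 and the command name last, which varies between sysstat versions, so check against your own header line):

# record a sample every 2 seconds for an hour into a plain log file
pidstat -d 2 1800 > /var/log/pidstat-io.log

# sum read/write rates per command and print the 10 biggest writers
# output columns: total write kB/s, total read kB/s, command
awk '/^[0-9]/ && $4 ~ /^[0-9]/ { rd[$NF] += $4; wr[$NF] += $5 }
     END { for (c in wr) printf "%12.1f %12.1f  %s\n", wr[c], rd[c], c }' /var/log/pidstat-io.log \
  | sort -rn | head -10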

Yoav Aner

1 Answer


I came up with this ad-hoc bash script, which forwards pidstat data (per-process IO reads and writes) to Graphite via carbon:

#!/bin/bash

hostname=$(hostname -s)
carbon_host=YOUR_CARBON_HOSTNAME_OR_IP
carbon_port=2003

# pidstat -h -d 1 emits one line per process per second with an epoch timestamp in $1;
# the greps drop blank lines, "#" header lines and the "Linux ..." banner, and awk reshapes
# each line into carbon plaintext metrics, written straight to carbon via bash's /dev/tcp.
pidstat -h -d 1 \
  | grep --line-buffered -v '^$' | grep --line-buffered -v '^#' | grep --line-buffered -v '^Linux' \
  | awk --assign=hostname=${hostname} '{ printf "servers.%s.pidstat.%s.read %s %s\nservers.%s.pidstat.%s.write %s %s\n", hostname, $6, $3, $1, hostname, $6, $4, $1 ; fflush(); }' \
  > /dev/tcp/${carbon_host}/${carbon_port}
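
One caveat with the /dev/tcp redirection: if the connection to carbon drops, the whole pipeline dies silently. A rough way to harden it (a sketch, not something I've run long-term) is to factor the pipeline into a function and wrap it in a retry loop, sending to carbon with nc instead:

# restart the forwarder if the connection to carbon goes away
forward_pidstat() {
    pidstat -h -d 1 \
      | grep --line-buffered -v -e '^$' -e '^#' -e '^Linux' \
      | awk --assign=hostname=${hostname} '{ printf "servers.%s.pidstat.%s.read %s %s\nservers.%s.pidstat.%s.write %s %s\n", hostname, $6, $3, $1, hostname, $6, $4, $1 ; fflush(); }'
}

while true; do
    forward_pidstat | nc "${carbon_host}" "${carbon_port}"
    sleep 5   # back off briefly before reconnecting
done
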
Yoav Aner