1

I have a RHEL 5 server that recently ran out of disk space, and the nightly Logwatch report for it now shows the following disk usage (I think this is from the last accurate run before the /var partition filled up):

Filesystem            Size  Used Avail Use% Mounted on
 /dev/mapper/VolGroup00-LogVol00
                        62G  3.8G   55G   7% /
 /dev/mapper/VolGroup01-LogVol00
                       198G  185G  2.8G  99% /var
 /dev/cciss/c0d0p1      99M   24M   70M  26% /boot

If I log into the server and run df -h manually, I get the following result:

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       62G   14G   46G  23% /
/dev/mapper/VolGroup01-LogVol00
                      198G  174G   14G  93% /var
/dev/cciss/c0d0p1      99M   24M   70M  26% /boot

I checked /usr/share/logwatch/default.conf/logwatch.conf and found that the temp directory is /var/cache/logwatch, but that directory is empty. Does anyone know what would cause Logwatch to display stale data like this?
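
For reference, this is roughly how I checked the configuration and when the nightly run is scheduled (the TmpDir directive is what sets the temp directory; the cron grep just locates whichever entry runs Logwatch on this box):

# Confirm the configured temp directory (the TmpDir directive)
grep -i '^TmpDir' /usr/share/logwatch/default.conf/logwatch.conf

# Find when the nightly Logwatch run is actually scheduled (usually a cron.daily job)
grep -ril logwatch /etc/cron*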

Scott Keck-Warren
  • Data is obviously skewed. Run logwatch manually, or run your "comparison" at the exact same time the system runs its own. – Tim Jan 03 '12 at 16:49
  • @Tim, do you want to add your comment as an answer so I can accept it (it was the key piece I didn't think about doing). I'll put more information in for people who are curious in my own answer. :-) – Scott Keck-Warren Jan 09 '12 at 14:05
  • Done, like a good mentor of mine always says, "all it takes sometimes is a second pair of eyes" ;) – Tim Jan 09 '12 at 14:11

2 Answers

1

Data is obviously skewed. Run logwatch manually, or run your "comparison" at the exact same time the system runs its own.
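
A rough sketch of that comparison, assuming the stock Logwatch on RHEL 5 (its disk report comes from the zz-disk_space service; adjust the service name if yours differs):

# Generate the disk-space section on demand and capture df at the same moment
# (zz-disk_space is the stock disk-space service name; adjust if yours differs)
logwatch --service zz-disk_space --range today --print
df -h

Run back to back like this, both commands describe the same instant, so any difference from the nightly report points at usage changing over time rather than at Logwatch caching stale data.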

Tim
0

@Tim asked the question that brought me down this path, so I'm giving him credit for the correct answer.

The problem wasn't that the data was skewed but that a couple of processes were causing the used disk space to fluctuate wildly. This server runs six instances of Moodle with staggered backups scheduled throughout the night. Some of the backups were failing to complete, and they didn't clean up their temporary files. It appears another process comes along later and removes those temporary files, and that cleanup happened somewhere between when Logwatch ran (4 AM) and when I checked manually (8 AM).
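
For anyone wanting to confirm this kind of fluctuation, a simple sketch is to log df for the affected filesystem overnight from cron (the 15-minute interval and the log path under /root are arbitrary choices; /root is used so the log doesn't depend on /var having free space):

# root's crontab (crontab -e): sample /var usage every 15 minutes
*/15 * * * * ( /bin/date; /bin/df -h /var ) >> /root/df-var.log 2>&1

Comparing the timestamps in that log against the backup schedule makes the space being consumed and then released between 4 AM and 8 AM easy to see.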

Scott Keck-Warren
  • I recommend running constant monitoring, which would make things like this more likely to get spotted. I currently use http://collectd.org/ for the data gathering and a slightly modified copy of http://haroon.sis.utoronto.ca/rrd/scripts/ to draw the pretty pictures (a graph from the df module would have helped you spot the space being used temporarily). There are a number of other solutions out there (cacti, munin, zabbix, ...), so have a look around to make sure there isn't something better suited to your needs. – David Spillett Jan 09 '12 at 16:38
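
If you go the collectd route mentioned above, the per-filesystem graphs come from its df plugin; a minimal sketch of enabling it (the /etc/collectd.d/ drop-in path is an assumption about the packaging, so adjust for your install):

# Have collectd sample the /var filesystem continuously via its df plugin
# (the /etc/collectd.d/ path assumes a packaging that includes that directory)
cat >> /etc/collectd.d/df.conf <<'EOF'
LoadPlugin df
<Plugin df>
  # Only collect the mount point listed here
  MountPoint "/var"
  IgnoreSelected false
</Plugin>
EOF

A dip-and-recover pattern in the resulting /var graph between 4 AM and 8 AM would have made the failing backups' temporary files obvious.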