
I started monitoring the disk on my server because my website is slow.

Here are my disk statistics:

Read: 0.29 reqs/s Write: 50.19 reqs/s

I don't understand why there are 50 write requests per second on my disk.

How can I find the program that is writing so heavily? Is there a command-line tool or a program for this?

Update: The server runs Debian etch. The disk is on a SAN, so it's a virtual disk.

My database is MySQL and my website runs on Ruby on Rails.

I have 1 GB of RAM.

Here is the result of the free command:

free -m   
             total       used       free     shared    buffers     cached
Mem:           995        769        225          0         46        421
-/+ buffers/cache:        301        693
Swap:         1906          0       1906

I don't think it's a swap problem. I don't understand :(
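
The figures above can be double-checked with a few lines of Python. This is only a sketch, assuming the older `free -m` layout shown here (the one with a `-/+ buffers/cache` row) and Python 3; `parse_free` and `SAMPLE` are names invented for the illustration. Subtracting buffers and cache from the "used" column gives the memory that applications really hold, and the Swap row tells whether any swap is in use.

```python
def parse_free(text):
    """Parse the classic `free -m` layout into a dict of megabyte values."""
    info = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "Mem:":
            info["mem_total"] = int(parts[1])
            info["mem_used"] = int(parts[2])
            info["buffers"] = int(parts[5])
            info["cached"] = int(parts[6])
        elif parts[0] == "Swap:":
            info["swap_used"] = int(parts[2])
    # Memory held by applications, i.e. not reclaimable buffers or page cache.
    info["app_used"] = info["mem_used"] - info["buffers"] - info["cached"]
    return info


# The output pasted above, copied as-is.
SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:           995        769        225          0         46        421
-/+ buffers/cache:        301        693
Swap:         1906          0       1906
"""

info = parse_free(SAMPLE)
print(info["app_used"], info["swap_used"])
```

With these figures the script reports about 302 MB held by applications (free itself prints 301 because it rounds from kilobytes) and 0 MB of swap in use, which is consistent with swapping not being the cause.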

OK, after some searching, I found this:

Postfix adds about 10 entries per second to my syslog file. The log looks like this:

Feb 16 10:51:41 myhost postfix/local[24480]: 333902F1CE: to=<ovh@mail.monsite.com>, orig_to=<root>, relay=local, delay=73, delays=42/25/0/6.1, dsn=5.1.1, status=bounced (unknown user: "ovh")
Feb 16 10:51:41 myhost postfix/qmgr[3753]: 528032F1D3: removed
Feb 16 10:51:41 myhost postfix/cleanup[24624]: CEBAD2F1D4: message-id=<20100216095120.CEBAD2F1D4@mail.monsite.com>
Feb 16 10:51:41 myhost postfix/bounce[24575]: B8EE32F19B: sender non-delivery notification: DDE2D2F1DE
Feb 16 10:51:41 myhost postfix/qmgr[3753]: DDE2D2F1DE: from=<>, size=2798, nrcpt=1 (queue active)
Feb 16 10:51:41 myhost postfix/cleanup[25934]: 659B02F1D3: message-id=<20100216095138.659B02F1D3@mail.monsite.com>
Feb 16 10:51:41 myhost postfix/qmgr[3753]: B8EE32F19B: removed
Feb 16 10:51:41 myhost postfix/local[24948]: DDE2D2F1DE: to=<ovh@mail.monsite.com>, orig_to=<root@mail.monsite.com>, relay=local, delay=15, delays=12/2/0/1.3, dsn=5.1.1, status=bounced (unknown user: "ovh")
Feb 16 10:51:41 myhost postfix/bounce[24726]: 333902F1CE: sender non-delivery notification: 659B02F1D3
Feb 16 10:51:41 myhost postfix/qmgr[3753]: CEBAD2F1D4: from=<root@mail.monsite.com>, size=983, nrcpt=1 (queue active)
Feb 16 10:51:41 myhost postfix/qmgr[3753]: 333902F1CE: removed
Feb 16 10:51:41 myhost postfix/qmgr[3753]: 659B02F1D3: from=<>, size=2792, nrcpt=1 (queue active)
Feb 16 10:51:41 myhost postfix/qmgr[3753]: DDE2D2F1DE: removed
Feb 16 10:51:47 myhost postfix/local[24480]: 659B02F1D3: to=<ovh@mail.monsite.com>, orig_to=<root@mail.monsite.com>, relay=local, delay=8.7, delays=3.3/0/0/5.4, dsn=5.1.1, status=bounced (unknown user: "ovh")
Feb 16 10:51:47 myhost postfix/local[25978]: CEBAD2F1D4: to=<ovh@mail.monsite.com>, orig_to=<root>, relay=local, delay=32, delays=27/0/0/5.4, dsn=5.1.1, status=bounced (unknown user: "ovh")
Feb 16 10:51:47 myhost postfix/qmgr[3753]: 659B02F1D3: removed
Feb 16 10:51:47 myhost postfix/cleanup[24906]: 1A7512F19B: message-id=<20100216095147.1A7512F19B@mail.monsite.com>
Feb 16 10:51:53 myhost postfix/bounce[24726]: CEBAD2F1D4: sender non-delivery notification: 1A7512F19B
Feb 16 10:51:53 myhost postfix/qmgr[3753]: CEBAD2F1D4: removed
Feb 16 10:51:53 myhost postfix/qmgr[3753]: 1A7512F19B: from=<>, size=2798, nrcpt=1 (queue active)
Feb 16 10:51:59 myhost postfix/local[24948]: 1A7512F19B: to=<ovh@mail.monsite.com>, orig_to=<root@mail.monsite.com>, relay=local, delay=12, delays=6.2/0/0/6.1, dsn=5.1.1, status=bounced (unknown user: "ovh")
Feb 16 10:51:59 myhost postfix/qmgr[3753]: 1A7512F19B: removed
Feb 16 10:52:11 myhost /USR/SBIN/CRON[25984]: (root) CMD (/usr/local/rtm/bin/rtm 18 > /dev/null 2> /dev/null)
Feb 16 10:52:11 myhost /USR/SBIN/CRON[25985]: (root) CMD (wget -O /dev/null http://monsite.com/cron/desactive_arene)
Feb 16 10:52:16 myhost /USR/SBIN/CRON[25987]: (root) CMD (run-parts /usr/local/oco/bin/60sec >/dev/null 2>/dev/null)
Feb 16 10:52:22 myhost /USR/SBIN/CRON[25988]: (root) CMD (run-parts /usr/local/oco/bin/120sec >/dev/null 2>/dev/null)
Feb 16 10:52:39 myhost postfix/pickup[23034]: 8A8CC2F1BD: uid=0 from=<root>
Feb 16 10:52:44 myhost postfix/cleanup[24624]: 8A8CC2F1BD: message-id=<20100216095205.8A8CC2F1BD@mail.monsite.com>
Feb 16 10:52:44 myhost postfix/qmgr[3753]: 8A8CC2F1BD: from=<root@mail.monsite.com>, size=983, nrcpt=1 (queue active)
Feb 16 10:52:44 myhost postfix/local[24480]: 8A8CC2F1BD: to=<ovh@mail.monsite.com>, orig_to=<root>, relay=local, delay=21, delays=15/0/0/6.1, dsn=5.1.1, status=bounced (unknown user: "ovh")

And here I don't understand what is happening.
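
One way to see which program fills the log is to count lines per program name. The sketch below assumes Python 3 and the classic syslog line format shown in the excerpt; `LINE_RE` and `count_by_program` are hypothetical names, and the sample lines are copied from the log above. In that excerpt, mail for root is rewritten to the user "ovh", which does not exist, so each message bounces and generates a new notification.

```python
import re
from collections import Counter

# Classic syslog prefix: "Feb 16 10:51:41 myhost program[pid]: message"
LINE_RE = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+([^\[:\s]+)(?:\[\d+\])?:")


def count_by_program(lines):
    """Count log lines per program name (the part before the [pid])."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines taken from the excerpt above.
sample = [
    "Feb 16 10:51:41 myhost postfix/qmgr[3753]: 528032F1D3: removed",
    "Feb 16 10:51:41 myhost postfix/qmgr[3753]: B8EE32F19B: removed",
    "Feb 16 10:51:41 myhost postfix/bounce[24575]: B8EE32F19B: "
    "sender non-delivery notification: DDE2D2F1DE",
    "Feb 16 10:52:11 myhost /USR/SBIN/CRON[25985]: (root) CMD "
    "(wget -O /dev/null http://monsite.com/cron/desactive_arene)",
]

counts = count_by_program(sample)
for program, n in counts.most_common():
    print(n, program)

# On a real machine, pass an open file instead of the sample list, e.g.
# (the path is an assumption, adjust it to your syslog configuration):
#     with open("/var/log/syslog") as fh:
#         print(count_by_program(fh).most_common(10))
```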

sysadmin1138
Kiva
  • Oh wow, how about some details please - machine, OS, storage, application - anything would help as you've provided almost no information here. – Chopper3 Feb 15 '10 at 18:57
  • Logging? Swap file? Database? – lrosa Feb 15 '10 at 19:10
  • And don't forget to tell us whether your disk is an actual physical disk, a virtual disk or a disk array (and what array configuration that is). – John Gardeniers Feb 15 '10 at 21:03

2 Answers

1

Besides iotop as suggested above, you don't have a LOG_ALL enabled somewhere, do you?

Other hints that might help:

  • look for directories that have the highest number of files
  • or that are just awfully big in size compared to others
  • check for swapping issues: do you have swap space? is it enabled? is your memory full, so that the machine is continuously trying to swap to disk?
  • what happens if you stop some processes (like the db)? Does the behaviour change? Perhaps what you need is not to monitor the offending process, but instead to do some performance evaluation on the application (like mysql) with the appropriate tools
  • lastly.. you sure it's not some sort of firewall logging mechanism due to a DoS attack?
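
The first two hints can be sketched in Python. This is just one possible approach, assuming Python 3 is available; `dir_stats` and `top` are hypothetical helper names, and the demo builds a throwaway directory tree so that it runs anywhere. On the real server you would point `dir_stats` at something like `/var` instead.

```python
import os
import tempfile


def dir_stats(root):
    """Return {directory: (file_count, total_bytes)} for each directory under root."""
    stats = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        count = 0
        total = 0
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
                count += 1
            except OSError:
                continue  # file vanished or is unreadable, skip it
        stats[dirpath] = (count, total)
    return stats


def top(stats, key_index, n=10):
    """Sort directories by file count (key_index=0) or by size (key_index=1)."""
    return sorted(stats.items(), key=lambda kv: kv[1][key_index], reverse=True)[:n]


# Demo on a temporary tree: one directory with many files, one with a big file.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "many"))
    for i in range(5):
        open(os.path.join(tmp, "many", "f%d" % i), "w").close()
    os.mkdir(os.path.join(tmp, "big"))
    with open(os.path.join(tmp, "big", "blob"), "wb") as fh:
        fh.write(b"x" * 4096)

    demo = dir_stats(tmp)
    most_files = top(demo, 0, 1)[0][0]
    biggest = top(demo, 1, 1)[0][0]

print(os.path.basename(most_files), os.path.basename(biggest))
```

Running it twice a few minutes apart and comparing the numbers would also show which directories are growing.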
lorenzog
  • So, I optimized MySQL a month ago and I think it's fine now (much better performance). Otherwise, how can I tell whether swap is enabled? I have 1 GB of RAM but 900 MB is used. How can I free memory without restarting the server? I use iptables as a firewall. How can I find this log and how do I disable it? I'm not a good network admin, so sorry if my questions are bad :) – Kiva Feb 16 '10 at 08:19
  • I looked at my auth log and I see this: Feb 16 09:50:44 CRON[22565]: (pam_unix) session closed for user root Feb 16 09:50:47 CRON[22574]: (pam_unix) session closed for user root Feb 16 09:50:51 CRON[22573]: (pam_unix) session closed for user root A new line is added every 3 to 4 seconds. What is it? – Kiva Feb 16 '10 at 08:46
  • ok, if it's really a cron job running every few seconds you might have a problem. Go check which jobs are defined in /etc/cron* and in your cron file (`crontab -e`). You can take a look at the swap state with 'free' and 'top'. You can also enable some logging in your iptables. Also check the size of the files in `/var/log/`..? – lorenzog Feb 16 '10 at 11:31
0

If it's a recent Linux box, iotop is the program you want to run.
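
iotop relies on the kernel's per-process I/O accounting, which is also exposed in `/proc/<pid>/io` on kernels built with task I/O accounting (2.6.20 or later, so the stock etch kernel may be too old). If iotop can't be installed, the same counters can be sampled by hand. The following is a rough sketch assuming Python 3 and root access; `parse_proc_io`, `snapshot`, `top_writers` and `watch` are hypothetical names, not part of iotop.

```python
import os
import time


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = int(value)
    return fields


def snapshot():
    """Return {pid: (command line, write_bytes)} for every readable process."""
    snap = {}
    if not os.path.isdir("/proc"):
        return snap
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open("/proc/%s/io" % pid) as fh:
                io = parse_proc_io(fh.read())
            with open("/proc/%s/cmdline" % pid) as fh:
                name = fh.read().replace("\0", " ").strip()
        except OSError:
            continue  # process exited, or we lack permission
        snap[int(pid)] = (name, io.get("write_bytes", 0))
    return snap


def top_writers(before, after, n=5):
    """Rank processes by bytes written between two snapshots."""
    deltas = []
    for pid, (name, written) in after.items():
        if pid in before:
            deltas.append((written - before[pid][1], pid, name))
    return sorted(deltas, reverse=True)[:n]


def watch(seconds=5):
    """Take two snapshots `seconds` apart and print the heaviest writers."""
    first = snapshot()
    time.sleep(seconds)
    for delta, pid, name in top_writers(first, snapshot()):
        print(delta, pid, name)
```

Calling `watch(5)` as root prints the processes that wrote the most bytes during those five seconds, which is roughly what iotop shows interactively.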

Joel K