-2

The server load is extremely high. I'm trying to diagnose what's going on. Screenshot attached.

Edit: The OS is CentOS 5. It is not running Plesk, cPanel, or any other control panel. The RAID 10 array has one failed drive; the other three are good, and the failed drive will be replaced soon. There have been no traffic spikes. The server load is usually in the 3-4 range. We added 4 GB more RAM, but saw no change. It is a physical server. If Apache is stopped, the load drops below 1.0.


Poe
  • You need to give us more information. What operating system are you running? Do you have a control panel installed? – Luka Mar 14 '13 at 03:32
  • right, I just wasn't sure what additional information was needed. It's running CentOS 5. – Poe Mar 14 '13 at 03:33
  • Do you have a panel installed, like cPanel? – Luka Mar 14 '13 at 03:33
  • no panels are installed – Poe Mar 14 '13 at 03:34
  • I would recommend running *yum update* and then rebooting the server. Then check the load with the top command. Also install the CSF firewall (http://configserver.com/free/csf/install.txt); you may be under a DDoS attack. – Luka Mar 14 '13 at 03:36
  • 1
    You've really provided us with nothing here. Your two highest CPU using processes are related to disk IO. What's the health of your RAID array look like? Did you suddenly start seeing a lot of traffic for whatever site you are hosting? What's your memory usage look like? Is this actually a physical server? If you stop apache, does your load go away? – devicenull Mar 14 '13 at 03:48
  • 1
    @devicenull, right, I understand, and I should have mentioned that in my post. I wasn't sure what to add at first, so I was going to add details as they were requested. Updating the post now. – Poe Mar 14 '13 at 03:52
  • The system CPU percentage is 99.8. That is really high. – Matthew Ife Mar 14 '13 at 07:49
  • 1
    Now that you have edited the post ("The raid10 has 1 failed drive, the other 3 are good."), everything is clear, lol – Luka Mar 14 '13 at 10:55

2 Answers

3

You've got a failed RAID drive. A performance drop is pretty much expected when that's the case. RAID10 saved your data, but you aren't going to see the same performance (as you're essentially operating in RAID1 mode now, not RAID10). I'd expect you to see similar load averages until the drive is replaced and your array has fully rebuilt.

I'd worry about getting that drive replaced ASAP. It would not surprise me if you have another semi-failing drive in your RAID array.
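
If the array happens to be Linux software RAID (md), which the question does not actually say, its state can be read from /proc/mdstat. The following is only a minimal sketch under that assumption: the function name `degraded_arrays` and the `SAMPLE` text are made up for illustration and are not output from this server. With a hardware RAID controller you would need the vendor's own tool instead.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for md arrays with a missing member.

    In /proc/mdstat the member map looks like [UUUU]; an underscore
    marks a slot whose drive is failed or absent.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header line, e.g. "md0 : active raid10 ..."
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Member map on a following line, e.g. "[4/3] [U_UU]"
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded[current] = status.group(1)
    return degraded


# Hypothetical sample text, NOT taken from the poster's server.
SAMPLE = """\
Personalities : [raid10]
md0 : active raid10 sdd1[3] sdc1[2] sdb1[1](F) sda1[0]
      1953260544 blocks 64K chunks 2 near-copies [4/3] [U_UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a real md system you would pass in the contents of /proc/mdstat rather than `SAMPLE`. An empty result only means that no member map contained an underscore; it says nothing about a drive that is slow or semi-failing but still marked as up.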

devicenull
0

You need to tweak the Apache configuration; it looks like many child processes are running. A good way to start is to remove unneeded modules and then tune the maximum connections and related parameters. Be careful with these parameters, as they directly affect the load Apache generates.
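
One common rule of thumb for the "max connections" part, assuming the prefork MPM on Apache 2.2 (where the directive is called MaxClients), is to divide the memory you can spare for Apache by the average size of one child process. The sketch below only illustrates that arithmetic. The function name and every number in it are hypothetical; none of them were measured on this server.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, avg_child_mb):
    """Rule-of-thumb ceiling for Apache prefork MaxClients.

    total_ram_mb: physical RAM in the machine
    reserved_mb:  RAM kept back for the OS, database, caches, etc.
    avg_child_mb: average resident size of one httpd child process
    """
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Never return less than 1, even if the reservation exceeds total RAM.
    return max(1, int(usable_mb // avg_child_mb))


if __name__ == "__main__":
    # Hypothetical figures: 8 GB box, 2 GB reserved, 40 MB per child.
    print(estimate_max_clients(8192, 2048, 40))  # prints 153
```

The point of the cap is to keep Apache from spawning more children than fit in RAM, since swapping would add even more disk I/O to an array that is already degraded. Measure the real per-child size on the server before picking a value.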

  • Could you elaborate on how to do this and be more specific? Which modules? What files? As it stands your answer isn't very helpful. Read the SF FAQ: http://www.serverfault.com/faq – slm Mar 14 '13 at 07:31