3

I've read a lot on this website about optimizing my server, but nothing has really helped :( I think I'm having memory trouble. I have just one website (Drupal) with 860K page views a month, but when traffic increases the load average goes above 40, 70 and so on, with 100% memory use, and the server goes down completely.

Even right after I restart the server, total memory usage is about 80%.

I don't know what to do! I really can't believe this server can't handle this kind of traffic. Please help me!

Specs

Processor #1 to #24 
Intel Dual Xeon E5645 @ 2.40GHz
Cache 12288 KB
4GB Total RAM
Apache/2.2.19 -prefork- (Unix) mod_ssl/2.2.19 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4  PHP/5.2.17
500GB HD RAID 1
Drupal based website with Boost module and Cache Router (INNODB tables)
APC Installed

top (shift - m)

top - 23:05:37 up 19:42,  1 user,  load average: 0.78, 0.74, 0.64
Tasks: 527 total,   1 running, 524 sleeping,   0 stopped,   2 zombie
Cpu(s):  1.7%us,  0.3%sy,  0.0%ni, 97.9%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4034276k total,  3774668k used,   259608k free,   279060k buffers
Swap:  6088624k total,   103616k used,  5985008k free,  1316080k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                       
 8768 mysql     15   0 1211m 880m 4784 S 12.0 22.4  20:43.44 mysqld                                        
20475 nobody    16   0  429m 176m  27m S  0.0  4.5   0:00.44 httpd                                         
20846 nobody    15   0  427m 176m  28m S  1.0  4.5   0:01.13 httpd                                         
20775 nobody    15   0  422m 171m  29m S  0.0  4.4   0:01.22 httpd                                         
20826 nobody    15   0  422m 171m  29m S  0.7  4.4   0:01.00 httpd                                         
20827 nobody    15   0  423m 171m  28m S  0.7  4.4   0:00.61 httpd                                         
20578 nobody    15   0  422m 171m  29m S  0.0  4.3   0:01.73 httpd                                         
20833 nobody    15   0  422m 170m  28m S  0.0  4.3   0:00.84 httpd                                         
20830 nobody    15   0  421m 170m  28m S  0.0  4.3   0:00.84 httpd                                         
20681 nobody    15   0  422m 170m  28m S  1.0  4.3   0:00.93 httpd                                         
20913 nobody    15   0  422m 170m  27m S  0.0  4.3   0:00.34 httpd                                         
20914 nobody    15   0  422m 169m  27m S  0.0  4.3   0:00.60 httpd                                         
20854 nobody    15   0  423m 167m  23m S  0.0  4.2   0:00.36 httpd                                         
20911 nobody    16   0  418m 167m  28m S  0.3  4.2   0:00.70 httpd 

httpd.conf

Timeout 300
TraceEnable On
ServerSignature Off
ServerTokens Full
FileETag All
StartServers 5
<IfModule prefork.c>
MinSpareServers 5
MaxSpareServers 10
</IfModule>
ServerLimit 256
MaxClients 150
MaxRequestsPerChild 800
KeepAlive On
KeepAliveTimeout 5
MaxKeepAliveRequests 100

my.cnf

[mysqld]
max_connections = 120
safe-show-database
skip-locking
key_buffer = 148M
max_allowed_packet = 14M
table_cache = 596
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 2M
myisam_sort_buffer_size = 64M
thread_cache_size = 24
query_cache_size= 128M
thread_concurrency = 48
wait_timeout = 45
innodb_file_per_table
innodb_log_file_size = 10485760
open_files_limit = 8192
tmp_table_size=200M
max_heap_table_size=200M
innodb_buffer_pool_size=596M
local-infile=1
log_slow_queries = /var/log/slow.log
long_query_time = 3

[mysqldump]
quick
max_allowed_packet = 16M

[mysqld_safe]
log-error=/var/log/mysqld.log

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 128M
sort_buffer_size = 64M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 128M
sort_buffer_size = 64M
read_buffer = 2M
write_buffer = 2M

Some graphs (this week)

Update:

Top with server load over 200

top - 12:27:13 up 5 days,  9:04,  1 user,  load average: 219.36, 189.93, 130.56
Tasks: 750 total,   1 running, 749 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.3%us,  1.0%sy,  0.1%ni, 49.7%id, 47.8%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   4034276k total,  4014052k used,    20224k free,    13404k buffers
Swap:  6088624k total,  3036872k used,  3051752k free,    71272k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                       
15653 mysql     15   0 1444m 154m 3384 S  0.0  3.9 192:42.76 mysqld                                        
23800 nobody    16   0  422m  96m  29m D  0.3  2.4   0:06.96 httpd                                         
23527 nobody    16   0  422m  93m  29m D  0.0  2.4   0:05.11 httpd                                         
23759 nobody    15   0  421m  91m  26m S  0.7  2.3   0:03.97 httpd                                         
23842 nobody    16   0  422m  91m  26m D  0.3  2.3   0:04.88 httpd                                         
23819 nobody    16   0  421m  91m  26m D  0.3  2.3   0:07.11 httpd                                         
23739 nobody    16   0  421m  91m  26m D  0.0  2.3   0:10.27 httpd                                         
23778 nobody    15   0  421m  91m  26m S  0.0  2.3   0:04.81 httpd                                         
23790 nobody    15   0  421m  91m  26m S  0.0  2.3   0:03.86 httpd                                         
23754 nobody    16   0  421m  91m  26m D  0.0  2.3   0:08.19 httpd                                         
23700 nobody    16   0  421m  90m  26m D  0.3  2.3   0:05.45 httpd                                         
23843 nobody    16   0  420m  90m  26m S  0.0  2.3   0:06.39 httpd                                         
23510 nobody    16   0  426m  90m  24m D  0.0  2.3   0:04.98 httpd                                         
23841 nobody    16   0  416m  89m  29m D  0.0  2.3   0:03.53 httpd                                         
23836 nobody    15   0  414m  89m  30m S  0.0  2.3   0:05.82 httpd                                         
23849 nobody    15   0  418m  88m  25m S  0.0  2.3   0:05.78 httpd                                         
23833 nobody    16   0  429m  88m  24m D  0.0  2.3   0:05.59 httpd                                         
23832 nobody    16   0  418m  88m  25m S  0.0  2.2   0:09.25 httpd                                         
23746 nobody    16   0  428m  88m  25m D  0.0  2.2   0:04.13 httpd                                         
23851 nobody    16   0  428m  88m  24m D  0.0  2.2   0:03.60 httpd                                         
23816 nobody    15   0  418m  88m  25m S  0.0  2.2   0:07.00 httpd                                         
23282 nobody    15   0  416m  87m  28m S  0.0  2.2   0:11.29 httpd                                         
23742 nobody    15   0  416m  86m  26m S  0.0  2.2   0:09.37 httpd                                         
23837 nobody    16   0  425m  86m  25m D  0.3  2.2   0:05.20 httpd                                         
23093 nobody    16   0  430m  86m  24m D  0.0  2.2   0:04.19 httpd                                         
23732 nobody    16   0  421m  86m  24m D  0.0  2.2   0:05.55 httpd                                         
23772 nobody    15   0  415m  85m  29m S  1.0  2.2   0:14.55 httpd   

Someone helped me tweak the Apache settings, but everything looks the same.

I have enabled piped logging which should help with the memory issue. I have also shortened the amount of requests an apache process will do before it is cycled through memory.

I would really appreciate your help. I've tried almost everything. I'm not really a sysadmin, but we don't have anyone to help us right now.

Thank you!

atom
  • Your current TOP output indicates you have about 2GB available to be used by programs. See http://www.linuxatemyram.com/. Can you post the output when it is slow and has high load? – Zoredache Sep 15 '11 at 05:42
  • Thanks for the link. The free -m command shows 2GB. I don't have a snapshot from when the server is under high load :S Do you think the my.cnf and httpd.conf are OK? – atom Sep 15 '11 at 05:47
  • Looks to me like you're about 1000MB or a Gig or so into swap space, and have periods of high table locks. When you get a large traffic surge you're likely getting waaayyy into the swap, which grinds everything to a halt, since now your disk service times are going to shoot up, as the disks are busy thrashing. That, coupled with a ton of Apache connections, is where your high load is coming from, and the ultimate meltdown. You seem to know the answer: add more RAM! 4GB is NOTHING these days. I try to keep enough RAM in my machines to never swap. – Kendall Sep 15 '11 at 05:47
  • @Kendall So adding more memory would help? Some people told me it's a matter of configuration, but I'll reconsider based on your comment. One more thing: top shows 176m and 27m for RES and SHR for httpd; from what I've read this is too high. What do you think? – atom Sep 15 '11 at 05:51
  • Note that I haven't analyzed your config files or anything, mostly because I work with Postgres and not MySQL, and right now don't have the time. But, my comment above comes from looking at your graphs and your post, and then extrapolating out from there. grep /var/log/messages for "oom_killer". Got any results? – Kendall Sep 15 '11 at 05:52
  • I think more RAM would help, yes, but without seeing system stats where it's under high load, I'm mostly making an educated guess. – Kendall Sep 15 '11 at 05:57
  • @Kendall Thank you! I'll run a top command the next time it happens. - The grep /var/log/messages is still running. – atom Sep 15 '11 at 06:00
  • @Kendall _grep "oom_killer" /var/log/messages_ didn't get any results – atom Sep 15 '11 at 06:17
  • @Zoredache I just posted a top when the server has high load. thank you. – atom Sep 19 '11 at 22:51
  • Sorting that top output by memory doesn't seem to be all that useful. All the processes which are displayed are nearly idle. What process had a large value for CPU? It could be that you just have Apache set too high. During the loaded top you have 750 tasks compared to the 527 tasks in the 'normal' state. I would suggest that you decrease the MaxClients value. – Zoredache Sep 19 '11 at 23:02
  • @Zoredache It seems the traffic increased today, but I don't know if each task is a connection from a user? Right now there are 560 tasks, the load average is 2.23, and the largest value for CPU is `9044 mysql 15 0 1352m 756m 4352 S 19.2% *cpu* 19.2 3:21.42 /usr/sbin/mysqld --basedir=/ --datadir=/var/li`, then an httpd process with 11.2% CPU and another httpd process with 10% – atom Sep 19 '11 at 23:09
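
For readers following the memory discussion in these comments, here is a rough sketch of the arithmetic behind the "about 2GB available" estimate. It assumes, as a simplification, that everything counted as buffers and page cache can be reclaimed by the kernel, which is not always true. The function names are made up for illustration; the numbers are copied from the Mem and Swap lines of the two top outputs in the question.

```python
def reclaimable_kb(free_kb, buffers_kb, cached_kb):
    """Approximate memory programs could still get: free + buffers + page cache.

    Simplifying assumption: all buffer/cache memory is reclaimable.
    """
    return free_kb + buffers_kb + cached_kb


def to_mb(kb):
    """Convert kilobytes (as printed by top) to megabytes."""
    return kb / 1024


# Snapshot 1: shortly after restart, load average below 1.
idle = reclaimable_kb(free_kb=259608, buffers_kb=279060, cached_kb=1316080)
idle_swap_kb = 103616

# Snapshot 2: load average above 200.
loaded = reclaimable_kb(free_kb=20224, buffers_kb=13404, cached_kb=71272)
loaded_swap_kb = 3036872

print(f"idle:   ~{to_mb(idle):.0f} MB available, {to_mb(idle_swap_kb):.0f} MB swap used")
print(f"loaded: ~{to_mb(loaded):.0f} MB available, {to_mb(loaded_swap_kb):.0f} MB swap used")
```

On these figures the idle server has roughly 1.8 GB it can still hand out, while the loaded snapshot has only about 100 MB left and nearly 3 GB in swap, which fits the thrashing described in the comments.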

4 Answers

6

General rule of thumb -

Run this command when server is under load:

ps -ylC httpd --sort:rss | awk '{sum+=$8; ++n} END {print "Tot="sum"("n")";print "Avg="sum"/"n"="sum/n/1024"MB"}'

That will tell you approximately the average size of an Apache process.

This is not a dedicated web node so say 60% of the RAM is available for Apache.

4096 * .60 / AVERAGE_SIZE_HTTPD_PROCESS = approximately the number of MaxClients (simultaneous requests) you can service. Your average looks to be around 170MB, so 4096 * .60 / 170 is about 14.

You can service 14 requests at a time. Reduce MaxClients to a more reasonable number like 20-25.
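
The rule of thumb above can be sketched as a small calculation. This is only an illustration: the function name and parameters are hypothetical, and the 60% share is the same rough guess as above, not a measured value.

```python
import math


def max_clients(total_ram_mb, apache_share, avg_process_mb):
    """Estimate how many Apache prefork children fit in the RAM budget.

    total_ram_mb   -- physical RAM in MB
    apache_share   -- fraction of RAM left for Apache (a guess, e.g. 0.60)
    avg_process_mb -- average resident size of one httpd process in MB
    """
    return math.floor(total_ram_mb * apache_share / avg_process_mb)


# 4GB of RAM, 60% for Apache, ~170MB per httpd process as seen in top.
print(max_clients(4096, 0.60, 170))
```

Rerun it with the average reported by the ps/awk command while the server is under load, since the process size can differ from the idle figure.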

Cheers

HTTP500
  • Thank you! right now I got this `Tot=3540788(24) Avg=3540788/24=144.075MB` but there is no high load. I'll be monitoring the server to run this command when the load increase. – atom Sep 15 '11 at 17:30
  • Okay, but whether it's 144MB or 170MB do you realize that the size of the average process is huge? Have you looked at what modules you're loading in Apache (e.g. mod_perl, mod_python, etc.)? We have a pretty heavy Drupal site with 120 active Drupal modules and the average Apache process size is ~44MB. – HTTP500 Sep 15 '11 at 19:44
  • Could you help me to see if I need to disable some modules please ? - I ran this command `/etc/init.d/httpd -M` and got these results http://dpaste.com/hold/615665/ and `/etc/init.d/httpd -l` results http://dpaste.com/hold/615667/ Thank you!! – atom Sep 15 '11 at 22:14
  • @atom, that list of loaded Apache modules doesn't look all that long... I can't really tell you which are candidates for disabling - it depends upon your application. Perhaps the frontpage_module? At any rate the answer probably lies elsewhere to why your Apache processes are so large... What is the output of this SQL query re: your Drupal DB: select count(*) from system where type='module' and status=1; – HTTP500 Sep 15 '11 at 23:19
  • 124 modules - this is a brief list http://dpaste.com/hold/615217/ – atom Sep 16 '11 at 00:52
  • @atom, well, 124 is quite a lot, just as it is on our site :). FWIW, if your site is heavy on JavaScript and images I've seen impressive results with the Drupal HeadJS module in testing... – HTTP500 Sep 16 '11 at 02:09
  • Thank you! I still believe my problem is related to Apache / MySQL or bad code, but of course I'll install head.js; from what I've read it surely helps. Do you have any other advice that could help me? Regards – atom Sep 16 '11 at 03:21
  • I asked someone to help me and he said he _has enabled piped logging and also shortened the amount of requests an apache process will do before it is cycled through memory._ but nothing changed :( – atom Sep 16 '11 at 13:13
  • @atom, Well there is a problem with Apache. The average process size is huge. And the current configuration of MaxClients = 150 is part of the problem. You're basically telling your Apache Server that it can service the load cause it has 150 * 170MB = 25500 MB of RAM. You basically need more RAM and/or add more web nodes. As far as MySQL goes have you checked the slow query log? Drupal can also result in temporary tables created on disk which can be a performance killer. You can check with: mysqladmin -u root -p ext -ri 30 | grep Created_tmp_disk – HTTP500 Sep 16 '11 at 15:38
  • Hi! Sorry for the delay. I ran that command and got a lot of `| Created_tmp_disk_tables | 43584 | | Created_tmp_disk_tables | 64 | | Created_tmp_disk_tables | 55 | | Created_tmp_disk_tables | 47 | | Created_tmp_disk_tables | 52 | | Created_tmp_disk_tables | 62 | .. etc`. Could this be the cause of my problem? – atom Sep 18 '11 at 17:01
  • @atom, Yes, it is probably part of the problem. If you can't solve the root cause with the queries (see: http://dev.mysql.com/doc/refman/5.0/en/internal-temporary-tables.html) you can put MySQL's tmpdir on a RAM disk. – HTTP500 Sep 18 '11 at 23:14
  • Thank you for the link! I checked my custom modules and some of them create temporary tables. I'm going to optimize these; maybe it will help! Today we had about 5 outages again :(( – atom Sep 19 '11 at 22:54
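
To put the `Created_tmp_disk_tables` numbers from the comments in perspective, here is a minimal sketch that turns the mysqladmin output into a per-second rate. It assumes the first sample is the running total since MySQL started and that each later sample is the change over one 30-second interval, which is how relative mode with `-ri 30` reports values. The function name and the regular expression are illustrative, not part of any MySQL tool.

```python
import re


def tmp_disk_rates(output, interval_s=30):
    """Return on-disk temp tables created per second for each interval.

    The first value is skipped: in relative mode it is the cumulative
    counter, not a per-interval delta.
    """
    values = [
        int(v)
        for v in re.findall(r"Created_tmp_disk_tables\s*\|\s*(\d+)", output)
    ]
    deltas = values[1:]
    return [d / interval_s for d in deltas]


# Sample copied from the comment above.
sample = (
    "| Created_tmp_disk_tables | 43584 | | Created_tmp_disk_tables | 64 | "
    "| Created_tmp_disk_tables | 55 | | Created_tmp_disk_tables | 47 | "
    "| Created_tmp_disk_tables | 52 | | Created_tmp_disk_tables | 62 |"
)

rates = tmp_disk_rates(sample)
average = sum(rates) / len(rates)
print(f"{len(rates)} intervals, about {average:.1f} on-disk temp tables per second")
```

A steady rate of around two on-disk temporary tables per second is worth chasing in the slow query log, especially on a machine that is already swapping.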
1

Guess what?

The mod_security module was the cause of the memory trouble. I disabled it and every httpd process went from 180MB to 35MB!

It was installed and configured by my hosting provider from the beginning, so now I need to either improve security some other way or configure it properly.
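
As a rough check of what that change means for the whole server, the sketch below multiplies the per-process size by the `MaxClients 150` setting from the httpd.conf in the question and compares it with the 4034276k of RAM reported by top. The function name is hypothetical and the calculation is a worst case that ignores memory shared between processes.

```python
def worst_case_mb(max_clients, avg_process_mb):
    """Upper bound on Apache memory if every allowed child is running."""
    return max_clients * avg_process_mb


TOTAL_RAM_MB = 4034276 / 1024  # "Mem: total" from top, converted to MB
MAX_CLIENTS = 150              # value from the posted httpd.conf

for label, size_mb in (("with mod_security", 180), ("without mod_security", 35)):
    need = worst_case_mb(MAX_CLIENTS, size_mb)
    verdict = "exceeds" if need > TOTAL_RAM_MB else "fits in"
    print(f"{label}: {need} MB {verdict} {TOTAL_RAM_MB:.0f} MB of RAM")
```

Even at 35MB per process, 150 children could in theory ask for more memory than the machine has, so lowering MaxClients may still be worth considering alongside this fix.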

atom
0

Are you absolutely sure you need the CacheRouter module? More than once it has caused me similar problems. It's a memory hog (at least when used with memcached), and if not correctly configured it can make your site very sluggish!

With the traffic you described I think you could actually very well live without CacheRouter, Boost can be handy though. Without CacheRouter your Apache memory usage may come down quite dramatically and give your server much more room to breathe. Also, if you are using memcached PHP module, disable it. Your Apache seems to be eating way too much memory.

Also your Apache settings are weird.

  • Timeout 300 is way too much. Drop it to something between 10 and 30.
  • TraceEnable On? Why?
  • Sometimes KeepAlive On can do more harm than good. Have you tried without it?
Janne Pikkarainen
  • I just installed the _CacheRouter_ module to see if it would help; the memory issue existed before this. I only have APC installed, not memcached. Let me change the Timeout and TraceEnable to see what happens – atom Sep 15 '11 at 06:09
  • How is your APC configured, then? Perhaps it eats the memory for some reason. Anyway even though op-code caches such as APC are generally recommended, be aware they can also cause some side-effects and with your kind of traffic and your beefy server hardware I'm not 100% sure you actually need that. – Janne Pikkarainen Sep 15 '11 at 06:17
  • Thank you for your time! APC configuration `apc.enabled =1 apc.optimization = 0 apc.shm_size = 96M apc.enable_cli = 1 apc.include_once_override = 1 apc.ttl = 0 apc.user_ttl=7200 apc.apc.stat = off` - I also installed APC when the server started to cause problems and I think that helped – atom Sep 15 '11 at 06:21
  • OK. The fact is that _something_ is causing your Apache to consume way more memory than it probably should, and possibly also making your site sluggish. Can you list the PHP modules you have in use (post your php.ini or show us the output of `phpinfo();`)? Please also list the Drupal modules you use; for example a 3rd party module showing the number of RSS subscribers is very heavy. Do you have Drupal 6 or 7? – Janne Pikkarainen Sep 15 '11 at 06:28
  • I'm using Drupal 6. This is the phpinfo output: http://dl.dropbox.com/u/33784/info.html. Let me list the Drupal modules I'm using. Thank you. – atom Sep 15 '11 at 06:33
  • Modules installed are: http://dpaste.com/hold/615217/ – atom Sep 15 '11 at 06:41
  • btw I also have some own modules with just queries, can I show you in private ? – atom Sep 15 '11 at 06:43
  • Nothing too special installed, although there are a couple of modules I have not used (Nodequeue, Smartqueue taxonomy, Elysiacron and DB Tuner). How often does your Elysia Cron run and what does it do for you? Perhaps some background event is slowly eating the resources ... – Janne Pikkarainen Sep 15 '11 at 06:45
  • Show me the modules in private :) – Janne Pikkarainen Sep 15 '11 at 06:46
  • Elysia Cron helps me schedule the cron jobs from Drupal: "crontab-like scheduling configuration of each job" http://drupal.org/project/elysia_cron. Like you said, something is eating resources, but I still don't know what. Btw this is a Liquid Web server that was just "optimized" – atom Sep 15 '11 at 06:48
  • It seems I can't send "private messages", or whatever they're called here :S – atom Sep 15 '11 at 06:51
  • See my profile for my e-mail address. – Janne Pikkarainen Sep 15 '11 at 06:52
  • I can't see your email, only your website :D so I sent you an email to j...@mi..i.fi Thanks!! – atom Sep 15 '11 at 07:00
  • Just replied to your e-mail. :) – Janne Pikkarainen Sep 15 '11 at 07:14
  • Thank you, that will help me!! What other things do you think I should check to solve the problem? DB tables? MySQL? :S – atom Sep 15 '11 at 07:20
  • Try the modification I suggested first, let's see if it brings down the memory usage. – Janne Pikkarainen Sep 15 '11 at 07:22
  • OK! I have like 10 modules to modify so maybe it would take a while, when I finish I'll be back here ok ? :D. Thank you for your time. – atom Sep 15 '11 at 07:25
  • Ten modules having the same issue? Wow, no wonder if your site is slow. Come back after the modifications :-) – Janne Pikkarainen Sep 15 '11 at 07:29
  • Hi! I made the changes to all my custom modules and have been monitoring, but everything is like before: every Apache process is consuming around 140-170 MB :( – atom Sep 18 '11 at 17:03
  • Then it must be due to the high number of Drupal modules you have in use; somewhere you mentioned over 100 but you did not list nearly as many for me. Those modules quickly add up. – Janne Pikkarainen Sep 18 '11 at 17:33
  • That's because there are 'submodules' (or whatever they're called). For example, I mentioned "token and submodules" to you, which are about 6 in total, so that's why the list seems short. Thank you. Btw, I have now disabled some of them. – atom Sep 18 '11 at 17:44
0

Not an answer to this specific problem, but this was the page I kept finding when I was trying to solve a similar problem, so I'm hoping to help another LAMP administrator.

In my case, the problem was PHP's get_browser() function. After I installed browscap.ini (standard version), my Apache processes' memory usage went from 10MB to 170MB. This ran fine until I had a sudden peak of activity. Changing to the Lite version of browscap.ini put me back at the more acceptable 10MB.

MortimerCat