
I know this has been asked multiple times, and that high wa (I/O wait) generally causes high load even when the CPU isn't the culprit. However, in our case wa is ~0 and CPU is at ~45% on a dedicated server with 4 cores and 8 threads, yet the load average is 8.33, 8.47, 8.28.

Here's a dump of top:

top - 11:16:41 up 139 days, 49 min,  1 user,  load average: 8.33, 8.47, 8.28
Tasks: 313 total,   5 running, 308 sleeping,   0 stopped,   0 zombie
%Cpu(s): 42.4 us, 13.4 sy,  0.0 ni, 36.9 id,  0.0 wa,  0.0 hi,  7.3 si,  0.0 st
KiB Mem:  32939280 total, 30515232 used,  2424048 free,   120408 buffers
KiB Swap:  1046520 total,    96056 used,   950464 free. 23932456 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                   
21382 root      20   0   70288  11360   2028 R 100.0  0.0  22406:05 supervisord                                                                                                                               
 1774 www-data  20   0  286472  30320   8052 S  28.6  0.1 295:05.54 php                                                                                                                                       
  309 www-data  20   0  286236  29948   8056 S  28.3  0.1 298:05.63 php                                                                                                                                       
32338 www-data  20   0  286296  30136   8056 S  27.3  0.1 300:02.28 php                                                                                                                                       
  913 www-data  20   0  286232  30080   8060 S  26.6  0.1 297:14.66 php                                                                                                                                       
 1692 www-data  20   0  286560  30276   8056 S  26.3  0.1 300:22.92 php                                                                                                                                       
32470 www-data  20   0  286608  30324   8052 S  26.3  0.1 296:58.59 php                                                                                                                                       
32029 www-data  20   0  286360  29952   8056 S  25.9  0.1 297:50.51 php                                                                                                                                       
 1083 www-data  20   0  286236  30076   8056 S  24.9  0.1 298:42.59 php                                                                                                                                       
32334 www-data  20   0  286604  30368   8052 R  24.6  0.1 300:45.32 php                                                                                                                                       
 1214 www-data  20   0  286224  29808   8060 S  24.3  0.1 290:03.53 php                                                                                                                                       
32027 www-data  20   0  286680  30404   8056 S  23.9  0.1 297:58.06 php                                                                                                                                       
 1700 www-data  20   0  286508  30096   8060 S  23.6  0.1 286:11.93 php                                                                                                                                       
25453 beansta+  20   0 1315140 1.242g    616 S  12.0  4.0 342:41.49 beanstalkd                                                                                                                                
10410 root      20   0   12240   7316   2096 S   3.7  0.0 360:16.80 flash_linux_amd                                                                                                                           
31498 www-data  20   0  275036  17868   7696 S   2.7  0.1  88:05.09 php                                                                                                                                       
31947 www-data  20   0  274772  17592   7644 S   2.3  0.1  86:49.37 php                                                                                                                                       
 1416 www-data  20   0  276064  19404   8096 S   2.0  0.1   0:20.29 php                                                                                                                                       
 1427 www-data  20   0  276616  20192   8088 S   2.0  0.1   0:18.78 php                                                                                                                                       
32021 www-data  20   0  275036  17876   7696 S   2.0  0.1  86:26.58 php                                                                                                                                       
32049 www-data  20   0  282144  25776   8088 S   2.0  0.1   0:36.40 php                                                                                                                                       
 1433 www-data  20   0  275740  19076   8100 S   1.7  0.1   0:17.99 php                                                                                                                                       
 1437 www-data  20   0  275752  19100   8104 S   1.7  0.1   0:17.95 php                                                                                                                                       
14251 www-data  20   0  280348  24008   8112 S   1.7  0.1   3:25.54 php                                                                                                                                       
15057 www-data  20   0  275996  19400   8116 R   1.7  0.1   3:11.02 php                                                                                                                                       
31680 www-data  20   0  282628  26120   8104 S   1.7  0.1   0:36.90 php                                                                                                                                       
31682 www-data  20   0  282920  26436   8100 S   1.7  0.1   0:38.47 php                                                                                                                                       
 1431 www-data  20   0  276000  19496   8104 S   1.3  0.1   0:18.67 php                                                                                                                                       
   70 root      20   0       0      0      0 S   1.0  0.0 353:48.51 ksoftirqd/6                                                                                                                               
19845 root      20   0   25620   1752   1116 S   0.7  0.0  59:12.64 top                                                                                                                                       
    7 root      20   0       0      0      0 R   0.3  0.0 216:11.85 rcu_sched                                                                                                                                 
   15 root      20   0       0      0      0 S   0.3  0.0  66:01.10 rcuos/7                                                                                                                                   
  230 root      20   0       0      0      0 S   0.3  0.0 186:54.85 md2_raid1                                                                                                                                 
16833 root      20   0       0      0      0 S   0.3  0.0   0:01.08 kworker/0:1                                                                                                                               
24417 www-data  20   0  349400  17600  13192 S   0.3  0.1   1:15.44 php5-fpm                                                                                                                                  
30203 root      20   0       0      0      0 S   0.3  0.0   0:00.53 kworker/2:2                                                                                                                               
    1 root      20   0   34172   2752   1228 S   0.0  0.0   3:29.38 init                                                                                                                                      
    2 root      20   0       0      0      0 S   0.0  0.0   0:03.85 kthreadd                                                                                                                                  
    3 root      20   0       0      0      0 S   0.0  0.0  66:10.86 ksoftirqd/0                                                                                                                               
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H                                                                                                                              
    8 root      20   0       0      0      0 S   0.0  0.0 101:00.72 rcuos/0                                                                                                                                   
    9 root      20   0       0      0      0 S   0.0  0.0  75:26.50 rcuos/1                                                                                                                                   
   10 root      20   0       0      0      0 S   0.0  0.0  66:02.02 rcuos/2                                                                                                                                   
   11 root      20   0       0      0      0 S   0.0  0.0  80:03.94 rcuos/3                                                                                                                                   
   12 root      20   0       0      0      0 S   0.0  0.0  26:27.47 rcuos/4                                                                                                                                   

At what load average should I be worried (have Nagios wake me up) for this machine?

Keith
kouton

2 Answers


I think your load average is OK.

Take a look at this article to understand the load metric and how it's calculated: http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
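For a rough feel of how the number comes about: on Linux, the load average is an exponentially damped moving average of the count of runnable plus uninterruptible tasks, sampled roughly every 5 seconds. The sketch below is a simplified floating-point illustration of that idea, not the kernel's fixed-point code; the function names `update_load` and `simulate` are made up for this example.

```python
import math

def update_load(load, active, interval=5.0, window=60.0):
    """One damping step: blend the previous average with the current
    number of active (runnable + uninterruptible) tasks."""
    decay = math.exp(-interval / window)
    return load * decay + active * (1.0 - decay)

def simulate(active_counts, window=60.0):
    """Feed a sequence of 5-second samples into the moving average,
    starting from an idle system (load 0)."""
    load = 0.0
    for active in active_counts:
        load = update_load(load, active, window=window)
    return load

# One hour of samples with a steady 8 active tasks.
steady = [8] * 720
one_minute_load = simulate(steady, window=60.0)
```

With a steady 8 active tasks, the 1-minute figure settles at 8, which is about what the top output in the question shows on all three averages.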

mvillar
  • I've seen this before. By that logic, is work piling up once the load exceeds 4 on this system (4 cores)? Or 8, if threads count as a lane? – kouton May 20 '15 at 07:37

It depends on the nature of the load, i.e. whether it is driven by a fixed demand or generated by a process that uses as much power as is available to it (I think the former is the case, but I might be wrong here). On the other hand, a load of 8 on 8 threads is not really a heavy load for a server, so I'd start to worry only if it exceeded 20.
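To make the "8 for 8 threads" comparison concrete, here is a minimal sketch that divides the load average by the number of logical CPUs and maps the result to the usual Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). The function names and the per-CPU thresholds are assumptions for illustration only; 2.5 per CPU simply corresponds to a load of 20 on 8 threads and should be tuned to the actual workload.

```python
import os

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

def classify_load(load, cpus, warn_per_cpu, crit_per_cpu):
    """Compare load per logical CPU against the given thresholds."""
    per_cpu = load / cpus
    if per_cpu >= crit_per_cpu:
        return CRITICAL
    if per_cpu >= warn_per_cpu:
        return WARNING
    return OK

def check_local(warn_per_cpu=1.875, crit_per_cpu=2.5):
    """Classify this machine's 5-minute load average.
    The default thresholds are hypothetical (15 and 20 on 8 threads)."""
    _, load5, _ = os.getloadavg()   # Unix-only
    cpus = os.cpu_count() or 1      # logical CPUs, i.e. threads
    return classify_load(load5, cpus, warn_per_cpu, crit_per_cpu)

# The 5-minute figure from the question, on 8 logical CPUs.
status_from_question = classify_load(8.47, 8, 1.875, 2.5)
```

With these assumed thresholds, the load from the question works out to about 1.06 per thread and is classified as OK.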

Konrad Gajewski
  • Okay, so I can safely raise my warning threshold to above 15 and my critical threshold to, say, 20 for this machine? Would you have any links to back this up? I'm asking because http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages says tasks will pile up once the load exceeds the CPU / thread count. – kouton May 20 '15 at 12:47
  • I've seen systems with a load of 60 still operating nominally. As I said above, I would need to know the exact nature of the work being done. It is not as simple as saying 20 is OK or 15 is OK. – Konrad Gajewski May 20 '15 at 14:41
  • Okay. They are primarily computationally heavy PHP scripts, plus Redis in memory (without persistence to disk). Essentially, no I/O. – kouton May 20 '15 at 14:54
  • You mentioned the language the script is written in but said nothing about the algorithm it performs. :) But I assume it is just part of a web page. I'd say 20 is not a disaster. Redis, as far as I understand, is also demand-based, so the above applies too. But as with any server, the bottom line is whether the content is served below the expected speed, no matter whether the server has a load of 1 or 10. – Konrad Gajewski May 20 '15 at 15:23