
Please help me with a strange problem. I have a KVM host with 2 Intel(R) Xeon(R) CPU E5-2630 v2 processors (24 virtual cores total). The host runs 3 typical Ubuntu guests, each with 8 cores and 20 GB of RAM. In this configuration everything seems to be fine. When I deploy one more guest with the same configuration, strange things begin to happen: even with no load on the other 3 guests, putting a reasonable load on the 4th pushes %sy CPU usage on the KVM host to 25-30%. top on the host typically looks like this:

top - 14:29:39 up 104 days,  2:51,  6 users,  load average: 6.46, 6.33, 4.81
Tasks: 227 total,   1 running, 226 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.0%us, 25.2%sy,  0.0%ni, 69.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  98975536k total, 48515312k used, 50460224k free,   154456k buffers
Swap: 100628476k total,     2176k used, 100626300k free,  1072440k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                       
27523 libvirt-  20   0 21.1g  10g 6880 S  700 11.5 126:27.51 kvm                                                                                                           
11745 libvirt-  20   0 21.1g  20g 6964 S   21 21.4 137891:19 kvm                                                                                                           
32692 root      20   0  865m 8792 4532 S    1  0.0  28:49.51 libvirtd                                                                                                      
23252 libvirt-  20   0 10.7g 1.0g 6840 S    1  1.1   6:54.43 kvm                                                                                                           
  117 root      25   5     0    0    0 S    0  0.0   1245:09 ksmd                                                                                                          
 1481 root      20   0 63784  12m 3880 S    0  0.0  54:34.15 gunicorn                                                                                                      
22413 root      20   0 17464 1540 1092 S    0  0.0   4:21.68 top                                                                                                           
22880 root      20   0 17452 1396  972 S    0  0.0   3:50.49 top                                                                                                           
22885 root      20   0 73444 3564 2772 S    0  0.0   2:54.02 sshd                                                                                                          
26008 root      20   0 17460 1528 1088 S    0  0.0   0:07.31 top                                                                                                           
26530 root      20   0 17472 1412  972 S    0  0.0   0:05.43 top                                                                                                           
    1 root      20   0 24448 2324 1344 S    0  0.0   0:04.69 init  

(PID 27523 is the problematic guest; the other kvm processes are guests with no load.)

At this point the guest becomes inoperable: its load average climbs to 50-80 and even higher, and practically all CPU time is split between %us and %sy in varying proportions. top inside the guest:

top - 14:38:21 up 37 min,  2 users,  load average: 53.72, 59.50, 45.16
Tasks: 313 total,   9 running, 301 sleeping,   0 stopped,   3 zombie
Cpu(s): 67.5%us, 31.9%sy,  0.0%ni,  0.0%id,  0.4%wa,  0.0%hi,  0.0%si,  0.1%st
Mem:  20590644k total, 11358672k used,  9231972k free,    59020k buffers
Swap: 10483708k total,        0k used, 10483708k free,  1821100k cached
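To put the two `Cpu(s)` summary lines above side by side, here is a minimal sketch that parses them into numbers. The helper name `parse_cpu_line` is hypothetical, and the regular expression assumes the `N.N%xx` field layout shown in this (older procps) top output; newer top versions format the line differently.

```python
import re

def parse_cpu_line(line: str) -> dict:
    """Parse a top 'Cpu(s):' summary line into {field: percent}.

    Assumes the 'N.N%xx' layout seen in the output above.
    """
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%(\w+)", line)}

# The two summary lines copied from the top output above.
host = parse_cpu_line(
    "Cpu(s):  5.0%us, 25.2%sy,  0.0%ni, 69.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st")
guest = parse_cpu_line(
    "Cpu(s): 67.5%us, 31.9%sy,  0.0%ni,  0.0%id,  0.4%wa,  0.0%hi,  0.0%si,  0.1%st")

for label, cpu in (("host", host), ("guest", guest)):
    busy = cpu["us"] + cpu["sy"]
    print(f"{label}: us={cpu['us']} sy={cpu['sy']} id={cpu['id']} us+sy={busy:.1f}")
```

The host still shows about 70% idle while the guest shows none, which is the mismatch this question is about.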

At some point, exceptions like these start to appear:

2014 Sep 17 14:35:09 dev2 [ 2037.438362] Stack:
2014 Sep 17 14:35:09 dev2 [ 2037.438370] Call Trace:
2014 Sep 17 14:35:09 dev2 [ 2037.438429] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f 
2014 Sep 17 14:35:09 dev2 [ 2037.441963] Stack:
2014 Sep 17 14:35:09 dev2 [ 2037.443586] Call Trace:
2014 Sep 17 14:35:19 dev2 [ 2037.443586] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f 
2014 Sep 17 14:35:45 dev2 [ 2073.284329] Stack:
2014 Sep 17 14:35:45 dev2 [ 2073.285151] Stack:
2014 Sep 17 14:35:45 dev2 [ 2073.285159] Call Trace:
2014 Sep 17 14:35:45 dev2 [ 2073.285221] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f 
2014 Sep 17 14:35:56 dev2 [ 2073.285857] Stack:
2014 Sep 17 14:35:56 dev2 [ 2073.285864] Call Trace:
2014 Sep 17 14:35:56 dev2 [ 2073.285914] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f 
2014 Sep 17 14:35:56 dev2 [ 2073.290207] Call Trace:
2014 Sep 17 14:35:56 dev2 [ 2073.290207] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f 

The guest configuration is typical: we have dozens of them working fine, and we also have KVM hosts running 4 such guests with no problems. Where should I dig to find the root cause? I am out of ideas for now.

The host runs Ubuntu 12.04 LTS (Linux vhost12 3.2.0-60-generic #91-Ubuntu SMP Wed Feb 19 03:54:44 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux); the guest runs the same release but with kernel 3.2.0-56-generic.
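For context, the vCPU arithmetic implied by the numbers above can be sketched as follows. The helper name `vcpu_overcommit` is hypothetical, and the sketch assumes every guest really has 8 vCPUs and that all 24 virtual cores of the host are available to guests. It only shows the allocation ratio; it is not a diagnosis.

```python
def vcpu_overcommit(guests: int, vcpus_per_guest: int, host_cpus: int) -> float:
    """Return allocated guest vCPUs divided by host CPUs."""
    return guests * vcpus_per_guest / host_cpus

HOST_CPUS = 24        # 2 x E5-2630 v2, as stated in the question
VCPUS_PER_GUEST = 8   # per-guest core count, as stated in the question

for guests in (3, 4):
    total = guests * VCPUS_PER_GUEST
    ratio = vcpu_overcommit(guests, VCPUS_PER_GUEST, HOST_CPUS)
    print(f"{guests} guests: {total} vCPUs on {HOST_CPUS} host CPUs, ratio {ratio:.2f}")
```

With 3 guests the allocation matches the host exactly; the 4th guest brings it to 32 vCPUs on 24 host CPUs.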

user1932286
