Please help me with a strange problem. I have a KVM host with 2 Intel(R) Xeon(R) CPU E5-2630 v2 processors (24 logical cores total). The host carries 3 typical Ubuntu guests, each with 8 cores and 20 GB of RAM. In this configuration everything seems fine. When I deploy one more guest with the same configuration, strange things begin to happen: even with no load on the other 3 guests, putting a reasonable load on the 4th drives %sy CPU usage on the KVM host to 25-30%. top on the host typically looks like this:
top - 14:29:39 up 104 days, 2:51, 6 users, load average: 6.46, 6.33, 4.81
Tasks: 227 total, 1 running, 226 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.0%us, 25.2%sy, 0.0%ni, 69.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 98975536k total, 48515312k used, 50460224k free, 154456k buffers
Swap: 100628476k total, 2176k used, 100626300k free, 1072440k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27523 libvirt- 20 0 21.1g 10g 6880 S 700 11.5 126:27.51 kvm
11745 libvirt- 20 0 21.1g 20g 6964 S 21 21.4 137891:19 kvm
32692 root 20 0 865m 8792 4532 S 1 0.0 28:49.51 libvirtd
23252 libvirt- 20 0 10.7g 1.0g 6840 S 1 1.1 6:54.43 kvm
117 root 25 5 0 0 0 S 0 0.0 1245:09 ksmd
1481 root 20 0 63784 12m 3880 S 0 0.0 54:34.15 gunicorn
22413 root 20 0 17464 1540 1092 S 0 0.0 4:21.68 top
22880 root 20 0 17452 1396 972 S 0 0.0 3:50.49 top
22885 root 20 0 73444 3564 2772 S 0 0.0 2:54.02 sshd
26008 root 20 0 17460 1528 1088 S 0 0.0 0:07.31 top
26530 root 20 0 17472 1412 972 S 0 0.0 0:05.43 top
1 root 20 0 24448 2324 1344 S 0 0.0 0:04.69 init
(27523 is the problematic guest; the other kvm processes are guests with no load.)
At this point the guest becomes inoperable: its load average climbs to 50-80 and even higher, and practically all CPU usage is split between %us and %sy in varying proportions. top inside the guest:
top - 14:38:21 up 37 min, 2 users, load average: 53.72, 59.50, 45.16
Tasks: 313 total, 9 running, 301 sleeping, 0 stopped, 3 zombie
Cpu(s): 67.5%us, 31.9%sy, 0.0%ni, 0.0%id, 0.4%wa, 0.0%hi, 0.0%si, 0.1%st
Mem: 20590644k total, 11358672k used, 9231972k free, 59020k buffers
Swap: 10483708k total, 0k used, 10483708k free, 1821100k cached
At some point exceptions start to appear in the guest's kernel log:
2014 Sep 17 14:35:09 dev2 [ 2037.438362] Stack:
2014 Sep 17 14:35:09 dev2 [ 2037.438370] Call Trace:
2014 Sep 17 14:35:09 dev2 [ 2037.438429] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f
2014 Sep 17 14:35:09 dev2 [ 2037.441963] Stack:
2014 Sep 17 14:35:09 dev2 [ 2037.443586] Call Trace:
2014 Sep 17 14:35:19 dev2 [ 2037.443586] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f
2014 Sep 17 14:35:45 dev2 [ 2073.284329] Stack:
2014 Sep 17 14:35:45 dev2 [ 2073.285151] Stack:
2014 Sep 17 14:35:45 dev2 [ 2073.285159] Call Trace:
2014 Sep 17 14:35:45 dev2 [ 2073.285221] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f
2014 Sep 17 14:35:56 dev2 [ 2073.285857] Stack:
2014 Sep 17 14:35:56 dev2 [ 2073.285864] Call Trace:
2014 Sep 17 14:35:56 dev2 [ 2073.285914] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f
2014 Sep 17 14:35:56 dev2 [ 2073.290207] Call Trace:
2014 Sep 17 14:35:56 dev2 [ 2073.290207] Code: 48 89 45 c0 48 8d 45 d0 4c 89 4d f8 c7 45 b8 10 00 00 00 48 89 45 c8 e8 e8 f6 ff ff c9 c3 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f
The guest's configuration is typical: we have dozens of them working fine, and we also have KVM hosts running 4 such guests with no problems. Where should I dig to find the root cause? I'm out of ideas.
The host runs Ubuntu 12.04 LTS (Linux vhost12 3.2.0-60-generic #91-Ubuntu SMP Wed Feb 19 03:54:44 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux); the guest runs the same release but with kernel 3.2.0-56-generic.
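For reference, here is the vCPU and RAM arithmetic for this host as a small Python sketch. It only restates the numbers given above (24 logical cores, 8 vCPUs and 20 GB per guest, host memory from top); the function name `overcommit` is made up for illustration, and I am not claiming that overcommit is the cause, since our other hosts run 4 such guests without trouble.

```python
# Figures taken from the description above; overcommit() is a hypothetical helper.
HOST_LOGICAL_CPUS = 24    # 2 x E5-2630 v2 as seen by the host
VCPUS_PER_GUEST = 8
GUEST_RAM_GB = 20
HOST_RAM_KB = 98975536    # "Mem: ... total" from top on the host


def overcommit(guests):
    """Return (total vCPUs, vCPU-to-logical-CPU ratio, total guest RAM in GB)."""
    vcpus = guests * VCPUS_PER_GUEST
    ram_gb = guests * GUEST_RAM_GB
    return vcpus, vcpus / HOST_LOGICAL_CPUS, ram_gb


for n in (3, 4):
    vcpus, ratio, ram_gb = overcommit(n)
    print(f"{n} guests: {vcpus} vCPUs ({ratio:.2f}x logical CPUs), "
          f"{ram_gb} GB guest RAM of {HOST_RAM_KB / 1024**2:.1f} GiB on the host")
```

With 3 guests the vCPU count exactly matches the logical cores; the 4th guest brings it to 32 vCPUs on 24 logical cores, while the RAM total still fits in host memory.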