I had two of my CPUs lock up on one of my servers. From dmesg:

BUG: soft lockup - CPU#1 stuck for 23s! [vmx-vcpu-0:6148]

and later:

BUG: soft lockup - CPU#2 stuck for 23s! [vmx-vcpu-0:6148]

I'm trying to figure out why this would happen; the processor has 4 cores with hyperthreading, so the OS sees it as 8 cores. But my main question is related to this:

When looking at htop post-freeze from SSH, I see that CPUs #2 and #3 (guessing these correspond to #1 and #2 from dmesg) are both stuck at 100% with apparently no processes using them:

[Image: htop screenshot]

None of the processes were using more than 5% CPU. Why would these display 100% utilization? Are they still considered locked by the kernel?

Tom Marthenal
  • What OS/version/kernel/distribution? Also, is this a virtualized guest? – ewwhite Mar 14 '13 at 01:25
  • @ewwhite Ubuntu Server 12.04 LTS, linux-image-3.5.0-23-generic, bare-metal host for some VMs (in VMware Workstation though, not ESX or anything). So far, downgrading to Linux `3.5.0-22` seems to resolve the issue, but I can't reliably reproduce it, so I can't be sure. – Tom Marthenal Mar 14 '13 at 01:33

1 Answer

As the message reports, this is a bug in kernel-level code.

Those CPUs are stuck in kernel code (running in the context of vmx-vcpu-0) that has not yielded control of the CPU for a long time: 23 seconds, according to the message.
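To make the fields in that message easier to read, here is a minimal sketch that pulls the CPU number, stuck duration, task name, and PID out of soft lockup lines. The function name `parse_soft_lockups` and the regular expression are my own illustration, written against the two lines quoted in the question; other kernels may format the message differently.

```python
import re

# Pattern written against the lines quoted in the question, e.g.
#   BUG: soft lockup - CPU#1 stuck for 23s! [vmx-vcpu-0:6148]
# This is an assumption about the format, not a kernel-defined grammar.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(log_text):
    """Return one dict per soft lockup line found in log_text (hypothetical helper)."""
    events = []
    for line in log_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            events.append({
                "cpu": int(match.group("cpu")),       # CPU number as printed by the kernel
                "seconds": int(match.group("secs")),  # how long the CPU was reported stuck
                "task": match.group("task"),          # task that was running on that CPU
                "pid": int(match.group("pid")),
            })
    return events


if __name__ == "__main__":
    sample = (
        "BUG: soft lockup - CPU#1 stuck for 23s! [vmx-vcpu-0:6148]\n"
        "BUG: soft lockup - CPU#2 stuck for 23s! [vmx-vcpu-0:6148]\n"
    )
    for event in parse_soft_lockups(sample):
        print(event)
```

In practice you would feed it the output of `dmesg` instead of the sample string. Both events point at the same task and PID, which is why the VMware process is the first suspect.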

As for what to do: open a ticket with VMware. vmx-vcpu-0 looks like their code, but I'm not totally sure.

MikeyB
  • [So apparently this is not a bug, it's a feature.](http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009996) – Tom Marthenal Mar 14 '13 at 05:27