I've seen this with processes that are stalled in the scheduler, usually inside a system call. If you have non-vanilla kernel modules, definitely start there, even if they are included in the kernel tree. Kernel components that depend on a user-space daemon are a typical example: the daemon hangs waiting on an external event, which hangs the kernel code waiting on the daemon, which in turn hangs any program that asks the kernel for something.
Network-based filesystems, and not just those that communicate over Ethernet, are prime suspects.
Check for processes that are not in the runnable state with:
ps -eo user,pid,stat,pcpu,args | grep -v " R"
USER     PID STAT %CPU COMMAND
daemon   676 Ss    0.0 portmap
statd    752 Ss    0.0 rpc.statd -L
syslog   872 Sl    0.0 rsyslogd -c4
102      895 Ss    0.0 dbus-daemon --system --fork
avahi    934 S     0.0 avahi-daemon: running [faustus.local]
daemon  1082 Ss    0.0 atd
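Note that grep -v " R" matches anywhere on the line, so it also drops any process whose command line happens to contain " R". If that matters, filtering on the STAT column itself is more precise. Below is a minimal Python sketch of that idea, assuming the same five-column ps output as above; the function names parse_ps, in_state and list_stalled are made up for illustration.

```python
import subprocess


def parse_ps(text):
    """Parse the output of `ps -eo user,pid,stat,pcpu,args` into dicts."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        # Split into at most 5 fields so the command line keeps its spaces.
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue  # malformed or truncated line, ignore it
        user, pid, stat, pcpu, args = parts
        rows.append({"user": user, "pid": int(pid), "stat": stat,
                     "pcpu": pcpu, "args": args})
    return rows


def in_state(rows, state):
    """Return the rows whose STAT field starts with the given state letter."""
    return [r for r in rows if r["stat"].startswith(state)]


def list_stalled():
    """Run ps and print processes in uninterruptible sleep (state D)."""
    out = subprocess.run(
        ["ps", "-eo", "user,pid,stat,pcpu,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    for r in in_state(parse_ps(out), "D"):
        print(r["pid"], r["stat"], r["args"])
```

Calling list_stalled() a few times, a few seconds apart, shows whether the same PIDs stay stuck in state D, which is the usual signature of a process blocked inside a system call.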
You can decode the STAT column with this table, taken from the ps man page:
D Uninterruptible sleep (usually IO)
R Running or runnable (on run queue)
S Interruptible sleep (waiting for an event to complete)
T Stopped, either by a job control signal or because it is being traced.
W paging (not valid since the 2.6.xx kernel)
X dead (should never be seen)
Z Defunct ("zombie") process, terminated but not reaped by its parent.
For BSD formats and when the stat keyword is used, additional characters may be displayed:
< high-priority (not nice to other users)
N low-priority (nice to other users)
L has pages locked into memory (for real-time and custom IO)
s is a session leader
l is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
+ is in the foreground process group
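To make the table easier to apply, here is a small Python sketch that decodes a STAT string such as "Ss" or "D+" using only the codes listed above. The decode_stat function and the two dictionaries are illustrative names, not part of ps; codes that are not in this table (newer versions of ps may print others) are reported as unknown rather than guessed.

```python
# First character of STAT: the process state.
STATES = {
    "D": "uninterruptible sleep (usually IO)",
    "R": "running or runnable (on run queue)",
    "S": "interruptible sleep (waiting for an event to complete)",
    "T": "stopped (job control signal or being traced)",
    "W": "paging (not valid since the 2.6.xx kernel)",
    "X": "dead (should never be seen)",
    "Z": 'defunct ("zombie") process',
}

# Remaining characters of STAT: additional flags.
FLAGS = {
    "<": "high-priority (not nice to other users)",
    "N": "low-priority (nice to other users)",
    "L": "has pages locked into memory",
    "s": "session leader",
    "l": "multi-threaded",
    "+": "in the foreground process group",
}


def decode_stat(stat):
    """Return a list of descriptions: the state first, then each flag."""
    if not stat:
        raise ValueError("empty STAT string")
    state = STATES.get(stat[0], "unknown state " + repr(stat[0]))
    flags = [FLAGS.get(c, "unknown flag " + repr(c)) for c in stat[1:]]
    return [state] + flags
```

For example, decode_stat("Ss") describes the portmap entry in the listing above: an interruptible sleep plus the session-leader flag.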