I'm writing a Linux kernel module that has a loop to process work, like this:
    while (1) {
        while (there's work) {
            process_work
        }
        if (should_stop)
            break
        sleep // wait to be woken up
    }
When there is a lot of work, this results in a soft lockup. The message looks like this:
[ 1426.067061] BUG: soft lockup - CPU#3 stuck for 23s! [comp_wqa:2969]
[ 1426.067903] Modules linked in: testmodule(OE+) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter hwmon_vid dm_mirror dm_region_hash dm_log dm_mod snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic intel_powerclamp coretemp intel_rapl kvm eeepc_wmi crc32_pclmul asus_wmi ghash_clmulni_intel sparse_keymap rfkill mxm_wmi aesni_intel wmi lrw snd_hda_intel gf128mul glue_helper snd_hda_codec pcspkr ablk_helper sg
[ 1426.067924] cryptd shpchp snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm tpm_infineon acpi_pad snd_timer mei_me mei snd soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel serio_raw i915 ahci libahci libata i2c_algo_bit drm_kms_helper drm e1000e ptp pps_core i2c_core video
[ 1426.067939] CPU: 3 PID: 2969 Comm: comp_wqa Tainted: G OE ------------ 3.10.0-327.28.3.el7.x86_64 #1
[ 1426.067940] Hardware name: ASUS All Series/Z97-A, BIOS 2401 04/24/2015
[ 1426.067941] task: ffff88080f212280 ti: ffff880810a68000 task.ti: ffff880810a68000
[ 1426.067942] RIP: 0010:[<ffffffff8107e11f>] [<ffffffff8107e11f>] vprintk_emit+0x1bf/0x530
[ 1426.067946] RSP: 0018:ffff880810a6bbc0 EFLAGS: 00000246
[ 1426.067947] RAX: 0000000000000001 RBX: 0000000000000003 RCX: 0000000000000000
[ 1426.067948] RDX: 0000000000000001 RSI: ffff88083fb8f6c8 RDI: 0000000000000246
[ 1426.067948] RBP: ffff880810a6bc20 R08: 0000000000000092 R09: 0000000000007d0d
[ 1426.067949] R10: 0000000000008000 R11: ffffc90023effff8 R12: 0000000000000081
[ 1426.067950] R13: ffffffff81a08020 R14: 000000009176cc6c R15: 0000000000000000
[ 1426.067951] FS: 0000000000000000(0000) GS:ffff88083fb80000(0000) knlGS:0000000000000000
[ 1426.067951] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1426.067952] CR2: 00007f42411ff00e CR3: 000000000194a000 CR4: 00000000001407e0
[ 1426.067953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1426.067954] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1426.067954] Stack:
[ 1426.067955] ffffffff81cae082 0000000000000071 0000000000000000 ffff880810a6bc40
[ 1426.067956] ffffffffa07a45a0 000000008116c24e 0000000000000246 ffff8807dfbba800
[ 1426.067958] ffff880810a70000 ffff8807dfbc5030 ffff8807dfbc4e00 ffff8807e65b3000
[ 1426.067959] Call Trace:
So after some googling, I changed the code to the following:
    while (1) {
        while (there's work) {
            process_work
            cond_resched()
        }
        if (should_stop)
            break
        sleep // wait to be woken up
    }
With this code, the soft lockups happen less often, but they still occur under heavier load. I thought that if this thread had been occupying the CPU for a long time, cond_resched() would give up the CPU. I guess I was wrong.
I want to know how to avoid the soft lockups without the thread sitting idle too much (I want the module to process lots of work without long latency).
After thinking more about this, I realize that what I want is simply to have a CPU core run a dedicated thread without being interrupted. It seems the kernel doesn't support this directly. There is a kernel parameter called watchdog_thresh that controls how many seconds a thread can run continuously before the watchdog reports a lockup (the soft-lockup threshold is twice this value). I have read other posts that suggest this kind of soft lockup is harmless. I also now understand better that the performance of my driver depends heavily on single-core CPU performance, since I have to process the work with a single thread.
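To make the watchdog_thresh relationship concrete, here is a minimal user-space sketch (not kernel code) that reads the current value and derives the soft-lockup threshold from it. It assumes the value is exposed at /proc/sys/kernel/watchdog_thresh and that the soft-lockup threshold is 2 * watchdog_thresh, as the kernel's sysctl documentation describes. The helper names soft_lockup_threshold and read_watchdog_thresh are hypothetical, chosen for illustration only.

```c
#include <stdio.h>

/*
 * Soft-lockup threshold in seconds for a given kernel.watchdog_thresh.
 * Assumption: the kernel reports a soft lockup after 2 * watchdog_thresh.
 */
static int soft_lockup_threshold(int watchdog_thresh)
{
    return 2 * watchdog_thresh;
}

/*
 * Read kernel.watchdog_thresh from procfs.
 * Returns -1 if the file is missing or cannot be parsed.
 */
static int read_watchdog_thresh(void)
{
    FILE *f = fopen("/proc/sys/kernel/watchdog_thresh", "r");
    int value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A small main() could call read_watchdog_thresh(), check for -1, and print soft_lockup_threshold() of the result. If the value on my machine is the usual default of 10, the threshold would be 20 seconds, which would be consistent with the "stuck for 23s" in the log above. Raising the value only delays the report; it does not change how the thread is scheduled.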