We are running Ubuntu 11.04 with the 2.6.38-13-generic kernel on a dedicated server with an Intel(R) Xeon(R) CPU E5620 @ 2.40GHz, 48 GB of RAM, and hardware RAID.
The output of top shows many kernel threads running on different cores:
Thread       Count
ksoftirqd    16 (one per core)
kworker      35
migration    16 (one per core)
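For anyone who wants to reproduce these counts, here is a minimal sketch. It assumes the thread names were collected one per line, for example with `ps -eo comm`; the function name `count_kernel_threads` and the sample list are illustrative only and are not taken from our server.

```python
from collections import Counter

def count_kernel_threads(comm_names):
    """Group names such as 'ksoftirqd/3' or 'kworker/5:1' by their base name."""
    counts = Counter()
    for name in comm_names:
        name = name.strip()
        if not name:
            continue
        # Per-CPU kernel threads are named '<base>/<cpu>...'; keep only the base.
        base = name.split("/", 1)[0]
        counts[base] += 1
    return counts

# Hypothetical sample input, not real output from the machine in question.
sample = ["ksoftirqd/0", "ksoftirqd/1", "kworker/0:1",
          "kworker/u:2", "migration/0", "apache2"]
counts = count_kernel_threads(sample)
for base in ("ksoftirqd", "kworker", "migration"):
    print(base, counts[base])
```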
We have already experienced two freezes that forced us to restart the machine. Both happened after we modified .htaccess and then reloaded Apache.
The last message logged in syslog was a general protection fault.
After the restart, most files on the hard disk had become 0 bytes: 2.5 GB of data shrank to 30 MB soon after the restart.
Is this caused by a kernel bug? 2.6.38-13 is not listed as a stable release on kernel.org. Does this mean we need to switch from the current kernel to a stable one? If so, which kernel should we choose?
Could this be a kernel spinlock problem? The syslog output follows:
May 2 22:34:01 416831 CRON[19206]: (root) CMD (bash /home/admin/log-children)
May 2 22:34:11 416831 kernel: [3715446.033031] general protection fault: 0000 [#1] SMP
May 2 22:34:11 416831 kernel: [3715446.054726] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
May 2 22:34:11 416831 kernel: [3715446.097404] CPU 5
May 2 22:34:11 416831 kernel: [3715446.097869] Modules linked in: nf_conntrack_ipv6 nf_defrag_ipv6 ip6t_LOG xt_tcpudp ipt_REDIRECT xt_conntrack iptable_mangle nf_conntrack_ftp ipt_REJECT ipt_LOG xt_limit xt_multiport xt_state ip6table_filter ip6_tables iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables vesafb snd_hda_intel snd_hda_codec psmouse ioatdma snd_hwdep i7core_edac ghes edac_core lp hed dca joydev snd_pcm serio_raw parport snd_timer snd soundcore snd_page_alloc usbhid hid e1000e
May 2 22:34:11 416831 kernel: [3715446.279465]
May 2 22:34:11 416831 kernel: [3715446.303429] Pid: 19118, comm: apache2 Not tainted 2.6.38-13-generic #56-Ubuntu Supermicro X8DTL/X8DTL
May 2 22:34:11 416831 kernel: [3715446.355544] RIP: 0010:[] [] task_rq_lock+0x4a/0xa0
May 2 22:34:11 416831 kernel: [3715446.411635] RSP: 0018:ffff88060b853da8 EFLAGS: 00010082
May 2 22:34:11 416831 kernel: [3715446.440241] RAX: 010021b86505c7ff RBX: 0000000000013d00 RCX: 00000001162d8937
May 2 22:34:11 416831 kernel: [3715446.497492] RDX: 0000000000000282 RSI: ffff88060b853df0 RDI: 00007fdac0088280
May 2 22:34:11 416831 kernel: [3715446.559362] RBP: ffff88060b853dc8 R08: 0000000000000040 R09: 001fc00000000000
May 2 22:34:11 416831 kernel: [3715446.625144] R10: 0000000000000000 R11: dead000000100100 R12: 00007fdac0088280
May 2 22:34:11 416831 kernel: [3715446.695569] R13: ffff88060b853df0 R14: 0000000000013d00 R15: 0000000000000005
May 2 22:34:11 416831 kernel: [3715446.770654] FS: 00007fdac0023760(0000) GS:ffff880c3fc20000(0000) knlGS:0000000000000000
May 2 22:34:11 416831 kernel: [3715446.849786] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 2 22:34:11 416831 kernel: [3715446.889882] CR2: 00007fdac187ca80 CR3: 000000058cda1000 CR4: 00000000000006e0
May 2 22:34:11 416831 kernel: [3715446.968627] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 2 22:34:11 416831 kernel: [3715447.049676] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 2 22:34:11 416831 kernel: [3715447.130842] Process apache2 (pid: 19118, threadinfo ffff88060b852000, task ffff88058c11c4a0)
May 2 22:34:11 416831 kernel: [3715447.212160] Stack:
May 2 22:34:11 416831 kernel: [3715447.251311] 00007fdac0088280 ffff880be1ca5ec8 000000000000000f 0000000000000000
May 2 22:34:11 416831 kernel: [3715447.331017] ffff88060b853e28 ffffffff8105f2e1 0000000000000000 0000000081a4c270
May 2 22:34:11 416831 kernel: [3715447.412179] ffff88060b853e38 0000000000000282 0000000000000021 ffff880b92505ec8
May 2 22:34:11 416831 kernel: [3715447.493302] Call Trace:
May 2 22:34:11 416831 kernel: [3715447.533014] [] try_to_wake_up+0x31/0x3e0
May 2 22:34:11 416831 kernel: [3715447.573262] [] wake_up_process+0x15/0x20
May 2 22:34:11 416831 kernel: [3715447.612669] [] wake_up_sem_queue_do+0x37/0x60
May 2 22:34:11 416831 kernel: [3715447.651327] [] freeary+0x1c6/0x200
May 2 22:34:11 416831 kernel: [3715447.689083] [] semctl_down.clone.5+0xbb/0x110
May 2 22:34:11 416831 kernel: [3715447.726360] [] ? sys_kill+0x7e/0x90
May 2 22:34:11 416831 kernel: [3715447.762833] [] ? fput+0x25/0x30
May 2 22:34:11 416831 kernel: [3715447.798362] [] sys_semctl+0x7e/0xd0
May 2 22:34:11 416831 kernel: [3715447.833126] [] system_call_fastpath+0x16/0x1b
May 2 22:34:11 416831 kernel: [3715447.867350] Code: 00 48 c7 c3 00 3d 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de <8b> 40 18 4c 03 34 c5 80 c8 aa 81 4c 89 f7 e8 53 4e 57 00 49 8b
May 2 22:34:11 416831 kernel: [3715447.970388] RIP [] task_rq_lock+0x4a/0xa0
May 2 22:34:11 416831 kernel: [3715448.004042] RSP
May 2 22:34:11 416831 kernel: [3715448.083219] ---[ end trace 244a1ec2d6f912fa ]---
May 2 22:35:01 416831 CRON[19243]: (root) CMD (bash /home/admin/log-children)
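To make the trace easier to read, the call chain can be pulled out of the log with a short script. This is only a sketch: the function name `call_trace_symbols` and the regular expression are my own assumptions about the line format shown above, and the sample lines are abbreviated copies of the log.

```python
import re

# Matches kernel symbol entries such as "try_to_wake_up+0x31/0x3e0".
SYMBOL_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

def call_trace_symbols(log_lines):
    """Return the symbols listed after 'Call Trace:' up to the 'Code:' line."""
    symbols = []
    in_trace = False
    for line in log_lines:
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            if "Code:" in line:
                break
            match = SYMBOL_RE.search(line)
            if match:
                symbols.append(match.group(1))
    return symbols

# Abbreviated lines copied from the syslog output above.
sample = [
    "kernel: [3715447.493302] Call Trace:",
    "kernel: [3715447.533014] [] try_to_wake_up+0x31/0x3e0",
    "kernel: [3715447.651327] [] freeary+0x1c6/0x200",
    "kernel: [3715447.689083] [] semctl_down.clone.5+0xbb/0x110",
    "kernel: [3715447.798362] [] sys_semctl+0x7e/0xd0",
    "kernel: [3715447.867350] Code: 00 48 c7",
]
print(call_trace_symbols(sample))
```

Run against the full log, this lists the path from sys_semctl through freeary down to try_to_wake_up, which is where task_rq_lock faulted.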