Linux 内核崩溃-有什么想法吗?

Linux 内核崩溃-有什么想法吗?

在轻负载的 vps(debian squeeze kernel 2.6.38-bpo.2-amd64,2GB,SSD)上出现了这些失控情况 - 最高负载为零,然后跳到 30.0,一切都停滞不前。我猜 Xen Hypervisor 启动并限制了一切 - 供应商控制面板显示 CPU @ 175%。几分钟后,CPU 和负载将回落到正常水平,一切恢复正常。

这是最新的 kern.log 的副本

Oct 11 20:10:34 stage kernel: [348092.046302] Clocksource tsc unstable (delta = -8590343613 ns)
Oct 11 20:16:55 stage kernel: [348311.490907] INFO: task bounce:23869 blocked for more than 120 seconds.
Oct 11 20:16:56 stage kernel: [348311.490918] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:16:56 stage kernel: [348311.490924] bounce          D ffff880003085100     0 23869   1153 0x00000000
Oct 11 20:16:56 stage kernel: [348311.490933]  ffff880003085100 0000000000000286 0000000000000000 ffff88007aaf3600
Oct 11 20:16:56 stage kernel: [348311.490943]  0000000000013700 ffff88007c16bfd8 ffff88007c16bfd8 0000000000013700
Oct 11 20:16:56 stage kernel: [348311.490953]  ffff880003085100 ffff88007c16a010 ffff8800032bc098 ffffffff8103b96e
Oct 11 20:16:56 stage kernel: [348311.490963] Call Trace:
Oct 11 20:16:56 stage kernel: [348311.490973]  [<ffffffff8103b96e>] ? __wake_up+0x35/0x46
Oct 11 20:16:56 stage kernel: [348311.490982]  [<ffffffff81326e50>] ? _raw_spin_lock_irqsave+0x11/0x2f
Oct 11 20:16:56 stage kernel: [348311.490993]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:16:56 stage kernel: [348311.491000]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:16:56 stage kernel: [348311.491006]  [<ffffffff81326ea0>] ? _raw_spin_unlock_irqrestore+0x10/0x11
Oct 11 20:16:56 stage kernel: [348311.491015]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:16:56 stage kernel: [348311.491024]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:16:56 stage kernel: [348311.491030]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:16:56 stage kernel: [348311.491035]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:16:56 stage kernel: [348311.491041]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:16:56 stage kernel: [348311.491047]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b
Oct 11 20:16:56 stage kernel: [348311.491053] INFO: task bounce:23873 blocked for more than 120 seconds.
Oct 11 20:16:56 stage kernel: [348311.491057] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:16:56 stage kernel: [348311.491063] bounce          D ffff880003083600     0 23873   1153 0x00000000
Oct 11 20:16:56 stage kernel: [348311.491071]  ffff880003083600 0000000000000282 0000000000000000 ffff88007aaf3600
Oct 11 20:16:56 stage kernel: [348311.491080]  0000000000013700 ffff88006d859fd8 ffff88006d859fd8 0000000000013700
Oct 11 20:16:56 stage kernel: [348311.491089]  ffff880003083600 ffff88006d858010 ffff88006d859eb0 0000000000000000
Oct 11 20:16:56 stage kernel: [348311.491098] Call Trace:
Oct 11 20:16:56 stage kernel: [348311.491104]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:16:56 stage kernel: [348311.491111]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:16:56 stage kernel: [348311.491118]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:16:56 stage kernel: [348311.491125]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:16:56 stage kernel: [348311.491130]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:16:56 stage kernel: [348311.491136]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:16:56 stage kernel: [348311.491141]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:16:56 stage kernel: [348311.491146]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b
Oct 11 20:16:56 stage kernel: [348311.491152] INFO: task bounce:23875 blocked for more than 120 seconds.
Oct 11 20:16:56 stage kernel: [348311.491157] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:16:56 stage kernel: [348311.491162] bounce          D ffff880003085e80     0 23875   1153 0x00000000
Oct 11 20:16:56 stage kernel: [348311.491170]  ffff880003085e80 0000000000000286 0000000000000000 ffff88007aaf3600
Oct 11 20:16:56 stage kernel: [348311.491179]  0000000000013700 ffff88004eecdfd8 ffff88004eecdfd8 0000000000013700
Oct 11 20:16:56 stage kernel: [348311.491188]  ffff880003085e80 ffff88004eecc010 ffff880003085e80 0000000000000000
Oct 11 20:16:56 stage kernel: [348311.491198] Call Trace:
Oct 11 20:16:56 stage kernel: [348311.491203]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:16:56 stage kernel: [348311.491210]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:16:56 stage kernel: [348311.491217]  [<ffffffff8100679f>] ? xen_restore_fl_direct_end+0x0/0x1
Oct 11 20:16:56 stage kernel: [348311.491223]  [<ffffffff81326ea0>] ? _raw_spin_unlock_irqrestore+0x10/0x11
Oct 11 20:16:56 stage kernel: [348311.491230]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:16:56 stage kernel: [348311.491237]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:16:56 stage kernel: [348311.491243]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:16:56 stage kernel: [348311.491248]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:16:56 stage kernel: [348311.491253]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:16:56 stage kernel: [348311.491258]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b
Oct 11 20:16:56 stage kernel: [348431.612291] INFO: task bounce:23869 blocked for more than 120 seconds.
Oct 11 20:16:56 stage kernel: [348431.612305] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:16:56 stage kernel: [348431.612314] bounce          D ffff880003085100     0 23869   1153 0x00000000
Oct 11 20:16:56 stage kernel: [348431.612327]  ffff880003085100 0000000000000286 0000000000000000 ffff88007aaf3600
Oct 11 20:16:56 stage kernel: [348431.612343]  0000000000013700 ffff88007c16bfd8 ffff88007c16bfd8 0000000000013700
Oct 11 20:16:56 stage kernel: [348431.612357]  ffff880003085100 ffff88007c16a010 ffff8800032bc098 ffffffff8103b96e
Oct 11 20:16:56 stage kernel: [348431.612372] Call Trace:
Oct 11 20:16:56 stage kernel: [348431.612386]  [<ffffffff8103b96e>] ? __wake_up+0x35/0x46
Oct 11 20:16:56 stage kernel: [348431.612397]  [<ffffffff81326e50>] ? _raw_spin_lock_irqsave+0x11/0x2f
Oct 11 20:16:56 stage kernel: [348431.612413]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:16:56 stage kernel: [348431.612423]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:16:56 stage kernel: [348431.612432]  [<ffffffff81326ea0>] ? _raw_spin_unlock_irqrestore+0x10/0x11
Oct 11 20:16:56 stage kernel: [348431.612445]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:16:56 stage kernel: [348431.612461]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:16:56 stage kernel: [348431.612470]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:16:56 stage kernel: [348431.612478]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:16:56 stage kernel: [348431.612487]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:16:56 stage kernel: [348431.612495]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b
Oct 11 20:16:56 stage kernel: [348431.612504] INFO: task bounce:23873 blocked for more than 120 seconds.
Oct 11 20:16:56 stage kernel: [348431.612511] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:16:56 stage kernel: [348431.612520] bounce          D ffff880003083600     0 23873   1153 0x00000000
Oct 11 20:16:56 stage kernel: [348431.612531]  ffff880003083600 0000000000000282 0000000000000000 ffff88007aaf3600
Oct 11 20:16:56 stage kernel: [348431.612545]  0000000000013700 ffff88006d859fd8 ffff88006d859fd8 0000000000013700
Oct 11 20:16:56 stage kernel: [348431.612559]  ffff880003083600 ffff88006d858010 ffff88006d859eb0 0000000000000000
Oct 11 20:16:56 stage kernel: [348431.612573] Call Trace:
Oct 11 20:16:56 stage kernel: [348431.612582]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:16:56 stage kernel: [348431.612591]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:16:56 stage kernel: [348431.612603]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:16:56 stage kernel: [348431.612612]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:16:56 stage kernel: [348431.612621]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:16:56 stage kernel: [348431.612629]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:16:56 stage kernel: [348431.612637]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:16:56 stage kernel: [348431.612645]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b
Oct 11 20:16:56 stage kernel: [348431.612653] INFO: task bounce:23875 blocked for more than 120 seconds.
Oct 11 20:16:56 stage kernel: [348431.612660] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:16:56 stage kernel: [348431.612669] bounce          D ffff880003085e80     0 23875   1153 0x00000000
Oct 11 20:16:56 stage kernel: [348431.612680]  ffff880003085e80 0000000000000286 0000000000000000 ffff88007aaf3600
Oct 11 20:16:56 stage kernel: [348431.612694]  0000000000013700 ffff88004eecdfd8 ffff88004eecdfd8 0000000000013700
Oct 11 20:16:56 stage kernel: [348431.612708]  ffff880003085e80 ffff88004eecc010 ffff880003085e80 0000000000000000
Oct 11 20:16:56 stage kernel: [348431.612721] Call Trace:
Oct 11 20:16:56 stage kernel: [348431.612730]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:16:56 stage kernel: [348431.612739]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:16:56 stage kernel: [348431.612750]  [<ffffffff8100679f>] ? xen_restore_fl_direct_end+0x0/0x1
Oct 11 20:16:56 stage kernel: [348431.612759]  [<ffffffff81326ea0>] ? _raw_spin_unlock_irqrestore+0x10/0x11
Oct 11 20:16:56 stage kernel: [348431.612770]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:16:56 stage kernel: [348431.612780]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:16:56 stage kernel: [348431.612788]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:16:56 stage kernel: [348431.612797]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:16:56 stage kernel: [348431.612804]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:16:56 stage kernel: [348431.612812]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b
Oct 11 20:16:56 stage kernel: [348431.612820] INFO: task bounce:23876 blocked for more than 120 seconds.
Oct 11 20:16:56 stage kernel: [348431.612827] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:16:56 stage kernel: [348431.612836] bounce          D ffff880003081440     0 23876   1153 0x00000000
Oct 11 20:16:56 stage kernel: [348431.612847]  ffff880003081440 0000000000000282 0000000000000000 ffff88007abea1c0
Oct 11 20:16:56 stage kernel: [348431.612861]  0000000000013700 ffff88006d80ffd8 ffff88006d80ffd8 0000000000013700
Oct 11 20:16:56 stage kernel: [348431.612874]  ffff880003081440 ffff88006d80e010 ffff88006d80feb0 0000000000000000
Oct 11 20:16:56 stage kernel: [348431.612888] Call Trace:
Oct 11 20:16:56 stage kernel: [348431.612897]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:16:56 stage kernel: [348431.612906]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:16:56 stage kernel: [348431.612917]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:16:56 stage kernel: [348431.612926]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:16:56 stage kernel: [348431.612935]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:16:56 stage kernel: [348431.612943]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:16:56 stage kernel: [348431.612951]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:16:56 stage kernel: [348431.612959]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b
Oct 11 20:16:56 stage kernel: [348431.612968] INFO: task postdrop:23904 blocked for more than 120 seconds.
Oct 11 20:16:56 stage kernel: [348431.612976] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:16:56 stage kernel: [348431.612984] postdrop        D ffff88007b586540     0 23904  23903 0x00000000
Oct 11 20:16:56 stage kernel: [348431.612995]  ffff88007b586540 0000000000000282 0000000000000000 ffff88007abe9440
Oct 11 20:16:56 stage kernel: [348431.613009]  0000000000013700 ffff88004e94ffd8 ffff88004e94ffd8 0000000000013700
Oct 11 20:16:56 stage kernel: [348431.613023]  ffff88007b586540 ffff88004e94e010 ffff88007abe9440 0000000000000000
Oct 11 20:16:56 stage kernel: [348431.613036] Call Trace:
Oct 11 20:16:56 stage kernel: [348431.613045]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:16:56 stage kernel: [348431.613054]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:16:56 stage kernel: [348431.613065]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:16:56 stage kernel: [348431.613075]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:16:56 stage kernel: [348431.613083]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:16:56 stage kernel: [348431.613091]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:16:56 stage kernel: [348431.613099]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:16:56 stage kernel: [348431.613107]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b
Oct 11 20:21:02 stage kernel: [348552.062201] INFO: task bounce:23869 blocked for more than 120 seconds.
Oct 11 20:21:02 stage kernel: [348552.062216] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:21:02 stage kernel: [348552.062225] bounce          D ffff880003085100     0 23869   1153 0x00000000
Oct 11 20:21:02 stage kernel: [348552.062238]  ffff880003085100 0000000000000286 0000000000000000 ffff88007aaf3600
Oct 11 20:21:02 stage kernel: [348552.062253]  0000000000013700 ffff88007c16bfd8 ffff88007c16bfd8 0000000000013700
Oct 11 20:21:02 stage kernel: [348552.062268]  ffff880003085100 ffff88007c16a010 ffff8800032bc098 ffffffff8103b96e
Oct 11 20:21:02 stage kernel: [348552.062282] Call Trace:
Oct 11 20:21:02 stage kernel: [348552.062301]  [<ffffffff8103b96e>] ? __wake_up+0x35/0x46
Oct 11 20:21:02 stage kernel: [348552.062313]  [<ffffffff81326e50>] ? _raw_spin_lock_irqsave+0x11/0x2f
Oct 11 20:21:02 stage kernel: [348552.062329]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:21:02 stage kernel: [348552.062339]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:21:02 stage kernel: [348552.062349]  [<ffffffff81326ea0>] ? _raw_spin_unlock_irqrestore+0x10/0x11
Oct 11 20:21:02 stage kernel: [348552.062361]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:21:02 stage kernel: [348552.062373]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:21:02 stage kernel: [348552.062382]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:21:02 stage kernel: [348552.062390]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:21:02 stage kernel: [348552.062399]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:21:02 stage kernel: [348552.062407]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b
Oct 11 20:21:02 stage kernel: [348552.062416] INFO: task bounce:23873 blocked for more than 120 seconds.
Oct 11 20:21:02 stage kernel: [348552.062423] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 11 20:21:02 stage kernel: [348552.062431] bounce          D ffff880003083600     0 23873   1153 0x00000000
Oct 11 20:21:02 stage kernel: [348552.062442]  ffff880003083600 0000000000000282 0000000000000000 ffff88007aaf3600
Oct 11 20:21:02 stage kernel: [348552.062456]  0000000000013700 ffff88006d859fd8 ffff88006d859fd8 0000000000013700
Oct 11 20:21:02 stage kernel: [348552.062470]  ffff880003083600 ffff88006d858010 ffff88006d859eb0 0000000000000000
Oct 11 20:21:02 stage kernel: [348552.062484] Call Trace:
Oct 11 20:21:02 stage kernel: [348552.062493]  [<ffffffffa00249ec>] ? log_wait_commit+0xc0/0x111 [jbd]
Oct 11 20:21:02 stage kernel: [348552.062502]  [<ffffffff8106033d>] ? autoremove_wake_function+0x0/0x2a
Oct 11 20:21:02 stage kernel: [348552.062513]  [<ffffffffa00377f6>] ? ext3_sync_file+0xbe/0xec [ext3]
Oct 11 20:21:02 stage kernel: [348552.062522]  [<ffffffff81117733>] ? vfs_fsync_range+0x4c/0x73
Oct 11 20:21:02 stage kernel: [348552.062531]  [<ffffffff811177db>] ? do_fsync+0x27/0x3c
Oct 11 20:21:02 stage kernel: [348552.062539]  [<ffffffff8111780d>] ? sys_fsync+0xb/0xf
Oct 11 20:21:02 stage kernel: [348552.062547]  [<ffffffff81009973>] ? sysret_check+0x17/0x5a
Oct 11 20:21:02 stage kernel: [348552.062554]  [<ffffffff81009952>] ? system_call_fastpath+0x16/0x1b

关于去哪里看,您有什么想法吗?

[编辑] 添加几天前发生的另一个软锁定 - 与反弹无关:

Oct  7 06:26:39 stage kernel: [5729599.258122] BUG: soft lockup - CPU#1 stuck for 80s! [events/1:16]
Oct  7 06:26:39 stage kernel: [5729599.258122] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp ipt_LOG iptable_filter ip_tables x_tables loop snd_pcm snd_timer evdev snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
Oct  7 06:26:39 stage kernel: [5729599.258122] CPU 1:
Oct  7 06:26:39 stage kernel: [5729599.258122] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp ipt_LOG iptable_filter ip_tables x_tables loop snd_pcm snd_timer evdev snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
Oct  7 06:26:39 stage kernel: [5729599.258122] Pid: 16, comm: events/1 Not tainted 2.6.32-5-amd64 #1 
Oct  7 06:26:39 stage kernel: [5729599.258122] RIP: e030:[<ffffffff8100922a>]  [<ffffffff8100922a>] hypercall_page+0x22a/0x1001
Oct  7 06:26:39 stage kernel: [5729599.258122] RSP: e02b:ffff88007ffb1cc8  EFLAGS: 00000246
Oct  7 06:26:39 stage kernel: [5729599.258122] RAX: 0000000000040001 RBX: ffff880003519780 RCX: ffffffff8100922a
Oct  7 06:26:39 stage kernel: [5729599.258122] RDX: 00000000000116e0 RSI: 0000000000000000 RDI: 0000000000000000
Oct  7 06:26:39 stage kernel: [5729599.258122] RBP: ffff880001a77100 R08: ffff88007ffb0000 R09: ffffffff8100e22f
Oct  7 06:26:54 stage kernel: [5729599.258122] R10: 0000000000000000 R11: 0000000000000246 R12: ffff88007ffa8000
Oct  7 06:26:54 stage kernel: [5729599.258122] R13: ffff8800008e3880 R14: 0000000000000000 R15: 0000000000000000
Oct  7 06:26:54 stage kernel: [5729599.258122] FS:  00007f1e47300720(0000) GS:ffff880003504000(0000) knlGS:0000000000000000
Oct  7 06:26:54 stage kernel: [5729599.258122] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct  7 06:26:54 stage kernel: [5729599.258122] CR2: 000000000065e000 CR3: 0000000001001000 CR4: 0000000000002660
Oct  7 06:26:54 stage kernel: [5729599.258122] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct  7 06:26:54 stage kernel: [5729599.258122] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct  7 06:26:54 stage kernel: [5729599.258122] Call Trace:
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8100dc35>] ? xen_force_evtchn_callback+0x9/0xa
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8100e242>] ? check_events+0x12/0x20
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8100e22f>] ? xen_restore_fl_direct_end+0x0/0x1
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8100e1e9>] ? xen_irq_enable_direct_end+0x0/0x7
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8104827b>] ? finish_task_switch+0x44/0xaf
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff812fb155>] ? thread_return+0x4e/0xe0
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8100e242>] ? check_events+0x12/0x20
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8106182b>] ? worker_thread+0xcc/0x21d
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff810c8060>] ? vmstat_update+0x0/0x39
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff81064f1a>] ? autoremove_wake_function+0x0/0x2e
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8106175f>] ? worker_thread+0x0/0x21d
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff81064c4d>] ? kthread+0x79/0x81
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff81011baa>] ? child_rip+0xa/0x20
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff81010d61>] ? int_ret_from_sys_call+0x7/0x1b
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8101151d>] ? retint_restore_args+0x5/0x6
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff8100e22f>] ? xen_restore_fl_direct_end+0x0/0x1
Oct  7 06:26:54 stage kernel: [5729599.258122]  [<ffffffff81011ba0>] ? child_rip+0x0/0x20
Oct  7 06:26:56 stage /USR/SBIN/CRON[12186]: (root) CMD (/root/phptest.sh)
Oct  7 06:27:03 stage kernel: [5729625.855236] BUG: soft lockup - CPU#0 stuck for 65s! [watchdog/0:5]
Oct  7 06:27:03 stage kernel: [5729625.855236] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp ipt_LOG iptable_filter ip_tables x_tables loop snd_pcm snd_timer evdev snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
Oct  7 06:27:03 stage kernel: [5729625.855236] CPU 0:
Oct  7 06:27:03 stage kernel: [5729625.855236] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp ipt_LOG iptable_filter ip_tables x_tables loop snd_pcm snd_timer evdev snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
Oct  7 06:27:03 stage kernel: [5729625.855236] Pid: 5, comm: watchdog/0 Not tainted 2.6.32-5-amd64 #1 
Oct  7 06:27:03 stage kernel: [5729625.855236] RIP: e030:[<ffffffff8100922a>]  [<ffffffff8100922a>] hypercall_page+0x22a/0x1001
Oct  7 06:27:03 stage kernel: [5729625.855236] RSP: e02b:ffff88007ff6fd48  EFLAGS: 00000246
Oct  7 06:27:03 stage kernel: [5729625.855236] RAX: 0000000000040001 RBX: ffff8800034fb780 RCX: ffffffff8100922a
Oct  7 06:27:03 stage kernel: [5729625.855236] RDX: 00000000000116e0 RSI: 0000000000000000 RDI: 0000000000000000
Oct  7 06:27:03 stage kernel: [5729625.855236] RBP: ffff88007e1c4000 R08: ffff88007ff6e000 R09: ffffffff8100e22f
Oct  7 06:27:05 stage kernel: [5729625.855236] R10: ffff880002d23100 R11: 0000000000000246 R12: ffff88007ff53170
Oct  7 06:27:05 stage kernel: [5729625.855236] R13: ffff8800230b9530 R14: 0000000000000000 R15: 0000000000000000
Oct  7 06:27:05 stage kernel: [5729625.855236] FS:  00007f1e47300720(0000) GS:ffff8800034e6000(0000) knlGS:0000000000000000
Oct  7 06:27:05 stage kernel: [5729625.855236] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct  7 06:27:05 stage kernel: [5729625.855236] CR2: 00007f3d34099120 CR3: 0000000001001000 CR4: 0000000000002660
Oct  7 06:27:05 stage kernel: [5729625.855236] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct  7 06:27:05 stage kernel: [5729625.855236] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct  7 06:27:05 stage kernel: [5729625.855236] Call Trace:
Oct  7 06:27:05 stage kernel: [5729625.855236]  [<ffffffff8100dc35>] ? xen_force_evtchn_callback+0x9/0xa
Oct  7 06:27:05 stage kernel: [5729625.855236]  [<ffffffff8100e242>] ? check_events+0x12/0x20
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff8100e22f>] ? xen_restore_fl_direct_end+0x0/0x1
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff8100e1e9>] ? xen_irq_enable_direct_end+0x0/0x7
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff8104827b>] ? finish_task_switch+0x44/0xaf
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff812fb155>] ? thread_return+0x4e/0xe0
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff8104914d>] ? switched_to_rt+0x0/0x58
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff8100e22f>] ? xen_restore_fl_direct_end+0x0/0x1
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff810693bf>] ? cpu_clock+0x28/0x30
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff8109428f>] ? watchdog+0x0/0x75
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff810942d3>] ? watchdog+0x44/0x75
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff8109428f>] ? watchdog+0x0/0x75
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff81064c4d>] ? kthread+0x79/0x81
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff81011baa>] ? child_rip+0xa/0x20
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff81010d61>] ? int_ret_from_sys_call+0x7/0x1b
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff8101151d>] ? retint_restore_args+0x5/0x6
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff8100e22f>] ? xen_restore_fl_direct_end+0x0/0x1
Oct  7 06:27:06 stage kernel: [5729625.855236]  [<ffffffff81011ba0>] ? child_rip+0x0/0x20

答案1

你的内核没有崩溃。

您正在运行 postfix 吗?据我所知,这是唯一一个有可执行文件“bounce”的程序。

看起来 bounce 正在等待读/写/flock 完成,事实上,查看堆栈,它似乎在 ext3_sync_file() 处进入休眠状态

文件系统是否使用同步选项挂载?虽然将其更改为异步无法解决问题,但在解决底层故障后,它将提高性能。

我首先会检查磁盘是否处于停转/禁用状态,然后尝试对磁盘上的所有分区运行 fsck,如果一切正常,则检查 bounce 正在访问的文件是否存在争用(例如,如果是 Postfix,是否有多个实例正在运行?或者您在发生这种情况时是否正在运行备份?)。如果什么都没显示出来,请尝试将数据文件移动到其他物理磁盘。

(并检查你的内核补丁是否是最新的 - 似乎有关于类似问题的报告 - 虽然我在这个特定版本中没有看到过)

相关内容