LVM-related processes hang completely

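The problem is on a KVM node whose VMs get their storage via LVM, so every VM has its own logical volume. Every night some of the VMs are backed up (snapshot, then dd [..] | ssh [..], nothing special). Last night, however, that somehow messed up the LVM system. Two to three minutes after the second backup started, the kernel began logging "hung tasks": in short, it reported three qemu-kvm processes as hung, plus the dd process. At least one VM (a managed server, and therefore monitored by us) went down. More precisely, it was still running but its services no longer responded, and VNC showed hung tasks inside the VM. After a hard reset (and a migration, see below) the VM was fine, but the dd process never terminated (kill -9 did nothing), and commands such as lvdisplay stopped working: they simply produce no output. lvmetad could not be restarted either, and none of the processes belonging to LVM can be killed. They just hang in disk-wait state forever, while the node otherwise runs fine. The VM that went down had to be migrated to another node because virsh shutdown no longer worked on it either ("Device or resource busy"). The other VMs kept on working.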

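A few weeks ago we had the same problem on another node, where the VM being snapshotted also went down. We upgraded that node's kernel from 4.4 to 4.9 (we had to reboot the machine anyway) and have not seen a similar problem there since. However, the node that failed today had been running fine for two months, so that does not mean the problem is solved. So: can anyone read more out of these logs than we can? Any help would be much appreciated. Thanks for reading!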

Apr 28 00:37:15 vnode19 kernel: INFO: task qemu-kvm:32970 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: qemu-kvm        D ffff88734767f908     0 32970      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff88734767f908 ffff880166d65900 ffff887048ef0000 ffff887347680000
Apr 28 00:37:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff88492b5b8a00
Apr 28 00:37:15 vnode19 kernel: ffff88734767f920 ffffffff816b2425 ffff887f7f116cc0 ffff88734767f9d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task qemu-kvm:33655 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: qemu-kvm        D ffff886a1dd23908     0 33655      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff886a1dd23908 ffff8875c6e442c0 ffff88582127ac80 ffff886a1dd24000
Apr 28 00:37:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff886d0d021e00
Apr 28 00:37:15 vnode19 kernel: ffff886a1dd23920 ffffffff816b2425 ffff887f7f496cc0 ffff886a1dd239d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task qemu-kvm:33661 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: qemu-kvm        D ffff8855341f3728     0 33661      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff8855341f3728 ffff880166d642c0 ffff886916a4c2c0 ffff8855341f4000
Apr 28 00:37:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:37:15 vnode19 kernel: ffff8855341f3740 ffffffff816b2425 ffff886916a4c2c0 ffff8855341f37d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06dfdfe>] __origin_write+0x6e/0x210 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811918ae>] ? mempool_alloc+0x6e/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06e0007>] do_origin.isra.14+0x67/0x90 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06e0092>] origin_map+0x62/0x80 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81300deb>] ? bio_alloc_bioset+0x1ab/0x2d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124ccb7>] do_blockdev_direct_IO+0x2427/0x2d20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120cf59>] __vfs_write+0xc9/0x110
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d5b2>] vfs_write+0xa2/0x1a0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e537>] SyS_pwrite64+0x87/0xb0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task dmeventd:33781 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: dmeventd        D ffff8803493b7af8     0 33781      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff8803493b7af8 ffff880166da1640 ffff880b15a50000 ffff8803493b8000
Apr 28 00:37:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:37:15 vnode19 kernel: ffff8803493b7b10 ffffffff816b2425 ffff880b15a50000 ffff8803493b7b98
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06df172>] snapshot_status+0x82/0x1a0 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b51a6>] retrieve_status+0xa6/0x1b0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b6363>] table_status+0x63/0xa0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b6300>] ? dm_get_live_or_inactive_table.isra.3+0x30/0x30 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b6015>] ctl_ioctl+0x255/0x4d0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81095806>] ? __dequeue_signal+0x106/0x1b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81095a1b>] ? recalc_sigpending+0x1b/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b62a3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81220872>] do_vfs_ioctl+0x2d2/0x4b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81220ac9>] SyS_ioctl+0x79/0x90
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task dd:33790 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: dd              D ffff885238e1f828     0 33790  33746 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff885238e1f828 ffff883f77ce42c0 ffff884a64088000 ffff885238e20000
Apr 28 00:37:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:37:15 vnode19 kernel: ffff885238e1f840 ffffffff816b2425 ffff884a64088000 ffff885238e1f8d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06e0d32>] snapshot_map+0x62/0x390 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d6ba>] mpage_bio_submit+0x2a/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124e0b0>] mpage_readpages+0x130/0x160
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811e0428>] ? alloc_pages_current+0x88/0x120
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247add>] blkdev_readpages+0x1d/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8119bfbc>] __do_page_cache_readahead+0x19c/0x220
Apr 28 00:37:15 vnode19 kernel: [<ffffffff810b4c39>] ? try_to_wake_up+0x49/0x3d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8119c175>] ondemand_readahead+0x135/0x260
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae0aa>] ? dm_any_congested+0x4a/0x50 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8119c30c>] page_cache_async_readahead+0x6c/0x70
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190748>] generic_file_read_iter+0x438/0x680
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81215e79>] ? pipe_write+0x3d9/0x430
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247da7>] blkdev_read_iter+0x37/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120ce56>] __vfs_read+0xc6/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d45f>] vfs_read+0x7f/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e2d5>] SyS_read+0x55/0xc0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task qemu-kvm:32970 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: qemu-kvm        D ffff88734767f908     0 32970      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff88734767f908 ffff880166d65900 ffff887048ef0000 ffff887347680000
Apr 28 00:39:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff88492b5b8a00
Apr 28 00:39:15 vnode19 kernel: ffff88734767f920 ffffffff816b2425 ffff887f7f116cc0 ffff88734767f9d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task qemu-kvm:33655 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: qemu-kvm        D ffff886a1dd23908     0 33655      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff886a1dd23908 ffff8875c6e442c0 ffff88582127ac80 ffff886a1dd24000
Apr 28 00:39:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff886d0d021e00
Apr 28 00:39:15 vnode19 kernel: ffff886a1dd23920 ffffffff816b2425 ffff887f7f496cc0 ffff886a1dd239d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task qemu-kvm:33661 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: qemu-kvm        D ffff8855341f3728     0 33661      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff8855341f3728 ffff880166d642c0 ffff886916a4c2c0 ffff8855341f4000
Apr 28 00:39:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:39:15 vnode19 kernel: ffff8855341f3740 ffffffff816b2425 ffff886916a4c2c0 ffff8855341f37d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06dfdfe>] __origin_write+0x6e/0x210 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811918ae>] ? mempool_alloc+0x6e/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06e0007>] do_origin.isra.14+0x67/0x90 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06e0092>] origin_map+0x62/0x80 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81300deb>] ? bio_alloc_bioset+0x1ab/0x2d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124ccb7>] do_blockdev_direct_IO+0x2427/0x2d20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120cf59>] __vfs_write+0xc9/0x110
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d5b2>] vfs_write+0xa2/0x1a0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e537>] SyS_pwrite64+0x87/0xb0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task dmeventd:33781 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: dmeventd        D ffff8803493b7af8     0 33781      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff8803493b7af8 ffff880166da1640 ffff880b15a50000 ffff8803493b8000
Apr 28 00:39:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:39:15 vnode19 kernel: ffff8803493b7b10 ffffffff816b2425 ffff880b15a50000 ffff8803493b7b98
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06df172>] snapshot_status+0x82/0x1a0 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b51a6>] retrieve_status+0xa6/0x1b0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b6363>] table_status+0x63/0xa0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b6300>] ? dm_get_live_or_inactive_table.isra.3+0x30/0x30 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b6015>] ctl_ioctl+0x255/0x4d0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81095806>] ? __dequeue_signal+0x106/0x1b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81095a1b>] ? recalc_sigpending+0x1b/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b62a3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81220872>] do_vfs_ioctl+0x2d2/0x4b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81220ac9>] SyS_ioctl+0x79/0x90
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task dd:33790 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: dd              D ffff885238e1f828     0 33790  33746 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff885238e1f828 ffff883f77ce42c0 ffff884a64088000 ffff885238e20000
Apr 28 00:39:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:39:15 vnode19 kernel: ffff885238e1f840 ffffffff816b2425 ffff884a64088000 ffff885238e1f8d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06e0d32>] snapshot_map+0x62/0x390 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d6ba>] mpage_bio_submit+0x2a/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124e0b0>] mpage_readpages+0x130/0x160
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811e0428>] ? alloc_pages_current+0x88/0x120
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247add>] blkdev_readpages+0x1d/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8119bfbc>] __do_page_cache_readahead+0x19c/0x220
Apr 28 00:39:15 vnode19 kernel: [<ffffffff810b4c39>] ? try_to_wake_up+0x49/0x3d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8119c175>] ondemand_readahead+0x135/0x260
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae0aa>] ? dm_any_congested+0x4a/0x50 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8119c30c>] page_cache_async_readahead+0x6c/0x70
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190748>] generic_file_read_iter+0x438/0x680
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81215e79>] ? pipe_write+0x3d9/0x430
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247da7>] blkdev_read_iter+0x37/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120ce56>] __vfs_read+0xc6/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d45f>] vfs_read+0x7f/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e2d5>] SyS_read+0x55/0xc0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71

Answer 1

I assume you have already ruled out an actual physical disk problem.

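I also assume you have made sure that no VG names overlap between the host and any of the VMs. That can cause the kind of craziness you describe.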

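What you are seeing sounds like "uninterruptible sleep" (process state D, as shown in the task lines of your log), where the box thinks it is waiting on I/O and nothing will change its mind. Even kill -9 won't do it. I have seen this before with tape backups. I have also seen it more recently after doing something stupid, such as mounting a VM's LVM on the host and forgetting to unmount it before starting the VM. That's always fun.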
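To see which processes are currently stuck this way, you can read the state field from /proc/<pid>/stat. The following is only a minimal sketch, assuming a Linux /proc filesystem; the function names parse_stat and uninterruptible_tasks are made up for illustration, and it looks at each process's main thread only (individual threads live under /proc/<pid>/task).

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left].strip())
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for every process whose state is currently D."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the file was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

Seeing a process in D for a moment is normal during heavy I/O. It is the ones that stay there across repeated runs, like the dd in the question, that point to a real hang.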

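For the situation you describe, the most useful tool I have found is dmsetup. It lets you take LVM's device-mapper setup apart by hand. I don't know whether it will get you out of the uninterruptible sleep, though.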
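As one example of what dmsetup can tell you, the output of dmsetup status includes a line per mapped device, and for snapshot targets it reports how much of the copy-on-write area is used. The sketch below parses that text; it is an illustration under assumptions, not a tested monitoring tool. The function name snapshot_usage is hypothetical, and the line layout it expects (name, start, length, target type, then allocated/total sectors for snapshots) is my understanding of the format, so check it against your own output. Note also that on a node already in the state described in the question, dmsetup status may itself block, since the dmeventd trace in the log is stuck in snapshot_status.

```python
def snapshot_usage(status_text):
    """Parse `dmsetup status` text; return {device name: fraction used or None}."""
    usage = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(": ")
        fields = rest.split()
        # Assumed layout: <start> <length> <target type> <target-specific...>
        if not sep or len(fields) < 4 or fields[2] != "snapshot":
            continue
        used, slash, total = fields[3].partition("/")
        if slash and used.isdigit() and total.isdigit() and int(total) > 0:
            usage[name] = int(used) / int(total)
        else:
            # Anything that is not <used>/<total>, e.g. an invalidated snapshot
            usage[name] = None
    return usage
```

You would feed it the saved output of dmsetup status, for example by reading a file and passing its contents to snapshot_usage, and look for snapshots that are nearly full or report no usable numbers.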

Another possibility is that you are using slow disks and some operations really do take longer than 120 seconds.
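One rough way to tell "slow" from "stuck" is to check whether the same task keeps reappearing in successive hung-task reports. In the log you posted, the same five PIDs are reported at 00:37:15 and again at 00:39:15. Here is a minimal sketch that counts reports per task, assuming the kernel log has been saved as text; the name summarize_hung_tasks is made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   INFO: task qemu-kvm:32970 blocked for more than 120 seconds.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(lines):
    """Map (comm, pid) to the number of hung-task reports seen for it."""
    counts = defaultdict(int)
    for line in lines:
        match = HUNG_RE.search(line)
        if match:
            counts[(match.group("comm"), int(match.group("pid")))] += 1
    return dict(counts)
```

A task that shows up once may just have been waiting on a slow disk. One that shows up in every report is much more likely to be blocked for good.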

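I use disk image files à la qemu-img rather than LVM. I have used LVM in the past (the way you describe, but with Xen) and never ran into a problem that was not obviously of my own making.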

-Dylan
