qemu-kvm crashes with "Resource deadlock avoided"

I am seeing qemu-kvm crash with the following output:

terminate called after throwing an instance of 'std::system_error'
  what():  Resource deadlock avoided
2022-10-10 13:40:53.238+0000: shutting down, reason=crashed

Host OS: Ubuntu 20.04

/usr/bin/qemu-system-x86_64 --version

QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.23)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

Installed QEMU package:

qemu-system-x86/focal-updates,focal-security,now 1:4.2-3ubuntu6.23 amd64 [installed,automatic]

librbd:

librbd1/stable,now 16.2.10-1focal amd64 [installed,automatic]
libvirt-daemon-driver-storage-rbd/focal-updates,focal-security,now 6.0.0-0ubuntu8.16 amd64 [installed,automatic]
python3-rbd/stable,now 16.2.10-1focal amd64 [installed,automatic]

qemu-kvm crashes when the guest boots from Ceph RBD.

Backtrace from the core dump:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f2c2aec3859 in __GI_abort () at abort.c:79
#2  0x00007f2c286f7911 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007f2c2870338c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f2c28702369 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f2c28702d21 in __gxx_personality_v0 () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f2c2b0c6bef in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7  0x00007f2c2b0c7281 in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#8  0x00007f2c2870369c in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007f2c286fa73f in std::__throw_system_error(int) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007f2c28730060 in std::thread::join() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#11 0x00007f2c28965e14 in ?? () from /lib/librados.so.2
#12 0x00007f2c1b932cb1 in ceph::common::ConfigProxy::set_mon_vals(ceph::common::CephContext*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, std::function<bool (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)>) ()
   from /usr/lib/ceph/libceph-common.so.2
#13 0x00007f2c1b91c3d5 in ?? () from /usr/lib/ceph/libceph-common.so.2
#14 0x00007f2c1b931c85 in boost::asio::detail::strand_service::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) ()
   from /usr/lib/ceph/libceph-common.so.2
#15 0x00007f2c28973ca2 in ?? () from /lib/librados.so.2
#16 0x00007f2c2897adea in ?? () from /lib/librados.so.2
#17 0x00007f2c2872fde4 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#18 0x00007f2c2b09b609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#19 0x00007f2c2afc0133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Ceph configuration file (ceph.conf):

[global]

#admin socket = /var/run/ceph/guests/-....asok
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
#public network = 10.3.0.0/24
mon_pg_warn_max_per_osd = 128
mon_max_pg_per_osd = 128
osd_recovery_max_active = 32
osd_max_backfills = 32
osd_op_num_threads_per_shard_hdd = 2
rbd_mirror_concurrent_image_deletions = 2
rbd_op_threads = 4
bluestore_compression_mode = aggressive
osd_pool_default_size = 4
osd_pool_default_min_size = 0
cluster_network = 10.3.0.0/24
#cluster_network_interface = ens255f1
#public_network_interface = ens255f1
[client]
rgw frontends = civetweb port=80
rbd cache = true
rbd cache writethrough until flush = true
#admin socket = /var/run/ceph/guests/-....asok
#log file = /var/log/qemu/qemu-guest-.log
rbd concurrent management ops = 20
osd_op_num_threads_per_shard_hdd = 2
rbd_mirror_concurrent_image_deletions = 2
rbd_op_threads = 4
[osd]
osd_recovery_max_active = 32
osd_max_backfills = 32
osd_op_num_threads_per_shard_hdd = 2
rbd_mirror_concurrent_image_deletions = 2
rbd_op_threads = 4
bluestore_compression_mode = aggressive
[mon]
mon allow pool delete = true

Answer 1

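The only time I have seen QEMU emit this exact message was with a rather odd configuration. Essentially, librbd crashed and took the whole QEMU process down with it.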
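
For context, the backtrace in the question shows std::thread::join() throwing std::system_error from inside a librados thread while ConfigProxy::set_mon_vals is running. "Resource deadlock avoided" is the glibc message for EDEADLK, which join() reports when a thread tries to join itself. The sketch below reproduces only that error in isolation; it is not the librados code path, and the function name self_join_error_code is made up for illustration.

```cpp
#include <future>
#include <system_error>
#include <thread>

// Hypothetical helper: starts a worker that tries to join itself and
// returns the error code carried by the resulting std::system_error.
int self_join_error_code() {
    std::thread worker;
    std::promise<void> ready;   // signals that `worker` has been assigned
    std::promise<int> result;   // carries the error code back to the caller
    std::future<void> ready_f = ready.get_future();
    std::future<int> result_f = result.get_future();

    worker = std::thread([&] {
        ready_f.wait();  // do not touch `worker` before it is assigned
        try {
            worker.join();        // a thread joining itself
            result.set_value(0);  // not expected to be reached
        } catch (const std::system_error& e) {
            // In QEMU this exception went uncaught, so the process aborted.
            result.set_value(e.code().value());
        }
    });

    ready.set_value();
    const int code = result_f.get();
    worker.join();  // the normal join from the owning thread succeeds
    return code;
}
```

On Linux with glibc, the returned value should equal EDEADLK. Because nothing in the librados thread catches the exception, std::terminate is called, which matches the abort() frames at the top of the backtrace.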

It could be a configuration option such as rbd_op_threads (do not use this option in production; leave it at its default of 1).

If that is not it, can you paste your ceph.conf?
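
The ceph.conf posted in the question does set rbd_op_threads = 4 in the [global], [client], and [osd] sections. As a rough way to spot such overrides, here is a minimal sketch that scans config text for a given option. It assumes a simple INI layout and treats spaces, dashes, and underscores in option names as equivalent; find_overrides and its helpers are hypothetical names, not part of any Ceph tool.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Strip leading and trailing ASCII whitespace.
static std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r\n");
    return s.substr(first, last - first + 1);
}

// Map "rbd op threads" and "rbd-op-threads" to "rbd_op_threads".
static std::string normalize_key(std::string key) {
    key = trim(key);
    std::replace(key.begin(), key.end(), ' ', '_');
    std::replace(key.begin(), key.end(), '-', '_');
    return key;
}

// Return (section, value) for every uncommented line that sets `wanted`.
std::vector<std::pair<std::string, std::string>>
find_overrides(const std::string& conf_text, const std::string& wanted) {
    std::vector<std::pair<std::string, std::string>> hits;
    const std::string target = normalize_key(wanted);
    std::istringstream in(conf_text);
    std::string line, section;
    while (std::getline(in, line)) {
        line = trim(line);
        // Skip blank lines and comment lines.
        if (line.empty() || line[0] == '#' || line[0] == ';') continue;
        // Track the current [section] header.
        if (line.front() == '[' && line.back() == ']') {
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;
        if (normalize_key(line.substr(0, eq)) == target)
            hits.emplace_back(section, trim(line.substr(eq + 1)));
    }
    return hits;
}
```

Calling find_overrides(text, "rbd_op_threads") on the contents of a ceph.conf returns one entry per section that sets the option. Removing or commenting out those lines returns the option to its default, which is what the answer recommends.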
