我的 Debian 10 系统会随机出现死机,这迫使我必须使用电源按钮将其关闭才能执行任何操作。过去几周内,这种死机现象反复发生。
输出uname -a
:
Linux debian 4.19.0-18-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64 GNU/Linux
硬件:Threadripper 2970 WX、32GB G Skill F4-3200C1 RAM、MSI MEG X399 Creation 主板
磁盘:1TB Samsung SSD(主磁盘,由 LVM 管理);4TB WD RED(通过 UUID 安装);3x 8TB Seagate Ironwolf(由 LVM 管理)
系统上运行的特殊软件:KVM
我已经尝试过的:
- 使用反向移植将内核更新至 5.10
- Memtest86(目前已运行 6 小时,无任何错误)
- 检查日志文件(目前还没有帮助我)
- 安装
kdump-tools
(冻结时未触发) - CPU 压力测试在 100% 下持续一小时(无冻结。注意:在冻结期间,CPU 大部分时间仅以 5% 运行,并且还有大量可用 RAM)。
系统日志:
Dec 4 11:54:12 debian systemd[1]: bacula-director.service: Service RestartSec=1min expired, scheduling restart.
Dec 4 11:54:12 debian systemd[1]: bacula-director.service: Scheduled restart job, restart counter is at 1783.
Dec 4 11:54:12 debian systemd[1]: Stopped Bacula Director Daemon service.
Dec 4 11:54:12 debian systemd[1]: Starting Bacula Director Daemon service...
Dec 4 11:54:42 debian bacula-dir[124998]: bacula-dir: dird.c:1229-0 Could not open Catalog "MyCatalog", database "XXX_DBNAME_XXX".
Dec 4 11:54:42 debian bacula-dir[124998]: bacula-dir: dird.c:1234-0 postgresql.c:332 Unable to connect to PostgreSQL server. Database=XXX_DBNAME_XXX User=XXX_DBUSER_XXX
Dec 4 11:54:42 debian bacula-dir[124998]: Possible causes: SQL server not running; password incorrect; max_connections exceeded.
Dec 4 11:54:42 debian bacula-dir[124998]: 04-Dec 11:54 bacula-dir ERROR TERMINATION
Dec 4 11:54:42 debian bacula-dir[124998]: Please correct configuration file: /etc/bacula/bacula-dir.conf
Dec 4 11:54:42 debian systemd[1]: bacula-director.service: Control process exited, code=exited, status=1/FAILURE
Dec 4 11:54:42 debian systemd[1]: bacula-director.service: Failed with result 'exit-code'.
Dec 4 11:54:42 debian systemd[1]: Failed to start Bacula Director Daemon service.
Dec 4 11:55:01 debian CRON[125097]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Dec 4 11:55:23 debian NetworkManager[1494]: <info> [1638636923.5313] device (wlp5s0): set-hw-addr: set MAC address to 76:22:F8:22:9D:00 (scanning)
Dec 4 11:55:23 debian kernel: [161004.635913] IPv6: ADDRCONF(NETDEV_UP): wlp5s0: link is not ready
Dec 4 11:55:23 debian NetworkManager[1494]: <info> [1638636923.6007] device (wlp5s0): supplicant interface state: inactive -> disconnected
Dec 4 11:55:23 debian NetworkManager[1494]: <info> [1638636923.6058] device (wlp5s0): supplicant interface state: disconnected -> inactive
Dec 4 11:55:23 debian wpa_supplicant[1493]: wlp5s0: Reject scan trigger since one is already pending
Dec 4 11:55:42 debian systemd[1]: bacula-director.service: Service RestartSec=1min expired, scheduling restart.
Dec 4 11:55:42 debian systemd[1]: bacula-director.service: Scheduled restart job, restart counter is at 1784.
Dec 4 11:55:42 debian systemd[1]: Stopped Bacula Director Daemon service.
Dec 4 11:55:42 debian systemd[1]: Starting Bacula Director Daemon service...
Dec 4 11:56:00 debian libvirtd[1750]: internal error: End of file from qemu monitor
Dec 4 11:56:01 debian kernel: [161042.914669] audit: type=1400 audit(1638636961.809:55): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="libvirt-4c626220-3780-4f2f-b2f1-0da779a85f8f" pid=125291 comm="apparmor_parser"
Dec 4 11:56:01 debian avahi-daemon[1487]: Interface macvtap2.IPv6 no longer relevant for mDNS.
Dec 4 11:56:01 debian avahi-daemon[1487]: Leaving mDNS multicast group on interface macvtap2.IPv6 with address fe80::5054:ff:fe7e:8739.
Dec 4 11:56:01 debian avahi-daemon[1487]: Withdrawing address record for fe80::5054:ff:fe7e:8739 on macvtap2.
Dec 4 11:56:08 debian kernel: [161050.046357] audit: type=1400 audit(1638636968.942:56): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-e9b93fae-7ee0-4096-b496-208aa0be517a" pid=125301 comm="apparmor_parser"
Dec 4 11:56:09 debian kernel: [161050.203334] audit: type=1400 audit(1638636969.098:57): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-e9b93fae-7ee0-4096-b496-208aa0be517a" pid=125304 comm="apparmor_parser"
Dec 4 11:56:09 debian kernel: [161050.336065] audit: type=1400 audit(1638636969.230:58): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-e9b93fae-7ee0-4096-b496-208aa0be517a" pid=125307 comm="apparmor_parser"
Dec 4 11:56:09 debian kernel: [161050.485278] audit: type=1400 audit(1638636969.378:59): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-e9b93fae-7ee0-4096-b496-208aa0be517a" pid=125310 comm="apparmor_parser"
Dec 4 11:56:09 debian kernel: [161050.492080] virbr1: port 2(vnet0) entered blocking state
Dec 4 11:56:09 debian kernel: [161050.492081] virbr1: port 2(vnet0) entered disabled state
Dec 4 11:56:09 debian kernel: [161050.492140] device vnet0 entered promiscuous mode
Dec 4 11:56:09 debian kernel: [161050.492304] virbr1: port 2(vnet0) entered blocking state
Dec 4 11:56:09 debian kernel: [161050.492306] virbr1: port 2(vnet0) entered listening state
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.3920] manager: (vnet0): new Tun device (/org/freedesktop/NetworkManager/Devices/20)
Dec 4 11:56:09 debian systemd-udevd[125312]: Using default interface naming scheme 'v240'.
Dec 4 11:56:09 debian systemd-udevd[125312]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4246] device (vnet0): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4286] keyfile: add connection /run/NetworkManager/system-connections/vnet0.nmconnection (db2c5b63-ff10-4b2d-8690-c6d5b484cb6d,"vnet0")
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4309] device (vnet0): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4318] device (vnet0): Activation: starting connection 'vnet0' (db2c5b63-ff10-4b2d-8690-c6d5b484cb6d)
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4319] device (vnet0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4323] device (vnet0): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4325] device (vnet0): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4327] device (virbr1): bridge port vnet0 was attached
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4327] device (vnet0): Activation: connection 'vnet0' enslaved, continuing activation
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4329] device (vnet0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4396] device (vnet0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4400] device (vnet0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.4505] device (vnet0): Activation: successful, device activated.
Dec 4 11:56:09 debian dbus-daemon[1491]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.7' (uid=0 pid=1494 comm="/usr/sbin/NetworkManager --no-daemon ")
Dec 4 11:56:09 debian systemd[1]: Starting Network Manager Script Dispatcher Service...
Dec 4 11:56:09 debian dbus-daemon[1491]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Dec 4 11:56:09 debian systemd[1]: Started Network Manager Script Dispatcher Service.
Dec 4 11:56:09 debian nm-dispatcher: req:1 'up' [vnet0]: new request (1 scripts)
Dec 4 11:56:09 debian nm-dispatcher: req:1 'up' [vnet0]: start running ordered scripts...
Dec 4 11:56:09 debian kernel: [161050.633003] audit: type=1400 audit(1638636969.526:60): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-e9b93fae-7ee0-4096-b496-208aa0be517a" pid=125320 comm="apparmor_parser"
Dec 4 11:56:09 debian systemd-udevd[125321]: Using default interface naming scheme 'v240'.
Dec 4 11:56:09 debian kernel: [161050.635853] virbr2: port 2(vnet1) entered blocking state
Dec 4 11:56:09 debian kernel: [161050.635856] virbr2: port 2(vnet1) entered disabled state
Dec 4 11:56:09 debian kernel: [161050.635976] device vnet1 entered promiscuous mode
Dec 4 11:56:09 debian kernel: [161050.636217] virbr2: port 2(vnet1) entered blocking state
Dec 4 11:56:09 debian kernel: [161050.636219] virbr2: port 2(vnet1) entered listening state
Dec 4 11:56:09 debian systemd-udevd[125321]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5352] manager: (vnet1): new Tun device (/org/freedesktop/NetworkManager/Devices/21)
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5510] device (vnet1): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5534] keyfile: add connection /run/NetworkManager/system-connections/vnet1.nmconnection (468ebb82-a253-4d81-a54c-911564d3f4d0,"vnet1")
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5541] device (vnet1): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5549] device (vnet1): Activation: starting connection 'vnet1' (468ebb82-a253-4d81-a54c-911564d3f4d0)
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5550] device (vnet1): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5555] device (vnet1): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5558] device (vnet1): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5559] device (virbr2): bridge port vnet1 was attached
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5560] device (vnet1): Activation: connection 'vnet1' enslaved, continuing activation
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5561] device (vnet1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5567] device (vnet1): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5570] device (vnet1): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:09 debian NetworkManager[1494]: <info> [1638636969.5648] device (vnet1): Activation: successful, device activated.
Dec 4 11:56:09 debian nm-dispatcher: req:2 'up' [vnet1]: new request (1 scripts)
Dec 4 11:56:09 debian nm-dispatcher: req:2 'up' [vnet1]: start running ordered scripts...
Dec 4 11:56:09 debian kernel: [161050.792545] audit: type=1400 audit(1638636969.686:61): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-e9b93fae-7ee0-4096-b496-208aa0be517a" pid=125367 comm="apparmor_parser"
Dec 4 11:56:09 debian libvirtd[1750]: Domain id=6 name='Whonix-Gateway' uuid=e9b93fae-7ee0-4096-b496-208aa0be517a is tainted: host-cpu
Dec 4 11:56:10 debian avahi-daemon[1487]: Joining mDNS multicast group on interface vnet0.IPv6 with address fe80::fc54:ff:fe05:b4b9.
Dec 4 11:56:10 debian avahi-daemon[1487]: New relevant interface vnet0.IPv6 for mDNS.
Dec 4 11:56:10 debian avahi-daemon[1487]: Registering new address record for fe80::fc54:ff:fe05:b4b9 on vnet0.*.
Dec 4 11:56:10 debian avahi-daemon[1487]: Joining mDNS multicast group on interface vnet1.IPv6 with address fe80::fc54:ff:fe06:4a00.
Dec 4 11:56:10 debian avahi-daemon[1487]: New relevant interface vnet1.IPv6 for mDNS.
Dec 4 11:56:10 debian avahi-daemon[1487]: Registering new address record for fe80::fc54:ff:fe06:4a00 on vnet1.*.
Dec 4 11:56:11 debian kernel: [161052.517316] virbr1: port 2(vnet0) entered learning state
Dec 4 11:56:11 debian kernel: [161052.645328] virbr2: port 2(vnet1) entered learning state
Dec 4 11:56:12 debian kernel: [161053.267983] audit: type=1400 audit(1638636972.162:62): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-a9008a46-7469-47fc-8bcc-4449ae8f2ee8" pid=125417 comm="apparmor_parser"
Dec 4 11:56:12 debian kernel: [161053.446910] audit: type=1400 audit(1638636972.342:63): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-a9008a46-7469-47fc-8bcc-4449ae8f2ee8" pid=125420 comm="apparmor_parser"
Dec 4 11:56:12 debian bacula-dir[125155]: bacula-dir: dird.c:1229-0 Could not open Catalog "MyCatalog", database "XXX_DBNAME_XXX".
Dec 4 11:56:12 debian bacula-dir[125155]: bacula-dir: dird.c:1234-0 postgresql.c:332 Unable to connect to PostgreSQL server. Database=XXX_DBNAME_XXX User=XXX_DBUSER_XXX
Dec 4 11:56:12 debian bacula-dir[125155]: Possible causes: SQL server not running; password incorrect; max_connections exceeded.
Dec 4 11:56:12 debian bacula-dir[125155]: 04-Dec 11:56 bacula-dir ERROR TERMINATION
Dec 4 11:56:12 debian bacula-dir[125155]: Please correct configuration file: /etc/bacula/bacula-dir.conf
Dec 4 11:56:12 debian systemd[1]: bacula-director.service: Control process exited, code=exited, status=1/FAILURE
Dec 4 11:56:12 debian systemd[1]: bacula-director.service: Failed with result 'exit-code'.
Dec 4 11:56:12 debian systemd[1]: Failed to start Bacula Director Daemon service.
Dec 4 11:56:12 debian kernel: [161053.582800] audit: type=1400 audit(1638636972.478:64): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-a9008a46-7469-47fc-8bcc-4449ae8f2ee8" pid=125423 comm="apparmor_parser"
Dec 4 11:56:12 debian kernel: [161053.721423] audit: type=1400 audit(1638636972.618:65): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-a9008a46-7469-47fc-8bcc-4449ae8f2ee8" pid=125426 comm="apparmor_parser"
Dec 4 11:56:12 debian kernel: [161053.729968] virbr2: port 3(vnet2) entered blocking state
Dec 4 11:56:12 debian kernel: [161053.729971] virbr2: port 3(vnet2) entered disabled state
Dec 4 11:56:12 debian kernel: [161053.730078] device vnet2 entered promiscuous mode
Dec 4 11:56:12 debian kernel: [161053.730368] virbr2: port 3(vnet2) entered blocking state
Dec 4 11:56:12 debian kernel: [161053.730370] virbr2: port 3(vnet2) entered listening state
Dec 4 11:56:12 debian systemd-udevd[125321]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6296] manager: (vnet2): new Tun device (/org/freedesktop/NetworkManager/Devices/22)
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6448] device (vnet2): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6475] keyfile: add connection /run/NetworkManager/system-connections/vnet2.nmconnection (efc91c6d-6f2c-46db-a731-12e0e3dd38b6,"vnet2")
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6484] device (vnet2): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'external')
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6493] device (vnet2): Activation: starting connection 'vnet2' (efc91c6d-6f2c-46db-a731-12e0e3dd38b6)
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6514] device (vnet2): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6519] device (vnet2): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6522] device (vnet2): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6524] device (virbr2): bridge port vnet2 was attached
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6524] device (vnet2): Activation: connection 'vnet2' enslaved, continuing activation
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6526] device (vnet2): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6531] device (vnet2): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6533] device (vnet2): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Dec 4 11:56:12 debian NetworkManager[1494]: <info> [1638636972.6636] device (vnet2): Activation: successful, device activated.
Dec 4 11:56:12 debian nm-dispatcher: req:3 'up' [vnet2]: new request (1 scripts)
Dec 4 11:56:12 debian nm-dispatcher: req:3 'up' [vnet2]: start running ordered scripts...
Dec 4 11:56:12 debian libvirtd[1750]: Domain id=7 name='kaliwhonix' uuid=a9008a46-7469-47fc-8bcc-4449ae8f2ee8 is tainted: host-cpu
Dec 4 11:56:13 debian NetworkManager[1494]: <info> [1638636973.4320] device (virbr1): carrier: link connected
Dec 4 11:56:13 debian kernel: [161054.533233] virbr1: port 2(vnet0) entered forwarding state
Dec 4 11:56:13 debian kernel: [161054.533235] virbr1: topology change detected, propagating
Dec 4 11:56:13 debian NetworkManager[1494]: <info> [1638636973.5601] device (virbr2): carrier: link connected
Dec 4 11:56:13 debian kernel: [161054.661230] virbr2: port 2(vnet1) entered forwarding state
Dec 4 11:56:13 debian kernel: [161054.661233] virbr2: topology change detected, propagating
Dec 4 11:56:14 debian avahi-daemon[1487]: Joining mDNS multicast group on interface vnet2.IPv6 with address fe80::fc54:ff:fe55:218.
Dec 4 11:56:14 debian avahi-daemon[1487]: New relevant interface vnet2.IPv6 for mDNS.
Dec 4 11:56:14 debian avahi-daemon[1487]: Registering new address record for fe80::fc54:ff:fe55:218 on vnet2.*.
Dec 4 11:56:14 debian kernel: [161055.749233] virbr2: port 3(vnet2) entered learning state
Dec 4 11:56:16 debian kernel: [161057.765144] virbr2: port 3(vnet2) entered forwarding state
Dec 4 11:56:16 debian kernel: [161057.765146] virbr2: topology change detected, propagating
Dec 4 11:56:22 debian systemd[1]: NetworkManager-dispatcher.service: Succeeded.
Dec 4 11:57:12 debian systemd[1]: bacula-director.service: Service RestartSec=1min expired, scheduling restart.
Dec 4 11:57:12 debian systemd[1]: bacula-director.service: Scheduled restart job, restart counter is at 1785.
Dec 4 11:57:12 debian systemd[1]: Stopped Bacula Director Daemon service.
Dec 4 11:57:12 debian systemd[1]: Starting Bacula Director Daemon service...
Dec 4 11:57:42 debian bacula-dir[125609]: bacula-dir: dird.c:1229-0 Could not open Catalog "MyCatalog", database "XXX_DBNAME_XXX".
Dec 4 11:57:42 debian bacula-dir[125609]: bacula-dir: dird.c:1234-0 postgresql.c:332 Unable to connect to PostgreSQL server. Database=XXX_DBNAME_XXX User=XXX_DBUSER_XXX
Dec 4 11:57:42 debian bacula-dir[125609]: Possible causes: SQL server not running; password incorrect; max_connections exceeded.
Dec 4 11:57:42 debian bacula-dir[125609]: 04-Dec 11:57 bacula-dir ERROR TERMINATION
Dec 4 11:57:42 debian bacula-dir[125609]: Please correct configuration file: /etc/bacula/bacula-dir.conf
Dec 4 11:57:42 debian systemd[1]: bacula-director.service: Control process exited, code=exited, status=1/FAILURE
Dec 4 11:57:42 debian systemd[1]: bacula-director.service: Failed with result 'exit-code'.
Dec 4 11:57:42 debian systemd[1]: Failed to start Bacula Director Daemon service.
Dec 4 11:58:28 debian avahi-daemon[1487]: Interface vnet2.IPv6 no longer relevant for mDNS.
Dec 4 11:58:28 debian avahi-daemon[1487]: Leaving mDNS multicast group on interface vnet2.IPv6 with address fe80::fc54:ff:fe55:218.
Dec 4 11:58:28 debian kernel: [161189.530902] virbr2: port 3(vnet2) entered disabled state
Dec 4 11:58:28 debian kernel: [161189.532432] device vnet2 left promiscuous mode
Dec 4 11:58:28 debian kernel: [161189.532438] virbr2: port 3(vnet2) entered disabled state
Dec 4 11:58:28 debian avahi-daemon[1487]: Withdrawing address record for fe80::fc54:ff:fe55:218 on vnet2.
Dec 4 11:58:28 debian NetworkManager[1494]: <info> [1638637108.4738] device (vnet2): state change: activated -> unmanaged (reason 'unmanaged', sys-iface-state: 'removed')
Dec 4 11:58:28 debian NetworkManager[1494]: <info> [1638637108.4739] device (virbr2): bridge port vnet2 was detached
Dec 4 11:58:28 debian NetworkManager[1494]: <info> [1638637108.4740] device (vnet2): released from master device virbr2
Dec 4 11:58:28 debian dbus-daemon[1491]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.7' (uid=0 pid=1494 comm="/usr/sbin/NetworkManager --no-daemon ")
Dec 4 11:58:28 debian systemd[1]: Starting Network Manager Script Dispatcher Service...
Dec 4 11:58:28 debian dbus-daemon[1491]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Dec 4 11:58:28 debian systemd[1]: Started Network Manager Script Dispatcher Service.
Dec 4 11:58:28 debian nm-dispatcher: req:1 'down' [vnet2]: new request (1 scripts)
Dec 4 11:58:28 debian nm-dispatcher: req:1 'down' [vnet2]: start running ordered scripts...
Dec 4 11:58:28 debian libvirtd[1750]: internal error: End of file from qemu monitor
Dec 4 11:58:28 debian kernel: [161189.955670] kauditd_printk_skb: 1 callbacks suppressed
Dec 4 11:58:28 debian kernel: [161189.955672] audit: type=1400 audit(1638637108.854:67): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="libvirt-a9008a46-7469-47fc-8bcc-4449ae8f2ee8" pid=125761 comm="apparmor_parser"
Dec 4 11:58:33 debian avahi-daemon[1487]: Interface vnet0.IPv6 no longer relevant for mDNS.
Dec 4 11:58:33 debian avahi-daemon[1487]: Leaving mDNS multicast group on interface vnet0.IPv6 with address fe80::fc54:ff:fe05:b4b9.
Dec 4 11:58:33 debian kernel: [161194.626365] virbr1: port 2(vnet0) entered disabled state
Dec 4 11:58:33 debian kernel: [161194.627405] device vnet0 left promiscuous mode
Dec 4 11:58:33 debian kernel: [161194.627410] virbr1: port 2(vnet0) entered disabled state
Dec 4 11:58:33 debian avahi-daemon[1487]: Withdrawing address record for fe80::fc54:ff:fe05:b4b9 on vnet0.
Dec 4 11:58:33 debian NetworkManager[1494]: <info> [1638637113.5705] device (vnet0): state change: activated -> unmanaged (reason 'unmanaged', sys-iface-state: 'removed')
Dec 4 11:58:33 debian NetworkManager[1494]: <info> [1638637113.5706] device (virbr1): bridge port vnet0 was detached
Dec 4 11:58:33 debian NetworkManager[1494]: <info> [1638637113.5706] device (vnet0): released from master device virbr1
Dec 4 11:58:33 debian nm-dispatcher: req:2 'down' [vnet0]: new request (1 scripts)
Dec 4 11:58:33 debian nm-dispatcher: req:2 'down' [vnet0]: start running ordered scripts...
Dec 4 11:58:33 debian avahi-daemon[1487]: Interface vnet1.IPv6 no longer relevant for mDNS.
Dec 4 11:58:33 debian avahi-daemon[1487]: Leaving mDNS multicast group on interface vnet1.IPv6 with address fe80::fc54:ff:fe06:4a00.
Dec 4 11:58:33 debian kernel: [161194.709691] virbr2: port 2(vnet1) entered disabled state
Dec 4 11:58:33 debian kernel: [161194.711157] device vnet1 left promiscuous mode
Dec 4 11:58:33 debian kernel: [161194.711160] virbr2: port 2(vnet1) entered disabled state
Dec 4 11:58:33 debian avahi-daemon[1487]: Withdrawing address record for fe80::fc54:ff:fe06:4a00 on vnet1.
Dec 4 11:58:33 debian NetworkManager[1494]: <info> [1638637113.6504] device (vnet1): state change: activated -> unmanaged (reason 'unmanaged', sys-iface-state: 'removed')
Dec 4 11:58:33 debian NetworkManager[1494]: <info> [1638637113.6505] device (virbr2): bridge port vnet1 was detached
Dec 4 11:58:33 debian NetworkManager[1494]: <info> [1638637113.6506] device (vnet1): released from master device virbr2
Dec 4 11:58:33 debian nm-dispatcher: req:3 'down' [vnet1]: new request (1 scripts)
Dec 4 11:58:33 debian nm-dispatcher: req:3 'down' [vnet1]: start running ordered scripts...
Dec 4 11:58:33 debian libvirtd[1750]: internal error: End of file from qemu monitor
Dec 4 11:58:34 debian kernel: [161195.162429] audit: type=1400 audit(1638637114.062:68): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="libvirt-e9b93fae-7ee0-4096-b496-208aa0be517a" pid=125798 comm="apparmor_parser"
Dec 4 11:58:38 debian kernel: [161200.026312] audit: type=1400 audit(1638637118.926:69): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-bc32a202-4a7e-45ca-a2f3-e55e78ef8998" pid=125805 comm="apparmor_parser"
Dec 4 11:58:39 debian kernel: [161200.177640] audit: type=1400 audit(1638637119.078:70): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-bc32a202-4a7e-45ca-a2f3-e55e78ef8998" pid=125808 comm="apparmor_parser"
Dec 4 11:58:39 debian kernel: [161200.327978] audit: type=1400 audit(1638637119.226:71): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-bc32a202-4a7e-45ca-a2f3-e55e78ef8998" pid=125811 comm="apparmor_parser"
Dec 4 11:58:39 debian kernel: [161200.475333] audit: type=1400 audit(1638637119.374:72): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-bc32a202-4a7e-45ca-a2f3-e55e78ef8998" pid=125814 comm="apparmor_parser"
Dec 4 11:58:39 debian NetworkManager[1494]: <info> [1638637119.3855] manager: (macvtap2): new Macvlan device (/org/freedesktop/NetworkManager/Devices/23)
Dec 4 11:58:39 debian systemd-udevd[125816]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Dec 4 11:58:39 debian systemd-udevd[125816]: Using default interface naming scheme 'v240'.
Dec 4 11:58:39 debian kernel: [161200.621739] audit: type=1400 audit(1638637119.522:73): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-bc32a202-4a7e-45ca-a2f3-e55e78ef8998" pid=125826 comm="apparmor_parser"
Dec 4 11:58:39 debian NetworkManager[1494]: <info> [1638637119.6625] device (macvtap2): carrier: link connected
Dec 4 11:58:41 debian avahi-daemon[1487]: Joining mDNS multicast group on interface macvtap2.IPv6 with address fe80::5054:ff:fe7c:a067.
Dec 4 11:58:41 debian avahi-daemon[1487]: New relevant interface macvtap2.IPv6 for mDNS.
Dec 4 11:58:41 debian avahi-daemon[1487]: Registering new address record for fe80::5054:ff:fe7c:a067 on macvtap2.*.
Dec 4 11:58:42 debian systemd[1]: bacula-director.service: Service RestartSec=1min expired, scheduling restart.
Dec 4 11:58:42 debian systemd[1]: bacula-director.service: Scheduled restart job, restart counter is at 1786.
Dec 4 11:58:42 debian systemd[1]: Stopped Bacula Director Daemon service.
Dec 4 11:58:42 debian systemd[1]: Starting Bacula Director Daemon service...
Dec 4 11:58:43 debian systemd[1]: NetworkManager-dispatcher.service: Succeeded.
我还发现了:
https://superuser.com/questions/954262/why-do-damaged-hard-drives-freeze-the-entire-system
可能是磁盘导致的?冻结发生在大量磁盘 IO 期间(我正在开发的服务正在写入大量数据)。我订购了一些 PCIe SATA 控制器,尝试将外部磁盘插入与主磁盘不同的控制器中。我还能做些什么来解决这个问题?
我运行了smartctl -a
这些磁盘,它们没有显示任何死机驱动器。我宁愿假设这些错误只是所有这些 IO 请求中的小故障,如果它没有冻结整个系统,这不会是一个大问题。
我无法逐个拔下硬件并查看会发生什么(有时这些冻结发生在一小时后,有时是一天,有时是一周。该服务需要外部磁盘才能正常运行,因此我必须关闭所有设备进行数周的调试。我希望还有其他方法可以找出问题所在)。
任何帮助都非常感谢。
答案1
我将所有硬盘插入第二个 SATA 控制器,问题就解决了。整整一个月都没有再出现死机。显然我的理论是正确的;我假设高负载硬盘活动期间的某些超时会导致整个控制器死机,从而阻止对交换的访问。禁用交换后,也没有出现死机。