Ubuntu 20.04 LTS NetworkManager.service fails to start

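A few months ago I started having problems with NetworkManager.service and could not connect to the internet. I would get an Ubuntu error pop-up saying the service had failed to start, but rebooting the computer would get it to start normally again, and at first this happened only rarely. Later it became more and more frequent, failing on every boot, and it took me several attempts to get it started. I found someone saying that the command sudo systemctl restart NetworkManager.service would get it running again, and for a while that did work (although I had to run it almost every time I rebooted).

Today, however, the command stopped working and produced an error, and now I cannot connect to the internet from Ubuntu even after rebooting and shutting down the computer several times: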

~$ sudo systemctl restart NetworkManager.service
Job for NetworkManager.service failed because a fatal signal was delivered causing the control process to dump core.
See "systemctl status NetworkManager.service" and "journalctl -xe" for details.

Checking its systemctl status, I get the following:

~$ systemctl status NetworkManager.service
● NetworkManager.service - Network Manager
     Loaded: loaded (/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
     Active: failed (Result: core-dump) since Sun 2021-06-27 14:40:30 EDT; 2min 9s ago
       Docs: man:NetworkManager(8)
    Process: 3222 ExecStart=/usr/sbin/NetworkManager --no-daemon (code=dumped, signal=BUS)
   Main PID: 3222 (code=dumped, signal=BUS)

Jun 27 14:40:30 user systemd[1]: NetworkManager.service: Scheduled restart job, restart counter is at 5.
Jun 27 14:40:30 user systemd[1]: Stopped Network Manager.
Jun 27 14:40:30 user systemd[1]: NetworkManager.service: Start request repeated too quickly.
Jun 27 14:40:30 user systemd[1]: NetworkManager.service: Failed with result 'core-dump'.
Jun 27 14:40:30 user systemd[1]: Failed to start Network Manager.

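As for the journalctl -xe output, I have put the full log in this pastebin: https://pastebin.com/gTJMktN5. There are many errors like the ones above showing that it failed with a core dump, but this is one block that may be relevant: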

-- A start job for unit NetworkManager.service has begun execution.
-- 
-- The job identifier is 1897.
Jun 27 14:40:28 user kernel: ata4.00: exception Emask 0x0 SAct 0x200000 SErr 0x0 action 0x0
Jun 27 14:40:28 user kernel: ata4.00: irq_stat 0x40000008
Jun 27 14:40:28 user kernel: ata4.00: failed command: READ FPDMA QUEUED
Jun 27 14:40:28 user kernel: ata4.00: cmd 60/08:a8:70:9a:41/00:00:5a:00:00/40 tag 21 ncq dma 4096 in
                                      res 41/40:00:74:9a:41/00:00:5a:00:00/00 Emask 0x409 (media error) <F>
Jun 27 14:40:28 user kernel: ata4.00: status: { DRDY ERR }
Jun 27 14:40:28 user kernel: ata4.00: error: { UNC }
Jun 27 14:40:28 user kernel: ata4.00: configured for UDMA/133
Jun 27 14:40:28 user kernel: sd 3:0:0:0: [sdb] tag#21 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
Jun 27 14:40:28 user kernel: sd 3:0:0:0: [sdb] tag#21 Sense Key : Medium Error [current] 
Jun 27 14:40:28 user kernel: sd 3:0:0:0: [sdb] tag#21 Add. Sense: Unrecovered read error - auto reallocate failed
Jun 27 14:40:28 user kernel: sd 3:0:0:0: [sdb] tag#21 CDB: Read(10) 28 00 5a 41 9a 70 00 00 08 00
Jun 27 14:40:28 user kernel: blk_update_request: I/O error, dev sdb, sector 1514248820 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Jun 27 14:40:28 user kernel: ata4: EH complete
Jun 27 14:40:28 user kernel: ata4.00: exception Emask 0x0 SAct 0x4000000 SErr 0x0 action 0x0
Jun 27 14:40:28 user kernel: ata4.00: irq_stat 0x40000008
Jun 27 14:40:28 user kernel: ata4.00: failed command: READ FPDMA QUEUED
Jun 27 14:40:28 user kernel: ata4.00: cmd 60/08:d0:70:9a:41/00:00:5a:00:00/40 tag 26 ncq dma 4096 in
                                      res 41/40:00:74:9a:41/00:00:5a:00:00/00 Emask 0x409 (media error) <F>
Jun 27 14:40:28 user kernel: ata4.00: status: { DRDY ERR }
Jun 27 14:40:28 user kernel: ata4.00: error: { UNC }
Jun 27 14:40:28 user kernel: ata4.00: configured for UDMA/133
Jun 27 14:40:28 user kernel: sd 3:0:0:0: [sdb] tag#26 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
Jun 27 14:40:28 user kernel: sd 3:0:0:0: [sdb] tag#26 Sense Key : Medium Error [current] 
Jun 27 14:40:28 user kernel: sd 3:0:0:0: [sdb] tag#26 Add. Sense: Unrecovered read error - auto reallocate failed
Jun 27 14:40:28 user kernel: sd 3:0:0:0: [sdb] tag#26 CDB: Read(10) 28 00 5a 41 9a 70 00 00 08 00
Jun 27 14:40:28 user kernel: blk_update_request: I/O error, dev sdb, sector 1514248820 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Jun 27 14:40:28 user kernel: ata4: EH complete
Jun 27 14:40:28 user systemd[1]: NetworkManager.service: Main process exited, code=dumped, status=7/BUS
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- An ExecStart= process belonging to unit NetworkManager.service has exited.

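I have seen similar posts with replies saying to update the kernel version and other things, but I am currently running the latest version of 20.04 LTS and I don't think I need to stray far from it.

I am running Ubuntu 20.04.2 LTS x86_64 with this kernel: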

~$ uname -a
Linux user 5.8.0-59-generic #66~20.04.1-Ubuntu SMP Thu Jun 17 11:14:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

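While collecting these logs, I also started getting frequent error pop-ups for services that had never failed before. They are for the following services: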

/usr/libexec/colord
/usr/libexec/tracker-extract
/usr/libexec/tracker-miner-fs
/usr/lib/packagekit/packagekitd

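I don't know whether they are related, but since they started at the same time the restart command I was using stopped working, there seems to be a bigger problem. In addition, rebooting and shutting down the computer produce error screens that scroll by too quickly for me to read during shutdown.

Any help with debugging or finding a workaround would be greatly appreciated.

Edit: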

Here is the output of grep -i FPDMA /var/log/syslog*: https://pastebin.com/tazDug7H

Here is the output of dmesg. There are some I/O errors in it. For the record, the installation drive is /dev/sdb: https://pastebin.com/ctefUjUA
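To see at a glance whether the I/O errors keep hitting the same spot on the disk, the kernel messages can be condensed into a per-sector count. This is only a rough sketch: the function name failing_sectors is made up for illustration, and it assumes the log lines use the "I/O error, dev X, sector N" wording seen in the journal excerpt above.

```shell
# Hypothetical helper: summarize block-layer I/O errors from log text on stdin.
# Prints one line per distinct failing sector: "<count> <device> <sector>".
# Assumes the kernel wording "I/O error, dev <dev>, sector <n>" shown above.
failing_sectors() {
    sed -n 's/.*I\/O error, dev \([^,]*\), sector \([0-9][0-9]*\).*/\1 \2/p' |
        sort | uniq -c | awk '{print $1, $2, $3}'
}

# Example use (not run here):
#   dmesg | failing_sectors
#   journalctl -k | failing_sectors
```

If the same sector number shows up again and again, that suggests a specific unreadable region on the drive rather than random transfer errors.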

fsck output for the installation drive:

~$ sudo fsck -f /dev/sdb2
fsck from util-linux 2.34
e2fsck 1.45.5 (07-Jan-2020)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb2: 635347/61022208 files (1.4% non-contiguous), 29081215/244059648 blocks

Screenshot of the SMART test

Answer 1

NCQ

You are getting disk NCQ errors...

Jun 27 14:40:28 user kernel: ata4.00: failed command: READ FPDMA QUEUED
Jun 27 14:40:28 user kernel: ata4.00: cmd 60/08:a8:70:9a:41/00:00:5a:00:00/40 tag 21 ncq dma 4096 in
                                      res 41/40:00:74:9a:41/00:00:5a:00:00/00 Emask 0x409 (media error) <F>
Jun 27 14:40:28 user kernel: ata4.00: status: { DRDY ERR }
Jun 27 14:40:28 user kernel: ata4.00: error: { UNC }

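Native Command Queuing (NCQ) is an extension of the Serial ATA protocol that allows hard disk drives to internally optimize the order in which received read and write commands are executed.

Edit /etc/default/grub with sudo -H gedit /etc/default/grub and change the following line to include this extra parameter. Then run sudo update-grub to write the change to disk. Reboot. Monitor for hangs, etc., and watch grep -i FPDMA /var/log/syslog* or dmesg for continued error messages.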

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash libata.force=noncq"
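
After the reboot it may be worth confirming that the kernel actually booted with the new parameter before drawing conclusions from the logs. The sketch below is one way to check; cmdline_has_param is a made-up helper name, and it only looks for an exact, space-separated match in the command line text it is given.

```shell
# Hypothetical helper: succeed if a kernel command line contains a parameter.
#   $1: command line text (e.g. the contents of /proc/cmdline)
#   $2: parameter to look for, matched as a whole space-separated word
cmdline_has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use after rebooting (not run here):
#   cmdline_has_param "$(cat /proc/cmdline)" "libata.force=noncq" \
#       && echo "NCQ is disabled for this boot" \
#       || echo "parameter missing; was update-grub run?"
```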

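Filesystem check

  • Boot an Ubuntu Live DVD/USB in "Try Ubuntu" mode
  • Open a terminal window by pressing Ctrl+Alt+T
  • Type sudo fdisk -l
  • Identify the /dev/sdXX device name for your "Linux Filesystem"
  • Type sudo fsck -f /dev/sdXX, replacing sdXX with the device name you found earlier
  • Repeat the fsck command if there were errors
  • Type reboot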
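
The "identify the device name" step can be narrowed down by filtering the fdisk listing. This is just a sketch with a made-up function name, linux_fs_partitions, and it assumes a GPT disk where fdisk prints the partition type as "Linux filesystem"; on an MBR disk the type column reads differently, so check the full listing by hand in that case.

```shell
# Hypothetical helper: print the device names of partitions that
# `fdisk -l` (read from stdin) lists with the type "Linux filesystem".
# Assumes GPT-style output where the first column is the /dev/... name.
linux_fs_partitions() {
    awk '$1 ~ /^\/dev\// && /Linux filesystem/ {print $1}'
}

# Example use from the live session (not run here):
#   sudo fdisk -l | linux_fs_partitions
# Then run fsck on an unmounted partition from that list:
#   sudo fsck -f /dev/sdXX
```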

SSD

Regarding your SanDisk SSD PLUS 1TB, check for a firmware update. Go to the SanDisk website and download their Dashboard software. It requires Windows.

https://kb.sandisk.com/app/answers/detail/a_id/15108/~/dashboard-support-information

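Update #1:

Although SMART says the SSD is fine, it isn't. You have 6146 uncorrectable errors and 21388 uncorrectable ECC errors! Since you have already replaced the cable and updated the firmware, the problem is either the SATA port or a bad SSD.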
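
To keep an eye on those counters without a screenshot, the raw values can be pulled out of the smartctl attribute table. A rough sketch follows; smart_raw_value is a made-up helper, and the attribute name used in the example (Reported_Uncorrect) is an assumption, since the names differ between drives and the screenshot is not reproduced here. Use whatever names your own smartctl -A output shows.

```shell
# Hypothetical helper: print the raw value of a named SMART attribute
# from `smartctl -A` output on stdin.
# Assumes the usual 10-column table where column 2 is the attribute name
# and column 10 starts the raw value (only its first token is printed).
smart_raw_value() {
    awk -v name="$1" '$2 == name {print $10}'
}

# Example use (not run here; the attribute name is an assumption):
#   sudo smartctl -A /dev/sdb | smart_raw_value Reported_Uncorrect
```

If the number keeps climbing between checks, the drive is still failing reads, whatever the overall SMART health verdict says.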
