Upgraded a 16.04 server to 18.04.1, and now I see strange errors such as missing files, DNS resolution failures, and failure to connect to the bus

OK, this is an upgrade from 16.04.1 LTS to 18.04.1 LTS on a headless server. After the upgrade finished and the machine rebooted, the following happens:

Trying to upgrade packages only results in this:

~$ sudo apt upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  policykit-1 screen smartmontools
3 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
Need to get 0 B/1,081 kB of archives.
After this operation, 147 kB of additional disk space will be used.
Do you want to continue? [Y/n] y

(Reading database ... 159133 files and directories currently installed.)
Preparing to unpack .../smartmontools_6.5+svn4324-1_amd64.deb ...
Failed to connect to bus: No such file or directory
[...]

Is it talking about D-Bus? Anyway, how about refreshing the package lists?

~$ sudo apt update
Err:1 http://security.ubuntu.com/ubuntu bionic-security InRelease
  Temporary failure resolving 'security.ubuntu.com'
Err:2 http://se.archive.ubuntu.com/ubuntu bionic InRelease
  Temporary failure resolving 'se.archive.ubuntu.com'
Err:3 http://se.archive.ubuntu.com/ubuntu bionic-updates InRelease
  Temporary failure resolving 'se.archive.ubuntu.com'
Err:4 http://se.archive.ubuntu.com/ubuntu bionic-backports InRelease
  Temporary failure resolving 'se.archive.ubuntu.com'
Reading package lists... Done
Building dependency tree
Reading state information... Done
1 package can be upgraded. Run 'apt list --upgradable' to see it.
W: Failed to fetch http://se.archive.ubuntu.com/ubuntu/dists/bionic/InRelease  Temporary failure resolving 'se.archive.ubuntu.com'
W: Failed to fetch http://se.archive.ubuntu.com/ubuntu/dists/bionic-updates/InRelease  Temporary failure resolving 'se.archive.ubuntu.com'
W: Failed to fetch http://se.archive.ubuntu.com/ubuntu/dists/bionic-backports/InRelease  Temporary failure resolving 'se.archive.ubuntu.com'
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/bionic-security/InRelease  Temporary failure resolving 'security.ubuntu.com'
W: Some index files failed to download. They have been ignored, or old ones used instead.

What? Is DNS broken?

~$ wget www.google.com
--2018-11-10 01:55:16--  http://www.google.com/
Resolving www.google.com (www.google.com)... failed: Temporary failure in name resolution.
wget: unable to resolve host address ‘www.google.com’

Looks like it. Hmm, ifconfig doesn't show any DNS information. Google led me to this question, so let's try its suggestion.

~$ systemd-resolve --status
sd_bus_open_system: No such file or directory

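OK, now I'm getting more and more lost. Google turns up this question, which is about Docker and so not my situation, but it also mentions the problem of systemd not being PID 1: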

~$ ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.1  0.1 159752  9116 ?        Ss   01:25   0:02 /sbin/init splash nomdmonddf nomdmonisw
root         2  0.0  0.0      0     0 ?        S    01:25   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    01:25   0:00 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S<   01:25   0:00 [kworker/0:0H]
root         7  0.0  0.0      0     0 ?        S    01:25   0:00 [rcu_sched]
root         8  0.0  0.0      0     0 ?        S    01:25   0:00 [rcu_bh]
root         9  0.0  0.0      0     0 ?        S    01:25   0:00 [migration/0]
root        10  0.0  0.0      0     0 ?        S    01:25   0:00 [watchdog/0]
root        11  0.0  0.0      0     0 ?        S    01:25   0:00 [watchdog/1]
[...]

I guess that's it? Is this the root cause or just a symptom? How do I fix it?
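The ps listing alone does not settle the PID 1 question, since on Ubuntu /sbin/init is normally a symlink to systemd. A check that does not depend on the bus is the one described in sd_booted(3): test whether the directory /run/systemd/system exists. The following is a minimal sketch; the function name booted_with_systemd and its optional path argument are hypothetical and exist only to make the check easy to try out.

```shell
# Return success if the system was booted with systemd as init.
# Mirrors the documented sd_booted(3) test: the directory
# /run/systemd/system exists only on systemd-booted systems.
# The optional argument is an assumption added here for testing.
booted_with_systemd() {
    [ -d "${1:-/run/systemd/system}" ]
}

if booted_with_systemd; then
    echo "systemd is the init system"
else
    echo "systemd is NOT the init system"
fi
```

If this reports that systemd is the init system, the "Failed to connect to bus" errors are more likely a symptom of something else than a sign that systemd is not PID 1.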

Answer 1

After upgrading to 18.04.1, I hit the same errors on one OVH VPS. Other, very similar servers ran fine after the upgrade, which helped me track the problem down.

Going down the rabbit hole, I found:

sudo systemctl
...
systemd-logind.service     loaded failed failed
...

That led me to this thread: https://forum.proxmox.com/threads/systemd-logind-failures.44219/

I confirmed that on this server /var/run was not a symlink to /run, whereas on the other servers it was. This server had files in both places.
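To make the same comparison between a broken server and a healthy one, a small helper can report what /var/run actually is. This is only a sketch; the function name describe_path is hypothetical and not part of any tool mentioned here.

```shell
# Report whether a path is a symlink (and its target), a real
# directory, or absent. Defaults to /var/run, the path in question.
describe_path() {
    path="${1:-/var/run}"
    if [ -L "$path" ]; then
        printf 'symlink -> %s\n' "$(readlink "$path")"
    elif [ -d "$path" ]; then
        printf 'directory\n'
    else
        printf 'missing\n'
    fi
}

describe_path /var/run
```

On a healthy 18.04 system this would be expected to report a symlink pointing at /run; a plain "directory" result matches the broken state described above.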

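I more or less followed the suggestion there and created the symlink, using the following commands to get around the "Directory not empty" error on each line: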

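# Run as root. Move leftover files from /var/run into /run, deepest directories
# first, because mv cannot replace a non-empty directory ("Directory not empty").
mv -f /var/run/sudo/ts/* /run/sudo/ts/; rm -rf /var/run/sudo/ts
mv -f /var/run/sudo/* /run/sudo/; rm -rf /var/run/sudo
mv -f /var/run/* /run/; rm -rf /var/run
# Recreate /var/run as a symlink to /run, then reboot.
ln -s /run /var/run
reboot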

This step fixed part of the problem; for example, this command now works:

$ systemd-resolve --status

However, ordinary DNS resolution still failed, even though lookups through systemd-resolve worked:

$ ping google.com
ping: google.com: Temporary failure in name resolution

$ systemd-resolve google.com
google.com: 216.58.213.174

Following the suggestion at https://superuser.com/questions/1317623/nslookup-failed-but-systemd-resolved-works

I changed the /etc/resolv.conf symlink from /run/resolvconf/resolv.conf to /run/systemd/resolve/resolv.conf:

$ ls -l /etc/resolv.conf
lrwxrwxrwx 1 root root 29 Feb 24  2017 /etc/resolv.conf -> ../run/resolvconf/resolv.conf

$ sudo rm /etc/resolv.conf
$ sudo ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf

$ ping google.com
PING google.com (216.58.204.238) 56(84) bytes of data.
64 bytes from par21s06-in-f14.1e100.net (216.58.204.238): icmp_seq=1 ttl=53 time=9.11 ms
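After repointing the symlink, it can be useful to confirm which name servers the resulting file actually contains, since that is what ping and other libc-based tools read. Below is a minimal sketch; the function name list_nameservers is hypothetical, and the addresses in the usage note are documentation placeholders, not values from this server.

```shell
# Print the address from every "nameserver" line of a resolv.conf-style
# file. Defaults to /etc/resolv.conf; comments and other options are skipped.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

if [ -e /etc/resolv.conf ]; then
    list_nameservers
fi
```

For a file containing the lines "nameserver 192.0.2.1" and "nameserver 192.0.2.2", the function prints those two addresses, one per line. An empty result suggests the symlink target is missing or has no servers configured.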

Now everything works.
