Why is /var/run/sshd lost after every boot?


I am running Ubuntu 16.04 containers under Proxmox 5.2-11. After applying the latest round of patches [1], I can no longer log in on the console or via ssh.

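I mounted the container's root filesystem on the hypervisor and added pts/0 to /etc/security/access.conf (we run pam_access), which allowed root to log in on the console. We already had root : lxc/tty0 lxc/tty1 lxc/tty2 in access.conf, which I thought would be enough, so it is puzzling why pts/0 is needed now.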

I noticed that ssh was not running, so I tried to start it manually (/usr/sbin/sshd -DDD -f /etc/ssh/sshd_config) and got this error:

Missing privilege separation directory: /var/run/sshd

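I created the directory by hand, and after starting ssh I could finally log in, but the problem persists after a reboot: the directory is not created. The only interesting part of the journalctl output [2] is the bit about "Operation not permitted", but it gives no further information.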

I am not very familiar with 16.04, so I would like to know how to find out more about the problem. I have no /var/log/syslog or /var/log/messages, only kern.log, so I am a bit lost.

[1]

systemd-sysv 229-4ubuntu21.9
libpam-systemd 229-4ubuntu21.9
libsystemd0 229-4ubuntu21.9
systemd 229-4ubuntu21.9
udev 229-4ubuntu21.9
libudev1 229-4ubuntu21.9
iproute2 4.3.0-1ubuntu3.16.04.4
libsasl2-modules-db 2.1.26.dfsg1-14ubuntu0.1
libsasl2-2 2.1.26.dfsg1-14ubuntu0.1
ldap-utils 2.4.42dfsg-2ubuntu3.4
libldap-2.4-2 2.4.42dfsg-2ubuntu3.4
libsasl2-modules 2.1.26.dfsg1-14ubuntu0.1
libgs9-common 9.25dfsg1-0ubuntu0.16.04.3
ghostscript 9.25dfsg1-0ubuntu0.16.04.3
libgs9 9.25dfsg1-0ubuntu0.16.04.3

[2]

Nov 27 10:13:48 host16 systemd[1]: Starting OpenBSD Secure Shell server...
Nov 27 10:13:48 host16 sshd[474]: Missing privilege separation directory: /var/run/sshd
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Control process exited, code=exited status=255
Nov 27 10:13:48 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Failed with result 'exit-code'.
Nov 27 10:13:48 host16 mysqld_safe[495]: Starting mysqld daemon with databases from /var/lib/mysql/mysql
Nov 27 10:13:48 host16 mysqld[500]: 181127 10:13:48 [Note] /usr/sbin/mysqld (mysqld 10.0.36-MariaDB-0ubuntu0.16.04.1) starting as process 499 ...
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Service hold-off time over, scheduling restart.
Nov 27 10:13:48 host16 systemd[1]: Stopped OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: Failed to reset devices.list on /system.slice/ssh.service: Operation not permitted
Nov 27 10:13:48 host16 systemd[1]: Starting OpenBSD Secure Shell server...
Nov 27 10:13:48 host16 sshd[502]: Missing privilege separation directory: /var/run/sshd
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Control process exited, code=exited status=255
Nov 27 10:13:48 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Failed with result 'exit-code'.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Service hold-off time over, scheduling restart.
Nov 27 10:13:48 host16 systemd[1]: Stopped OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: Failed to reset devices.list on /system.slice/ssh.service: Operation not permitted
Nov 27 10:13:48 host16 systemd[1]: Starting OpenBSD Secure Shell server...
Nov 27 10:13:48 host16 sshd[503]: Missing privilege separation directory: /var/run/sshd
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Control process exited, code=exited status=255
Nov 27 10:13:48 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Failed with result 'exit-code'.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Service hold-off time over, scheduling restart.
Nov 27 10:13:48 host16 systemd[1]: Stopped OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: Failed to reset devices.list on /system.slice/ssh.service: Operation not permitted
Nov 27 10:13:48 host16 systemd[1]: Starting OpenBSD Secure Shell server...
Nov 27 10:13:48 host16 sshd[504]: Missing privilege separation directory: /var/run/sshd
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Control process exited, code=exited status=255
Nov 27 10:13:48 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Failed with result 'exit-code'.
Nov 27 10:13:49 host16 systemd[1]: ssh.service: Service hold-off time over, scheduling restart.
Nov 27 10:13:49 host16 systemd[1]: Stopped OpenBSD Secure Shell server.
Nov 27 10:13:49 host16 systemd[1]: ssh.service: Start request repeated too quickly.
Nov 27 10:13:49 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:49 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:49 host16 systemd[1]: ssh.service: Failed with result 'start-limit-hit'.
Nov 27 10:13:49 host16 systemd[1]: Started /etc/rc.local Compatibility.
Nov 27 10:13:49 host16 systemd[1]: Failed to reset devices.list on /system.slice/plymouth-quit.service: Operation not permitted
Nov 27 10:13:49 host16 systemd[1]: Starting Terminate Plymouth Boot Screen...
Nov 27 10:13:49 host16 systemd[1]: Failed to reset devices.list on /system.slice/plymouth-quit-wait.service: Operation not permitted
Nov 27 10:13:49 host16 systemd[1]: Starting Hold until boot process finishes up...
Nov 27 10:13:49 host16 systemd[1]: Failed to reset devices.list on /system.slice/rc-local.service: Operation not permitted
Nov 27 10:13:49 host16 systemd[1]: Started Hold until boot process finishes up.
Nov 27 10:13:49 host16 systemd[1]: Started Container Getty on /dev/pts/1.
Nov 27 10:13:49 host16 systemd[1]: Started Container Getty on /dev/pts/0.
Nov 27 10:13:49 host16 systemd[1]: Failed to reset devices.list on /system.slice/console-getty.service: Operation not permitted
Nov 27 10:13:49 host16 systemd[1]: Started Console Getty.
Nov 27 10:13:49 host16 systemd[1]: Reached target Login Prompts.
Nov 27 10:13:49 host16 systemd[1]: Started Terminate Plymouth Boot Screen.
Nov 27 10:13:52 host16 nslcd[338]: accepting connections
Nov 27 10:13:52 host16 nslcd[275]:    ...done.
Nov 27 10:13:52 host16 systemd[1]: Started LSB: LDAP connection daemon.
Nov 27 10:13:52 host16 systemd[1]: Failed to reset devices.list on /system.slice/cron.service: Operation not permitted
Nov 27 10:13:52 host16 systemd[1]: Started Regular background program processing daemon.
Nov 27 10:13:52 host16 systemd[1]: Failed to reset devices.list on /system.slice/atd.service: Operation not permitted

Adding the output of systemd-tmpfiles --create:

Strange... I checked /tmp, and those files do not exist.

Answer 1

So /run (and /var/run, which is a symlink to it) is recreated on every reboot. But systemd-tmpfiles fails to recreate some of the entries in it, including (/var)/run/sshd.

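Apparently this has been fixed by an OpenVZ kernel upgrade. But to actually fix it right now, you need to edit /usr/lib/tmpfiles.d/sshd.conf and remove /var from the following line, so that d /var/run/sshd 0755 root root becomes: d /run/sshd 0755 root root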
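The edit described above can be sketched as a small shell helper. This is only an illustration: fix_sshd_tmpfiles is a hypothetical name, and writing the result to /etc/tmpfiles.d instead of editing the file under /usr/lib in place is my assumption, not something the answer prescribes.

```shell
# Sketch: copy the sshd tmpfiles.d entry, pointing it at /run/sshd
# instead of /var/run/sshd. SRC and DEST must be different files,
# because the redirection truncates DEST before sed reads SRC.
# fix_sshd_tmpfiles is a hypothetical helper, not part of any package.
fix_sshd_tmpfiles() {
    src=$1
    dest=$2
    sed 's|^d[[:space:]]\{1,\}/var/run/sshd|d /run/sshd|' "$src" > "$dest"
}

# Example invocation (as root); the destination path is an assumption:
#   fix_sshd_tmpfiles /usr/lib/tmpfiles.d/sshd.conf /etc/tmpfiles.d/sshd.conf
#   systemd-tmpfiles --create /etc/tmpfiles.d/sshd.conf
```

A file in /etc/tmpfiles.d takes precedence over a file of the same name in /usr/lib/tmpfiles.d, so a copy placed there should not be touched by package upgrades.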

That's it!

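Hopefully this bug will have been fixed by the time openssh-server is next upgraded (or is it really a bug in systemd or OpenVZ?). Otherwise you will probably run into the same problem again, since a package upgrade may overwrite the edited file.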

Answer 2

One mistake you made was trying to start sshd manually.

If you start sshd the official way instead, it should work. The service command knows the correct way to start services on your distribution, so this should work:

service ssh start

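With sysv init scripts, that is all you need to do. The directory is missing because /var/run is a symlink to /run, and /run is a tmpfs mount point. This means /var/run starts out empty on every boot. When you use the service command, the /etc/init.d/ssh script is used to start sshd, but before doing so the script creates /var/run/sshd if it does not exist.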
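The check that such an init script performs before starting sshd looks roughly like the following. This is a minimal sketch of the idea, not the actual contents of /etc/init.d/ssh; the function name ensure_privsep_dir and its optional argument are made up for illustration.

```shell
# Sketch: create the privilege separation directory if it is missing.
# ensure_privsep_dir is a hypothetical name; the optional argument only
# exists so the function can be tried out on a scratch directory.
ensure_privsep_dir() {
    dir=${1:-/var/run/sshd}
    if [ ! -d "$dir" ]; then
        mkdir -p "$dir" && chmod 0755 "$dir"
    fi
}

# Example invocation (as root), followed by a manual start:
#   ensure_privsep_dir && /usr/sbin/sshd -f /etc/ssh/sshd_config
```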

With systemd things are a bit different. There will be a file named /usr/lib/tmpfiles.d/sshd.conf with the following content:

d /var/run/sshd 0755 root root

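During boot, this should cause the /var/run/sshd directory to be created. You need to verify that the file exists and has the correct content. If the /var/run/sshd directory is still missing, you can run systemd-tmpfiles --create manually and check whether it gets created.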
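One way to script the verification described above is sketched here. The helper name has_sshd_tmpfiles_entry is hypothetical, and the pattern deliberately accepts both /var/run/sshd and /run/sshd, which is my assumption about what counts as "correct content".

```shell
# Sketch: succeed only if the tmpfiles.d file exists and contains a
# "d" (create directory) entry for the sshd runtime directory.
# has_sshd_tmpfiles_entry is a hypothetical helper name.
has_sshd_tmpfiles_entry() {
    conf=${1:-/usr/lib/tmpfiles.d/sshd.conf}
    [ -f "$conf" ] &&
        grep -Eq '^d[[:space:]]+(/var)?/run/sshd([[:space:]]|$)' "$conf"
}

# Example invocation (as root):
#   has_sshd_tmpfiles_entry && systemd-tmpfiles --create
#   ls -ld /var/run/sshd
```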

Answer 3

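Apparently the problem goes away when running OpenVZ kernel 2.6.32-042stab134.7 or newer. I find it odd that there is no fix for this in the systemd startup scripts. An ugly hack might work, such as automatically creating /run/sshd/ after boot and then starting sshd.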
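One possible shape for such a hack is a systemd drop-in that creates the directory right before sshd starts. The sketch below is untested on the affected setup; the helper name write_sshd_privsep_dropin and the file name privsep-dir.conf are made up, and it assumes the unit is called ssh.service as in the journal output of the question.

```shell
# Sketch: write a drop-in that makes ssh.service create /run/sshd
# before starting sshd. The helper name and the drop-in file name
# are hypothetical.
write_sshd_privsep_dropin() {
    dropin_dir=${1:-/etc/systemd/system/ssh.service.d}
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/privsep-dir.conf" <<'EOF'
[Service]
ExecStartPre=/bin/mkdir -p -m 0755 /run/sshd
EOF
}

# Example invocation (as root):
#   write_sshd_privsep_dropin
#   systemctl daemon-reload
#   systemctl restart ssh
```

A drop-in leaves the packaged unit file untouched, so it can be removed again once the kernel fix is in place.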

My output from systemd-tmpfiles --create:

[/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
fchownat() of /run/named failed: Invalid argument
Failed to openat(/dev/simfs): Operation not permitted
Failed to validate path /var/run/screen: Too many levels of symbolic links
Failed to validate path /var/run/sshd: Too many levels of symbolic links
Failed to validate path /var/run/sudo: Too many levels of symbolic links
Failed to validate path /var/run/sudo/ts: Too many levels of symbolic links
fchownat() of /run/systemd/netif failed: Invalid argument
fchownat() of /run/systemd/netif/links failed: Invalid argument
fchownat() of /run/systemd/netif/leases failed: Invalid argument
fchownat() of /run/log/journal failed: Invalid argument
fchownat() of /run/log/journal/e9e1d08bc42c48999865b96c250f40cc failed: Invalid argument
fchownat() of /run/log/journal/e9e1d08bc42c48999865b96c250f40cc/system.journal failed: Invalid argument

The changelog for OpenVZ 2.6.32-042stab134.7 says:

Running Ubuntu containers with systemd 229-4ubuntu21.9 could lead to services failing to start, because systemd-tmpfiles was unable to validate paths due to a symlink issue. (PSBM-90038)
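To see at a glance which directories are affected on a given container, the "Failed to validate path" lines from the systemd-tmpfiles output above can be reduced to bare paths. This is a small sketch that matches the message format shown in that output; failed_tmpfiles_paths is a hypothetical helper name.

```shell
# Sketch: read systemd-tmpfiles output on stdin and print only the
# paths that failed validation because of the symlink issue.
# failed_tmpfiles_paths is a hypothetical helper name.
failed_tmpfiles_paths() {
    sed -n 's/^Failed to validate path \(.*\): Too many levels of symbolic links$/\1/p'
}

# Example invocation (as root):
#   systemd-tmpfiles --create 2>&1 | failed_tmpfiles_paths
```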

Answer 4

I ran into this as well. In my case the problem was that ssh.socket had somehow become enabled. After I disabled ssh.socket, ssh.service did start properly at boot.
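Checking for and disabling the socket unit could look like the sketch below. The function name disable_ssh_socket_if_enabled is made up; the systemctl subcommands are standard, but whether this resolves the problem on a given container is an assumption based only on the experience described here.

```shell
# Sketch: disable (and stop) ssh.socket only if it is currently enabled.
# disable_ssh_socket_if_enabled is a hypothetical helper name.
disable_ssh_socket_if_enabled() {
    if systemctl is-enabled ssh.socket >/dev/null 2>&1; then
        systemctl disable --now ssh.socket
    fi
}

# Example invocation (as root), then start the regular service:
#   disable_ssh_socket_if_enabled
#   systemctl restart ssh.service
```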
