Mount fails silently - how to debug and fix?

I have an MD device, /dev/md1, with ext2 for /boot. However, I cannot mount it at /boot (where it belongs):

# mount | grep boot
Exit 1
# mount | grep md1
Exit 1

# fsck /dev/md1
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
/dev/md1: clean, 362/40960 files, 121097/163696 blocks

# mount -v /dev/md1 /boot
mount: /dev/md1 mounted on /boot.
# ll /boot/
total 0
# umount -v /boot
umount: /boot: not mounted
# umount /dev/md1
umount: /dev/md1: not mounted

Mounting the device somewhere else works:

# ll /mnt/tmp/ | wc 
      1       2       8
# mount -v /dev/md1 /mnt/tmp
mount: /dev/md1 mounted on /mnt/tmp.
# ll /mnt/tmp/ | wc 
     35     308    2811
# umount /mnt/tmp
umount: /mnt/tmp (/dev/md1) unmounted
umount: /mnt/tmp: not mounted
Exit 32

What could be the problem?

Update after seeing this similar question: something is unmounting our partition immediately:

# mount -v /dev/md1 /boot ; ll /boot/ | wc; sleep 0.1; ll /boot/| wc 
mount: /dev/md1 mounted on /boot.
     35     308    2811
      1       2       8

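systemd may be unmounting it in the background. I still don't know what to do about it. systemctl daemon-reload did not help. The output of systemctl status does not contain the strings /boot or mount.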
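The disappearing mount can also be checked in the kernel's mount table instead of being inferred from the directory listing. A minimal sketch, assuming a Linux system where /proc/self/mounts is readable; the helper name is_mounted is made up for illustration and is not part of any tool mentioned here:

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT appears as a mount target in the mount table.
# MOUNTS_FILE defaults to /proc/self/mounts; the optional argument exists
# only so the helper can be tried against a saved copy of the table.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' \
        "${2:-/proc/self/mounts}"
}

# Hypothetical usage, mirroring the test above:
#   mount /dev/md1 /boot
#   is_mounted /boot && echo "mounted"
#   sleep 0.1
#   is_mounted /boot || echo "already gone"
```

Comparing the second field exactly avoids false matches that a plain grep for "boot" could produce on other paths.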

Update 2:

# systemctl list-unit-files -t mount
UNIT FILE                     STATE   
dev-hugepages.mount           static  
dev-mqueue.mount              static  
proc-fs-nfsd.mount            static  
proc-sys-fs-binfmt_misc.mount static  
sys-fs-fuse-connections.mount static  
sys-kernel-config.mount       static  
sys-kernel-debug.mount        static  
tmp.mount                     disabled
var-lib-nfs-rpc_pipefs.mount  static  

9 unit files listed.

So, there is nothing about boot here. However, there is a hidden (to me, so far) boot.mount unit:

# journalctl -u boot.mount | filter-lines-by-hand
Jan 12 09:57:46 server systemd[1]: Unit boot.mount is bound to inactive unit dev-md-1.device. Stopping, too.
Jan 12 09:57:46 server systemd[1]: Unmounting /boot...
Jan 12 09:57:46 server umount[3069]: umount: /boot: target is busy.
Jan 12 09:57:46 server umount[3069]: (In some cases useful info about processes that use
Jan 12 09:57:46 server umount[3069]: the device is found by lsof(8) or fuser(1))
Jan 12 09:57:46 server systemd[1]: boot.mount mount process exited, code=exited status=32
Jan 12 09:57:46 server systemd[1]: Failed unmounting /boot.
Jan 12 09:57:46 server systemd[1]: Unit boot.mount is bound to inactive unit dev-md-1.device. Stopping, too.
Jan 12 09:57:46 server systemd[1]: Unmounting /boot...
Jan 12 09:57:46 server umount[3071]: umount: /boot: target is busy.
Jan 12 09:57:46 server umount[3071]: (In some cases useful info about processes that use
Jan 12 09:57:46 server umount[3071]: the device is found by lsof(8) or fuser(1))
Jan 12 09:57:46 server systemd[1]: boot.mount mount process exited, code=exited status=32
Jan 12 09:57:46 server systemd[1]: Failed unmounting /boot.
Jan 12 09:57:46 server systemd[1]: Unit boot.mount is bound to inactive unit dev-md-1.device. Stopping, too.
Jan 12 09:57:46 server systemd[1]: Unmounting /boot...
Jan 12 09:57:46 server umount[3073]: umount: /boot: target is busy.
Jan 12 09:57:46 server umount[3073]: (In some cases useful info about processes that use
Jan 12 09:57:46 server umount[3073]: the device is found by lsof(8) or fuser(1))
Jan 12 09:57:46 server systemd[1]: boot.mount mount process exited, code=exited status=32
Jan 12 09:57:46 server systemd[1]: Failed unmounting /boot.
Jan 12 09:57:46 server systemd[1]: Unit boot.mount is bound to inactive unit dev-md-1.device. Stopping, too.
Jan 12 09:57:46 server systemd[1]: Unmounting /boot...
Jan 12 09:57:46 server systemd[1]: Unmounted /boot.
Jan 12 09:57:46 server systemd[1]: Unit boot.mount entered failed state.

So, yes, it is systemd silently and annoyingly unmounting my /boot partition. I wonder how systemd "thinks" updates are supposed to work without that mount.
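The unit name boot.mount in the journal output above is derived from the mount point path: systemd drops the leading slash and turns the remaining slashes into dashes. The real tool for this is systemd-escape (for example `systemd-escape -p --suffix=mount /boot`). The function below is only a simplified sketch of that rule, with a made-up name, and it assumes paths without special characters:

```shell
# mount_unit_name PATH
# Prints the systemd mount unit name for an absolute mount point.
# Simplified: handles only paths made of plain characters and "/";
# systemd additionally escapes characters such as "-" as \x2d.
mount_unit_name() {
    p=${1#/}      # drop the leading slash
    p=${p%/}      # drop a trailing slash, if any
    if [ -z "$p" ]; then
        printf '%s\n' '-.mount'   # the root file system
    else
        printf '%s\n' "$(printf '%s' "$p" | tr '/' '-').mount"
    fi
}

# Hypothetical usage:
#   journalctl -u "$(mount_unit_name /boot)"
```

This makes it possible to query the journal for a mount point even when no matching unit file is listed.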

Can I revive it?

# systemctl restart boot.mount
Authorization not available. Check if polkit service is running or see debug message for more information.
(approx. 2 minute wait)
A dependency job for boot.mount failed. See 'journalctl -xe' for details.
Exit 1

# journalctl -u boot.mount | tail -2
Jan 12 10:07:26 server systemd[1]: Dependency failed for /boot.
Jan 12 10:07:26 server systemd[1]: Job boot.mount/start failed with result 'dependency'.

Indeed, journalctl -xe has the message

Unit dev-md-1.device has failed.

So, the only remaining question is: how can I make systemd aware that md1 is actually doing fine? This is a server, so I would rather not reboot.
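One detail worth checking is which device node the failed unit actually stands for. In systemd unit names a dash encodes a slash, so dev-md-1.device decodes to /dev/md/1, whereas /dev/md1 would be dev-md1.device. The sketch below (hypothetical helper name, simplified decoding) shows the mapping; `systemd-escape -u -p dev-md-1` is the real tool for it. Whether such a mismatch, for example through the device name used in /etc/fstab, is the cause on this server is an assumption that the post does not confirm:

```shell
# device_unit_to_path UNIT
# Prints the device node path that a .device unit name stands for.
# Simplified: ignores \xNN escapes, which systemd uses for a literal "-".
device_unit_to_path() {
    name=${1%.device}
    printf '/%s\n' "$(printf '%s' "$name" | tr '-' '/')"
}

# Hypothetical usage:
#   ls -l "$(device_unit_to_path dev-md-1.device)" /dev/md1
```

If the decoded path does not exist while /dev/md1 does, systemd would be waiting for a device node other than the one being mounted by hand.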

Update 3:

Trying to tell systemd that the md1 device is actually fine:

# systemctl reset-failed dev-md-1.device
Authorization not available. Check if polkit service is running or see debug message for more information.
# systemctl reset-failed \*
Authorization not available. Check if polkit service is running or see debug message for more information.
# mount -v /dev/md1 /boot ; ll /boot/ | wc ; sleep 0.1 ; ll /boot/ | wc
mount: /dev/md1 mounted on /boot.
     35     308    2811
      1       2       8

Unsuccessful. It falls back into the failed state again.

# systemctl start dev-md-1.device
Authorization not available. Check if polkit service is running or see debug message for more information.
(approx. 2 minute wait)
Job for dev-md-1.device timed out.
Exit 1
# journalctl -u boot.mount | tail -4
Jan 12 15:19:19 server systemd[1]: Unit boot.mount is bound to inactive unit dev-md-1.device. Stopping, too.
Jan 12 15:19:19 server systemd[1]: Unmounting /boot...
Jan 12 15:19:19 server systemd[1]: Unmounted /boot.
Jan 12 15:19:19 server systemd[1]: Unit boot.mount entered failed state.

Workaround:

# mount /dev/md1 /boot ; cd /boot ; nohup sleep 99000h < /dev/null >& /dev/null &

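This starts a process with /boot as its CWD, so that systemd cannot unmount /boot because the device is busy. However, systemd then writes more than 18000 lines of error messages to the journal every minute: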

# journalctl -u boot.mount | fgrep 15:28: | awk '{ print $6 " " $7 " " $8 " " $9 " " $10 " " $11 " " $12 " " $13 " " $14 }' | sort | uniq --count
   1054 boot.mount mount process exited, code=exited status=32   
   4502 Failed unmounting /boot.      
   1792 (In some cases useful info about processes that use
   3279 the device is found by lsof(8) or fuser(1)) 
   5277 umount: /boot: target is busy.    
   1053 Unit boot.mount is bound to inactive unit dev-md-1.device. Stopping,
   1053 Unmounting /boot...       

These are the counts of identical lines within just one minute!
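The awk field list above cuts every message off after nine words. An alternative sketch that strips only the timestamp, host and process prefix and keeps the full message text; the function name count_messages is made up, and it assumes the default short journal format shown in this post:

```shell
# count_messages
# Reads journal lines in the default short format on stdin and prints how
# often each distinct message occurs, most frequent first.
count_messages() {
    # Drop "Mon DD HH:MM:SS host process[pid]: " and keep the message text.
    sed -E 's/^[A-Z][a-z]{2} +[0-9]+ [0-9:]+ [^ ]+ [^ ]+: //' |
        sort | uniq -c | sort -rn
}

# Hypothetical usage:
#   journalctl -u boot.mount | fgrep 15:28: | count_messages
```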

Answer 1

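To troubleshoot the problem:

  1. List all unit files of type "mount" on the system:

    systemctl list-unit-files -t mount

  2. Look for something like "boot.mount".

  3. Monitor your unit in the journal:

    journalctl -u boot.mount -f

  4. Try to mount the file system.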
