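As the title says, I have /var and /var/log on separate partitions.
On shutdown, umount /var/log reports an error, and then umount /var fails.
My question is: how do I debug this?
In case it matters, I am running Debian Stretch.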
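So far I have been searching the web, which turned up a problem with journald logging to /var/log at the last moment. On my system, however, journald logs to /run, which means that something else is keeping /var busy.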
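A quick way to double-check where journald writes is sketched below. It assumes the default Storage=auto behaviour, where the journal is persistent only if /var/log/journal exists; an explicit Storage= line in /etc/systemd/journald.conf would override this. The function name journal_dir and its optional ROOT argument are made up for illustration.

```shell
#!/bin/sh
# journal_dir [ROOT]: print the directory journald would use with Storage=auto.
# ROOT is a hypothetical prefix (empty for the live system), handy for testing.
journal_dir() {
    root=${1:-}
    if [ -d "$root/var/log/journal" ]; then
        # Persistent journal: lives on the /var/log filesystem.
        printf '%s\n' "$root/var/log/journal"
    else
        # Volatile journal: lives on the /run tmpfs.
        printf '%s\n' "$root/run/log/journal"
    fi
}
```

Calling journal_dir with no argument on the affected machine should print a path under /run if journald really is not touching /var/log.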
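1) Ideally, I would stop the shutdown process when the umount error occurs, open a shell, and run lsof, or just put a script there that does the same. However, I do not know enough to do this. How would I go about it?
I have a vague idea that I should write an init.d script that does not require local_fs and link it into rc0 and rc6 as K99, so that it runs at the right time and writes some output to a log file.
Or maybe the rc levels do not give that fine a control, and I should create a script and a systemd unit to run it.
Either way, the problem is that even if I tried this, I would not know whether it ran at the right moment, so I could not tell whether what I see in the log comes from before the error, after it, or the moment it happens.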
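For the systemd route, one possible sketch is a oneshot unit whose ExecStop takes an lsof snapshot. RequiresMountsFor= orders the unit after the /var and /var/log mounts, and since systemd stops units in the reverse of their start order, the ExecStop command runs before those mounts are stopped. That settles the "before or after" question, although the snapshot may be taken well ahead of the actual umount. The unit name, the log path, and the helper write_snapshot_unit are all hypothetical, and /usr/bin/lsof is assumed to be still reachable at that point.

```shell
#!/bin/sh
# write_snapshot_unit DIR: write a hypothetical var-umount-snapshot.service
# into DIR. Nothing is installed or enabled by this function.
write_snapshot_unit() {
    cat > "$1/var-umount-snapshot.service" <<'EOF'
[Unit]
Description=Record open files on /var before it is unmounted (hypothetical)
# Adds Requires= and After= on the mount units for these paths, so at
# shutdown this unit is stopped before the mounts are.
RequiresMountsFor=/var /var/log

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
# "+f --" makes lsof treat the paths as filesystems. The log goes to /root
# so that it survives the unmounting of /var.
ExecStop=/bin/sh -c '/usr/bin/lsof +f -- /var /var/log > /root/var-umount-snapshot.log 2>&1'

[Install]
WantedBy=multi-user.target
EOF
}
```

To try it, one would call write_snapshot_unit /etc/systemd/system, run systemctl daemon-reload, and then systemctl enable --now var-umount-snapshot.service. The unit has to be active for its ExecStop to run at shutdown.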
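2) Alternatively, I could use lsof on a normally running system (rc2) to check what writes to /var/log, then find the start scripts/methods of all those programs and make sure they require /var and /var/log to be mounted.
I would also have to make sure I do not create a shutdown dependency loop.
I would rather find out what the problem is first than blindly overwrite my system configuration.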
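For the inspection on a running system, lsof (or fuser -vm from psmisc) is the usual tool. Where neither is at hand, a rough equivalent can be sketched with /proc alone. The function name holders is made up, the approach is Linux-only, it needs root to see other users' processes, and it does not catch files that are only memory-mapped.

```shell
#!/bin/sh
# holders DIR: print "PID COMMAND" for every process whose current directory,
# root directory, or any open file descriptor points at DIR or below it.
holders() {
    dir=${1%/}
    for p in /proc/[0-9]*; do
        pid=${p#/proc/}
        for link in "$p"/cwd "$p"/root "$p"/fd/*; do
            # Unreadable or vanished entries are skipped silently.
            target=$(readlink "$link" 2>/dev/null) || continue
            case "$target" in
                "$dir"|"$dir"/*)
                    printf '%s %s\n' "$pid" "$(cat "$p/comm" 2>/dev/null)"
                    break   # one line per process is enough
                    ;;
            esac
        done
    done
}
```

Running holders /var/log as root in runlevel 2 gives the list of candidates whose start scripts would need checking.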
A) This is hijacking the question a bit, but maybe there is a setting, possibly in /etc/fstab, that says: "for unmount ordering, treat /var and /var/log the same as /".
Answer 1
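My solution was polling. The script runs as root and logs to /root, which is on the last filesystem to be unmounted. This version of the script honors "stop", but it could just as well keep running.
Judging by the output, the partitions do get unmounted properly, but the exact timing varies slightly relative to the other processes.
So the error/warning messages appear to be harmless.
Here is the script I used for polling; the installation instructions are in its comments. Name the script "oflogger".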
#! /bin/sh
### BEGIN INIT INFO
# Provides: oflogger
# Required-Start:
# Required-Stop:
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: log the open fd-s in selected dirs (/var, /var/log)
# Description: Log the output of lsof and mount, filtered by directory, into
# root's home every 0.1 s.
# The root's home is not the safest place, but we want to log
# as long as we can, so the location must be at the root
# partition. We don't want to litter with this logfile,
# so the ~root seems to be a nice, out-of-way place, which
# will probably also ring a bell when backing up the system.
# NOTE: We want this to be absolutely the last thing to be
# killed (and the first one to be started), so even
# though it obviously needs a filesystem, we do not
# add this requirement.
# This is because the task of this program is exactly
# to identify processes that may obstruct umounting.
### END INIT INFO
# INSTALL:
#cp oflogger /etc/init.d/
#ln -s /etc/init.d/oflogger /etc/rc0.d/K99oflogger
#ln -s /etc/init.d/oflogger /etc/rc1.d/K99oflogger
#ln -s /etc/init.d/oflogger /etc/rc2.d/S01oflogger
#ln -s /etc/init.d/oflogger /etc/rc3.d/S01oflogger
#ln -s /etc/init.d/oflogger /etc/rc4.d/S01oflogger
#ln -s /etc/init.d/oflogger /etc/rc5.d/S01oflogger
#ln -s /etc/init.d/oflogger /etc/rc6.d/K99oflogger
# not adding to rcS.d
#cp /usr/bin/cut /usr/bin/uniq /usr/bin/lsof /usr/bin/sort /root/
# unINSTALL:
#rm /etc/init.d/oflogger
#rm /etc/rc0.d/K99oflogger
#rm /etc/rc1.d/K99oflogger
#rm /etc/rc2.d/S01oflogger
#rm /etc/rc3.d/S01oflogger
#rm /etc/rc4.d/S01oflogger
#rm /etc/rc5.d/S01oflogger
#rm /etc/rc6.d/K99oflogger
#rm /root/cut /root/uniq /root/lsof /root/sort
LSOF='/root/lsof'
GREP='/bin/grep'
CUT='/root/cut'
SORT='/root/sort'
UNIQ='/root/uniq'
test -x "$LSOF" || exit 0
. /lib/lsb/init-functions
# The PID of the background logger is kept in a file, because "start" and
# "stop" run as separate invocations of this script and share no variables.
PIDFILE='/root/oflogger.pid'
case "$1" in
  start)
    echo "===start" >> /root/lsof.log
    # NOTE: Error output from here would end up in the system log,
    #       and since lsof produces an error message every time
    #       it runs, we discard stderr instead.
    # ALTERNATIVE:
    #       filter out just that one offending message, but since
    #       we know the script works, we simply ignore the problem.
    while sleep 0.1
    do
      echo '.../var'
      $LSOF | $GREP '/var' | $CUT -d' ' -f1 | $SORT | $UNIQ
      echo '.../var/log'
      $LSOF | $GREP '/var/log' | $CUT -d' ' -f1 | $SORT | $UNIQ
      echo '...mount'
      mount | $GREP '\<var\>'
      echo '---------------------'
    done 2>/dev/null 1>>/root/lsof.log &
    # Remember the PID of the polling loop for a later "stop".
    echo "$!" > "$PIDFILE"
    ;;
  restart)
    echo "===restart" >> /root/lsof.log
    ;;
  force-reload)
    echo "===force-reload" >> /root/lsof.log
    ;;
  reload)
    echo "===reload" >> /root/lsof.log
    ;;
  stop)
    echo "===stop" >> /root/lsof.log
    if [ -r "$PIDFILE" ]
    then
      kill "$(cat "$PIDFILE")" 2>/dev/null
      rm -f "$PIDFILE"
    fi
    ;;
  status)
    echo "===status" >> /root/lsof.log
    ;;
  *)
    echo "===*" >> /root/lsof.log
    ;;
esac
exit 0
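Reading the resulting /root/lsof.log by hand gets tedious at ten records per second, so here is a small sketch for pulling out the interesting part. It assumes the record layout written by the script above (the '.../var', '.../var/log' and '...mount' markers, with a dashed line closing each record). The function name last_holders is made up.

```shell
#!/bin/sh
# last_holders LOGFILE: print the command names listed under ".../var" in the
# last record whose "...mount" section still showed a var mount.
last_holders() {
    awk '
        /^\.\.\.\/var$/      { section = "var";    next }
        /^\.\.\.\/var\/log$/ { section = "varlog"; next }
        /^\.\.\.mount$/      { section = "mount";  next }
        /^===/               { section = "";       next }
        /^-+$/ {
            # End of a record: keep its names only if something was mounted.
            if (mounted) last = names
            names = ""; mounted = 0; section = ""
            next
        }
        section == "var"   { names = names $0 "\n" }
        section == "mount" { mounted = 1 }
        END { printf "%s", last }
    ' "$1"
}
```

An empty result from last_holders /root/lsof.log means that nothing had files open under /var in the final record taken while it was still mounted, which is consistent with the messages being harmless.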