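I run Kubuntu 22.04 on two desktops. Some user home directories are mounted via NFSv3 (default fstab entries). A few months ago I started having problems during shutdown and reboot: when the NFS share is unmounted, the process hangs for almost exactly one minute ("A stop job is running..."). Oddly, I can unmount the NFS share from a user session before shutting down without any delay.
I tried to debug this, which was a challenge in itself for me, and eventually found that the hang seems to occur in the umount2() call itself. See the strace excerpt; umount2() returns 0, but the next call is logged 61 seconds later: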
16:22:29 umount2("/mnt/homes", 0) = 0
16:23:30 newfstatat(AT_FDCWD, "/run/mount/utab", {st_mode=S_IFREG|0644, st_size=1014, ...}, 0) = 0
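This looks like some odd systemd job dependency, because inserting the following service into the boot sequence [Edit 5] avoids the hang most of the time [/Edit 5], at least on reboot, but not on shutdown: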
[Unit]
Description=umount NFS shares first thing at shutdown
After=multi-user.target
Requires=remote-fs.target
Requires=network-online.target
[Service]
Type=idle
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/usr/bin/umount -a -t nfs
[Install]
WantedBy=shutdown.target
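[Edit] I should mention that the hang only occurs if the mounted file system has actually been accessed. Merely mounting it without any access does not cause the hang.
Output of systemctl show mnt-homes.mount: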
Where=/mnt/homes
What=storage:/nfs/home
Options=rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.2.10,mountvers=3,mountport=856,mountproto=udp,local_lock=none,addr=192.168.2.10
Type=nfs
TimeoutUSec=1min 30s
ControlPID=0
DirectoryMode=0755
SloppyOptions=no
LazyUnmount=no
ForceUnmount=no
ReadWriteOnly=no
Result=success
UID=[not set]
GID=[not set]
Slice=system.slice
ControlGroup=/system.slice/mnt-homes.mount
MemoryCurrent=163840
MemoryAvailable=infinity
CPUUsageNSec=9059000
TasksCurrent=0
IPIngressBytes=[no data]
IPIngressPackets=[no data]
IPEgressBytes=[no data]
IPEgressPackets=[no data]
IOReadBytes=18446744073709551615
IOReadOperations=18446744073709551615
IOWriteBytes=18446744073709551615
IOWriteOperations=18446744073709551615
Delegate=no
CPUAccounting=yes
CPUWeight=[not set]
StartupCPUWeight=[not set]
CPUShares=[not set]
StartupCPUShares=[not set]
CPUQuotaPerSecUSec=infinity
CPUQuotaPeriodUSec=infinity
IOAccounting=no
IOWeight=[not set]
StartupIOWeight=[not set]
BlockIOAccounting=no
BlockIOWeight=[not set]
StartupBlockIOWeight=[not set]
MemoryAccounting=yes
DefaultMemoryLow=0
DefaultMemoryMin=0
MemoryMin=0
MemoryLow=0
MemoryHigh=infinity
MemoryMax=infinity
MemorySwapMax=infinity
MemoryLimit=infinity
DevicePolicy=auto
TasksAccounting=yes
TasksMax=17999
IPAccounting=no
ManagedOOMSwap=auto
ManagedOOMMemoryPressure=auto
ManagedOOMMemoryPressureLimit=0
ManagedOOMPreference=none
UMask=0022
LimitCPU=infinity
LimitCPUSoft=infinity
LimitFSIZE=infinity
LimitFSIZESoft=infinity
LimitDATA=infinity
LimitDATASoft=infinity
LimitSTACK=infinity
LimitSTACKSoft=8388608
LimitCORE=infinity
LimitCORESoft=0
LimitRSS=infinity
LimitRSSSoft=infinity
LimitNOFILE=524288
LimitNOFILESoft=1024
LimitAS=infinity
LimitASSoft=infinity
LimitNPROC=59998
LimitNPROCSoft=59998
LimitMEMLOCK=65536
LimitMEMLOCKSoft=65536
LimitLOCKS=infinity
LimitLOCKSSoft=infinity
LimitSIGPENDING=59998
LimitSIGPENDINGSoft=59998
LimitMSGQUEUE=819200
LimitMSGQUEUESoft=819200
LimitNICE=0
LimitNICESoft=0
LimitRTPRIO=0
LimitRTPRIOSoft=0
LimitRTTIME=infinity
LimitRTTIMESoft=infinity
OOMScoreAdjust=0
CoredumpFilter=0x33
Nice=0
IOSchedulingClass=2
IOSchedulingPriority=4
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
CPUAffinityFromNUMA=no
NUMAPolicy=n/a
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SyslogLevel=6
SyslogFacility=3
LogLevelMax=-1
LogRateLimitIntervalUSec=0
LogRateLimitBurst=0
SecureBits=0
CapabilityBoundingSet=cap_chown cap_dac_override cap_dac_read_search cap_fowner cap_fsetid cap_kill cap_setgid cap_setuid cap_setpcap cap_linux_immutable cap_net_bind_service cap_net_broadcast cap_net_admin cap_net_raw cap_ipc_lock cap_ipc_owner cap_sys_module cap_sys_rawio cap_sys_chroot cap_sys_ptrace cap_sys_pacct cap_sys_admin cap_sys_boot cap_sys_nice cap_sys_resource cap_sys_time cap_sys_tty_config cap_mknod cap_lease cap_audit_write cap_audit_control cap_setfcap cap_mac_override cap_mac_admin cap_syslog cap_wake_alarm cap_block_suspend cap_audit_read cap_perfmon cap_bpf cap_checkpoint_restore
DynamicUser=no
RemoveIPC=no
PrivateTmp=no
PrivateDevices=no
ProtectClock=no
ProtectKernelTunables=no
ProtectKernelModules=no
ProtectKernelLogs=no
ProtectControlGroups=no
PrivateNetwork=no
PrivateUsers=no
PrivateMounts=no
PrivateIPC=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=yes
UtmpMode=init
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=2147483646
LockPersonality=no
RuntimeDirectoryPreserve=no
RuntimeDirectoryMode=0755
StateDirectoryMode=0755
CacheDirectoryMode=0755
LogsDirectoryMode=0755
ConfigurationDirectoryMode=0755
TimeoutCleanUSec=infinity
MemoryDenyWriteExecute=no
RestrictRealtime=no
RestrictSUIDSGID=no
RestrictNamespaces=no
MountAPIVFS=no
KeyringMode=shared
ProtectProc=default
ProcSubset=all
ProtectHostname=no
KillMode=control-group
KillSignal=15
RestartKillSignal=15
FinalKillSignal=9
SendSIGKILL=yes
SendSIGHUP=no
WatchdogSignal=6
Id=mnt-homes.mount
Names=mnt-homes.mount
Requires=system.slice -.mount
Wants=network-online.target
RequiredBy=remote-fs.target
Conflicts=umount.target
Before=umount.target remote-fs.target
After=remote-fs-pre.target network.target network-online.target systemd-journald.socket -.mount system.slice
RequiresMountsFor=/mnt
Documentation="man:fstab(5)" "man:systemd-fstab-generator(8)"
Description=/mnt/homes
LoadState=loaded
ActiveState=active
FreezerState=running
SubState=mounted
FragmentPath=/run/systemd/generator/mnt-homes.mount
SourcePath=/etc/fstab
UnitFileState=generated
UnitFilePreset=enabled
StateChangeTimestamp=Wed 2024-02-21 11:40:25 CET
StateChangeTimestampMonotonic=10080675
InactiveExitTimestamp=Wed 2024-02-21 11:40:24 CET
InactiveExitTimestampMonotonic=9881776
ActiveEnterTimestamp=Wed 2024-02-21 11:40:25 CET
ActiveEnterTimestampMonotonic=10080675
ActiveExitTimestamp=n/a
ActiveExitTimestampMonotonic=0
InactiveEnterTimestamp=n/a
InactiveEnterTimestampMonotonic=0
CanStart=yes
CanStop=yes
CanReload=yes
CanIsolate=no
CanFreeze=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnSuccessJobMode=fail
OnFailureJobMode=replace
IgnoreOnIsolate=yes
NeedDaemonReload=no
JobTimeoutUSec=infinity
JobRunningTimeoutUSec=infinity
JobTimeoutAction=none
ConditionResult=yes
AssertResult=yes
ConditionTimestamp=Wed 2024-02-21 11:40:24 CET
ConditionTimestampMonotonic=9880739
AssertTimestamp=Wed 2024-02-21 11:40:24 CET
AssertTimestampMonotonic=9880739
Transient=no
Perpetual=no
StartLimitIntervalUSec=10s
StartLimitBurst=5
StartLimitAction=none
FailureAction=none
SuccessAction=none
InvocationID=2ac4fcae15e44482b6b9b47d7e72e006
CollectMode=inactive
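Most of the properties in this dump are defaults that have nothing to do with unmounting. As a rough sketch, the few that matter for this problem can be filtered out of the output. The helper name `mount_props` is hypothetical, and the selection of properties is only my guess at what is relevant here.

```shell
# Keep only the mount-related properties from "systemctl show" output on stdin.
# The property list is an assumption about what is relevant to the hang.
mount_props() {
  grep -E '^(What|Where|Type|Options|TimeoutUSec|LazyUnmount|ForceUnmount)='
}

# Intended use on the affected machine:
#   systemctl show mnt-homes.mount | mount_props
```

The same selection can also be requested directly with repeated `-p` options, for example `systemctl show -p TimeoutUSec -p LazyUnmount -p ForceUnmount mnt-homes.mount`.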
[Edit 2] According to systemd, the network is still up while the share is being unmounted. See this excerpt from the systemd debug output:
[ 251.503803] systemd[1]: mnt-homes.mount: Changed mounted -> unmounting
[ 251.503808] systemd[1]: Unmounting /mnt/homes...
[ 251.503892] systemd-journald[476]: Successfully sent stream file descriptor to service manager.
[ 251.503940] systemd[3628]: mnt-homes.mount: Executing: /bin/umount /mnt/homes -c
...
[ 313.206159] systemd[1]: Received SIGCHLD from PID 3628 (umount).
[ 313.206172] systemd[1]: Child 3628 (umount) died (code=exited, status=0/SUCCESS)
[ 313.206197] systemd[1]: mnt-homes.mount: Child 3628 belongs to mnt-homes.mount.
[ 313.206203] systemd[1]: mnt-homes.mount: Mount process exited, code=exited, status=0/SUCCESS (success)
[ 313.206206] systemd[1]: mnt-homes.mount: Deactivated successfully.
[ 313.206232] systemd[1]: mnt-homes.mount: Changed unmounting -> dead
[ 313.206344] systemd[1]: mnt-homes.mount: Job 1748 mnt-homes.mount/stop finished, result=done
[ 313.206347] systemd[1]: Unmounted /mnt/homes.
[ 313.206376] systemd[1]: mnt-homes.mount: Consumed 18ms CPU time.
[ 313.206459] systemd[1]: systemd-journald.service: Received EPOLLHUP on stored fd 22 (stored), closing.
[ 313.206475] systemd[1]: network-online.target changed active -> dead
[ 313.206477] systemd[1]: network-online.target: Job 1843 network-online.target/stop finished, result=done
[ 313.206479] systemd[1]: Stopped target Network is Online.
[ 313.206494] systemd[1]: network.target changed active -> dead
[ 313.206496] systemd[1]: network.target: Job 1769 network.target/stop finished, result=done
[ 313.206498] systemd[1]: Stopped target Network.
[ 313.206509] systemd[1]: network-pre.target: stopping held back, waiting for: ifup@eno1.service
[ 313.206512] systemd[1]: remote-fs-pre.target changed active -> dead
[ 313.206514] systemd[1]: remote-fs-pre.target: Job 1886 remote-fs-pre.target/stop finished, result=done
[ 313.206515] systemd[1]: Stopped target Preparation for Remote File Systems.
[ 313.206535] systemd[1]: nfs-client.target changed active -> dead
[ 313.206537] systemd[1]: nfs-client.target: Job 1768 nfs-client.target/stop finished, result=done
[ 313.206539] systemd[1]: Stopped target NFS client services.
[ 313.206549] systemd[1]: shutdown.target: starting held back, waiting for: snapd.mounts.target
[ 313.206551] systemd[1]: umount.target: starting held back, waiting for: run-snapd-ns.mount
[ 313.206765] systemd[1]: ifup@eno1.service: About to execute /sbin/ifdown eno1
[ 313.206956] systemd[1]: ifup@eno1.service: Forked /sbin/ifdown as 3649
[ 313.207014] systemd[1]: ifup@eno1.service: Changed exited -> stop
[ 313.207018] systemd[1]: Stopping ifup for eno1...
[ 313.207315] systemd[3649]: ifup@eno1.service: Executing: /sbin/ifdown eno1
[ 313.207315] systemd[1]: Failed to read pids.max attribute of cgroup root, ignoring: No data available
[ 313.207381] systemd[1]: networking.service: About to execute /sbin/ifdown -a --read-environment --exclude=lo
[ 313.207501] systemd[1]: networking.service: Forked /sbin/ifdown as 3650
[ 313.207557] systemd[1]: networking.service: Changed exited -> stop
[ 313.207562] systemd[1]: Stopping Raise network interfaces...
...
[ 313.207806] systemd[3650]: networking.service: Executing: /sbin/ifdown -a --read-environment --exclude=lo
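To put a number on the hang, the time between systemd executing umount and receiving its SIGCHLD can be read off the log timestamps. The following is a minimal sketch that assumes the log format shown in the excerpt above; the function name `umount_duration` is hypothetical.

```shell
# Reads systemd debug log lines on stdin and prints the seconds between the
# "Executing: /bin/umount" line and the next SIGCHLD line for umount.
umount_duration() {
  awk '
    # Extract the number from a leading "[ 251.503940]" style timestamp.
    function stamp(line,   t) {
      t = line
      sub(/^\[ */, "", t)
      sub(/\].*/, "", t)
      return t + 0
    }
    /Executing: \/bin\/umount/ { start = stamp($0); seen = 1 }
    seen && /Received SIGCHLD from PID [0-9]+ \(umount\)/ {
      printf "%.3f\n", stamp($0) - start
      exit
    }
  '
}
```

Fed the two relevant lines from the excerpt (timestamps 251.503940 and 313.206159), this prints 61.702, which matches the roughly one-minute stall seen in the strace output and is still below the unit's TimeoutUSec of 1min 30s.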
[Edit 3] This is a wired network, and the link is unchanged before and after the umount. ip a shows the following in both cases (only valid_lft and preferred_lft differ):
1: lo: ...
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether ...
altname enp6s0
inet 192.168.2.117/24 brd 192.168.2.255 scope global dynamic eno1
valid_lft 851799sec preferred_lft 851799sec
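[Edit 4] Forcing the unmount with -f or ForceUnmount=yes does not help.
Any ideas on how to fix this or how to analyze it further?
Answer 1
Not a real remedy, but it at least mitigates the symptom: put a drop-in at /etc/systemd/system/mnt-homes.mount.d/override.conf containing: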
[Mount]
TimeoutSec=15
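For completeness, here is one way the drop-in could be created and activated. This is only a sketch: the helper name `write_mount_timeout_dropin` is hypothetical, and the commented commands assume a root shell on the affected machine.

```shell
# Write a [Mount] drop-in that caps the mount unit's timeout.
# $1: drop-in directory, $2: timeout in seconds
write_mount_timeout_dropin() {
  dropin_dir=$1
  timeout_sec=$2
  mkdir -p "$dropin_dir" || return 1
  printf '[Mount]\nTimeoutSec=%s\n' "$timeout_sec" > "$dropin_dir/override.conf"
}

# On the real system, as root, followed by a reload so systemd re-reads units:
#   write_mount_timeout_dropin /etc/systemd/system/mnt-homes.mount.d 15
#   systemctl daemon-reload
#   systemctl show -p TimeoutUSec mnt-homes.mount
```

The last command is there to check that the new timeout has replaced the 1min 30s shown in the question. Drop-ins under /etc/systemd/system also apply to units generated from /etc/fstab, so the fstab entry itself does not need to change.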