Question
How can a systemd-nspawn container be managed as a systemd unit?
Setup
Install an image
# machinectl pull-raw --verify=no https://ftp.halifax.rwth-aachen.de/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.raw.xz
Find the image name
# machinectl list-images
NAME TYPE RO USAGE CREATED MODIF
Fedora-Cloud-Base-30-1.2.x86_64 raw no 891.6M Fri 2019-04-26 02:14:49 UTC Fri 2
1 images listed.
Start an interactive shell inside the Fedora container
# systemd-nspawn -M Fedora-Cloud-Base-30-1.2.x86_64
Write /root/app.py, a python3 application that handles signals
# https://stackabuse.com/handling-unix-signals-in-python/
import signal
import os
import time
import sys
def terminateProcess(signalNumber, frame):
    print(f'received signal {signalNumber}')
    print('exiting...')
    sys.exit()
def receiveSignal(signalNumber, frame):
    print(f'received signal {signalNumber}')
    return
if __name__ == '__main__':
    # register the signals to be caught
    signal.signal(signal.SIGHUP, receiveSignal)
    signal.signal(signal.SIGINT, terminateProcess)
    signal.signal(signal.SIGQUIT, receiveSignal)
    signal.signal(signal.SIGILL, receiveSignal)
    signal.signal(signal.SIGTRAP, receiveSignal)
    signal.signal(signal.SIGABRT, receiveSignal)
    signal.signal(signal.SIGBUS, receiveSignal)
    signal.signal(signal.SIGFPE, receiveSignal)
    #signal.signal(signal.SIGKILL, receiveSignal)
    signal.signal(signal.SIGUSR1, receiveSignal)
    signal.signal(signal.SIGSEGV, receiveSignal)
    signal.signal(signal.SIGUSR2, receiveSignal)
    signal.signal(signal.SIGPIPE, receiveSignal)
    signal.signal(signal.SIGALRM, receiveSignal)
    signal.signal(signal.SIGTERM, terminateProcess)
    # output current process id
    print(f'pid {os.getpid()}')
    # wait in an endless loop for signals
    while True:
        time.sleep(1)
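Before putting app.py into the container, it can help to confirm locally that a Python handler really fires for a given signal. The following is only a minimal sketch, not part of app.py: the function name deliver_to_self is made up for illustration, and it assumes a Linux host and a single-threaded Python 3 process. It installs a recording handler and sends each signal to its own process.

```python
import os
import signal


def deliver_to_self(signals):
    """Install a recording handler for each signal, send the signal to this
    process, and return the names of the signals the handler saw."""
    received = []

    def record_signal(signal_number, frame):
        # Same handler signature as in app.py: (signal number, stack frame).
        received.append(signal.Signals(signal_number).name)

    for sig in signals:
        signal.signal(sig, record_signal)
        os.kill(os.getpid(), sig)
    return received


if __name__ == '__main__':
    print(deliver_to_self([signal.SIGHUP, signal.SIGUSR1, signal.SIGTERM]))
```

The handlers stay installed after the call, which is fine for a throwaway check but would need restoring in a longer-lived program.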
Exit the container with the key combination
Control + ]]]
App service, attempt 1
Write the /etc/systemd/system/app.service unit file
[Service]
ExecStart=/usr/bin/systemd-nspawn --keep-unit -M Fedora-Cloud-Base-30-1.2.x86_64 python3 -u /root/app.py
SyslogIdentifier=%N
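- The --keep-unit switch keeps systemd-nspawn and app.py in the system.slice/app.service cgroup
- The -u switch passed to python3 makes its output unbuffered
- SyslogIdentifier uses the %N specifier, which expands to the string "app", the unit name without its suffix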
Reload the systemd daemon
# systemctl daemon-reload
In another terminal, follow the log output from systemd-journald continuously
# journalctl -f -u app.service
Start the app.service unit
# systemctl start app.service
Stop the app.service unit
# systemctl stop app.service
Observe the logs
-- Logs begin at Fri 2019-08-23 16:58:11 UTC. --
Aug 23 17:26:42 srv0 systemd[1]: Started app.service.
Aug 23 17:26:42 srv0 app[12745]: Spawning container Fedora-Cloud-Base-30-1.2.x86_64 on /var/lib/machines/Fedora-Cloud-Base-30-1.2.x86_64.raw.
Aug 23 17:26:42 srv0 app[12745]: Press ^] three times within 1s to kill container.
Aug 23 17:26:42 srv0 app[12745]: Failed to create directory /tmp/nspawn-root-afZQoJ/sys/fs/selinux: Read-only file system
Aug 23 17:26:42 srv0 app[12745]: Failed to create directory /tmp/nspawn-root-afZQoJ/sys/fs/selinux: Read-only file system
Aug 23 17:26:42 srv0 app[12745]: pid 1
Aug 23 17:26:54 srv0 systemd[1]: Stopping app.service...
Aug 23 17:26:54 srv0 app[12745]: Container Fedora-Cloud-Base-30-1.2.x86_64 terminated by signal KILL.
Aug 23 17:26:54 srv0 systemd[1]: app.service: Main process exited, code=exited, status=1/FAILURE
Aug 23 17:26:54 srv0 systemd[1]: Stopped app.service.
Aug 23 17:26:54 srv0 systemd[1]: app.service: Unit entered failed state.
Aug 23 17:26:54 srv0 systemd[1]: app.service: Failed with result 'exit-code'.
Aug 23 17:26:54 srv0 systemd[1]: Stopped app.service.
systemd-nspawn terminated the container with SIGKILL rather than SIGTERM.
Aug 23 17:26:54 srv0 app[12745]: Container Fedora-Cloud-Base-30-1.2.x86_64 terminated by signal KILL.
See the "terminated by signal KILL" line. I don't want to SIGKILL app.py, I want to SIGTERM it.
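Why this matters for app.py: SIGKILL cannot be caught (which is why its registration is commented out in app.py), so the terminateProcess handler never gets a chance to run. The sketch below is a hedged illustration outside any container, assuming a Linux host with python3; the child script and the stop_child helper are made up for this example. It starts a small child with a SIGTERM handler and compares what happens under each signal.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for app.py: handles SIGTERM, then idles.
CHILD = r'''
import signal, sys, time
def terminate(signal_number, frame):
    print(f'received signal {signal_number}', flush=True)
    sys.exit(0)
signal.signal(signal.SIGTERM, terminate)
print('ready', flush=True)
while True:
    time.sleep(1)
'''


def stop_child(sig):
    """Start the child, wait until its handler is installed, send sig,
    and return (exit status, output printed after 'ready')."""
    child = subprocess.Popen([sys.executable, '-u', '-c', CHILD],
                             stdout=subprocess.PIPE, text=True)
    assert child.stdout.readline().strip() == 'ready'
    child.send_signal(sig)
    output, _ = child.communicate(timeout=10)
    return child.returncode, output.strip()


if __name__ == '__main__':
    print(stop_child(signal.SIGTERM))  # handler runs, child exits with status 0
    print(stop_child(signal.SIGKILL))  # no handler output, negative exit status
```

With SIGTERM the child prints its message and exits cleanly; with SIGKILL subprocess reports a negative return code and nothing is printed, which mirrors the failed unit state in the log above.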
App service, attempt 2
Read the GitHub issue
https://github.com/systemd/systemd/issues/7105#issuecomment-467491778
Use the -a/--as-pid2 switch
[Service]
ExecStart=/usr/bin/systemd-nspawn --keep-unit --as-pid2 -M Fedora-Cloud-Base-30-1.2.x86_64 python3 -u /root/app.py
SyslogIdentifier=%N
Reload the daemon, start the unit, then stop it
Observe the logs
Aug 23 17:29:59 srv0 systemd[1]: Started app.service.
Aug 23 17:29:59 srv0 app[12841]: Spawning container Fedora-Cloud-Base-30-1.2.x86_64 on /var/lib/machines/Fedora-Cloud-Base-30-1.2.x86_64.raw.
Aug 23 17:29:59 srv0 app[12841]: Press ^] three times within 1s to kill container.
Aug 23 17:29:59 srv0 app[12841]: Failed to create directory /tmp/nspawn-root-jaGbcx/sys/fs/selinux: Read-only file system
Aug 23 17:29:59 srv0 app[12841]: Failed to create directory /tmp/nspawn-root-jaGbcx/sys/fs/selinux: Read-only file system
Aug 23 17:29:59 srv0 app[12841]: pid 2
Aug 23 17:30:06 srv0 systemd[1]: Stopping app.service...
Aug 23 17:30:06 srv0 app[12841]: Container Fedora-Cloud-Base-30-1.2.x86_64 terminated by signal KILL.
Aug 23 17:30:06 srv0 systemd[1]: app.service: Main process exited, code=exited, status=1/FAILURE
Aug 23 17:30:06 srv0 systemd[1]: Stopped app.service.
Aug 23 17:30:06 srv0 systemd[1]: app.service: Unit entered failed state.
Aug 23 17:30:06 srv0 systemd[1]: app.service: Failed with result 'exit-code'.
app.py now runs as pid 2! But it still gets SIGKILL rather than SIGTERM.
Answer 1
Read more of the GitHub issue
https://github.com/systemd/systemd/issues/7105#issuecomment-467491778
Use the --kill-signal flag
[Service]
ExecStart=/usr/bin/systemd-nspawn --keep-unit --kill-signal=SIGTERM -M Fedora-Cloud-Base-30-1.2.x86_64 python3 -u /root/app.py
SyslogIdentifier=%N
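The three unit files in this post differ only in the switches passed to systemd-nspawn. As a rough sketch (the helper names nspawn_exec_start and render_unit are hypothetical, and arguments are joined with plain spaces, so anything containing whitespace would need proper systemd quoting), the variants can be generated like this:

```python
MACHINE = 'Fedora-Cloud-Base-30-1.2.x86_64'
COMMAND = ['python3', '-u', '/root/app.py']


def nspawn_exec_start(machine, command, as_pid2=False, kill_signal=None):
    """Build the systemd-nspawn command line used in ExecStart=.
    Only switches that appear in this post are supported."""
    argv = ['/usr/bin/systemd-nspawn', '--keep-unit']
    if as_pid2:
        argv.append('--as-pid2')
    if kill_signal:
        argv.append(f'--kill-signal={kill_signal}')
    argv += ['-M', machine] + list(command)
    return ' '.join(argv)


def render_unit(exec_start):
    """Render the minimal [Service] section used throughout this post."""
    return f'[Service]\nExecStart={exec_start}\nSyslogIdentifier=%N\n'


if __name__ == '__main__':
    # Attempt 1, attempt 2, and the working answer, in that order.
    print(render_unit(nspawn_exec_start(MACHINE, COMMAND)))
    print(render_unit(nspawn_exec_start(MACHINE, COMMAND, as_pid2=True)))
    print(render_unit(nspawn_exec_start(MACHINE, COMMAND, kill_signal='SIGTERM')))
```

Writing the result to /etc/systemd/system/app.service and running systemctl daemon-reload is left out on purpose, since that part needs root.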
Reload systemd, start app.service, and stop app.service
# systemctl daemon-reload
# systemctl start app.service
# systemctl stop app.service
Observe the logs
Aug 23 17:51:32 srv0 systemd[1]: Started app.service.
Aug 23 17:51:32 srv0 app[12994]: Spawning container Fedora-Cloud-Base-30-1.2.x86_64 on /var/lib/machines/Fedora-Cloud-Base-30-1.2.x86_64.raw.
Aug 23 17:51:32 srv0 app[12994]: Press ^] three times within 1s to kill container.
Aug 23 17:51:32 srv0 app[12994]: Failed to create directory /tmp/nspawn-root-71uVxm/sys/fs/selinux: Read-only file system
Aug 23 17:51:32 srv0 app[12994]: Failed to create directory /tmp/nspawn-root-71uVxm/sys/fs/selinux: Read-only file system
Aug 23 17:51:32 srv0 app[12994]: pid 1
Aug 23 17:51:35 srv0 app[12994]: Trying to halt container. Send SIGTERM again to trigger immediate termination.
Aug 23 17:51:35 srv0 app[12994]: received signal 15
Aug 23 17:51:35 srv0 app[12994]: exiting...
Aug 23 17:51:35 srv0 systemd[1]: Stopping app.service...
Aug 23 17:51:35 srv0 app[12994]: Container Fedora-Cloud-Base-30-1.2.x86_64 exited successfully.
Aug 23 17:51:35 srv0 systemd[1]: Stopped app.service.
Aug 23 17:51:35 srv0 systemd[1]: Stopped app.service.
See how SIGTERM is delivered to app.py!
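To compare runs without reading the journal by eye, the final "Container ..." line can be picked out with a small parser. This is only a sketch: the regular expression is based solely on the two messages seen in the logs in this post ("terminated by signal KILL." and "exited successfully."), the function name container_outcome is made up, and other systemd versions may word these messages differently.

```python
import re

# Matches the two container end messages that appear in the logs above.
STOP_PATTERN = re.compile(
    r'Container (?P<machine>\S+) '
    r'(?:terminated by signal (?P<signal>\w+)|exited successfully)\.'
)


def container_outcome(journal_lines):
    """Return the killing signal name (e.g. 'KILL'), 'clean exit' when the
    container exited successfully, or None if no end message is found."""
    for line in journal_lines:
        match = STOP_PATTERN.search(line)
        if match:
            return match.group('signal') or 'clean exit'
    return None


if __name__ == '__main__':
    attempt_1 = ['Aug 23 17:26:54 srv0 app[12745]: Container '
                 'Fedora-Cloud-Base-30-1.2.x86_64 terminated by signal KILL.']
    answer_1 = ['Aug 23 17:51:35 srv0 app[12994]: Container '
                'Fedora-Cloud-Base-30-1.2.x86_64 exited successfully.']
    print(container_outcome(attempt_1))
    print(container_outcome(answer_1))
```

The lines could come from something like the output of journalctl -u app.service split on newlines.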