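I'm trying to get a script working that pushes journal logs from CoreOS to Logentries. To compensate for the fact that an instance launched on AWS cannot connect to the internet right away, I put the command in a while loop.
When I run the script from the command line, the while loop works fine. But when systemd runs the script, it exits immediately if netcat times out, so it never gets a chance to retry.
Is there a way to make systemd less aggressive about exiting the script?
The systemd output never reaches 'sleeping netcat':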
Jul 23 22:26:21 core-01 systemd[1]: Starting Push journal logs to logentries.com...
Jul 23 22:26:21 core-01 systemd[1]: Started Push journal logs to logentries.com.
Jul 23 22:26:21 core-01 bash[880]: trying netcat
Jul 23 22:26:31 core-01 bash[880]: Ncat: Connection timed out.
journal2logentries.sh
#!/usr/bin/env bash
token=logentriestoken
while true
do
echo 'trying netcat'
journalctl -o short -f | awk -v token=$token '{ print token, $0; fflush(); }' | ncat --ssl --ssl-verify data.logentries.com 20000
echo 'sleeping netcat'
sleep 30s
done
logentries.service
[Unit]
Description=Push journal logs to logentries.com
After=systemd-journald.service
After=systemd-networkd.service
[Service]
Restart=always
ExecStart=/bin/bash /home/core/journal2logentries.sh
[Install]
WantedBy=multi-user.target
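Update:
The real problem seems to be that when netcat dies, systemd sees that the /bin/sh process is still running. Note: the URL is intentionally wrong to make testing easier.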
logentries.service - Push journal logs to logentries.com
Loaded: loaded (/etc/systemd/system/logentries.service; disabled)
Active: active (running) since Mon 2014-07-28 17:12:04 UTC; 1min 48s ago
Main PID: 16305 (sh)
CGroup: /system.slice/logentries.service
├─16305 /bin/sh -c journalctl -o short -f | awk -v token=token_here '{ print token, $0; fflush(); }' | ncat --ssl --ssl-verify -vv ogentries.com 20000
├─16306 journalctl -o short -f
└─16307 awk -v token=80b4b3b6-1315-4b76-ac69-f530c1dec47f { print token, $0; fflush(); }
Jul 28 17:12:04 ip-172-31-19-155.us-west-2.compute.internal systemd[1]: logentries.service holdoff time over, scheduling restart.
Jul 28 17:12:04 ip-172-31-19-155.us-west-2.compute.internal systemd[1]: Stopping Push journal logs to logentries.com...
Jul 28 17:12:04 ip-172-31-19-155.us-west-2.compute.internal systemd[1]: Starting Push journal logs to logentries.com...
Jul 28 17:12:04 ip-172-31-19-155.us-west-2.compute.internal systemd[1]: Started Push journal logs to logentries.com.
Jul 28 17:12:04 ip-172-31-19-155.us-west-2.compute.internal sh[16305]: Ncat: Version 6.40 ( http://nmap.org/ncat )
Jul 28 17:12:04 ip-172-31-19-155.us-west-2.compute.internal sh[16305]: Ncat: Could not resolve hostname "ogentries.com": Name or service not known. QUITTING.
Answer 1
Switch from pipes to process substitution.
http://paraf.in/abs-guide/process-sub.html
https://stackoverflow.com/a/18360260/136408
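To illustrate why this switch helps, here is a minimal sketch rather than the real service. It assumes bash, and uses stand-ins: sleep 3 plays the long-running journalctl/awk producer and false plays an ncat that fails right away. With a pipeline, bash waits for every stage to finish; with process substitution, it waits only for the foreground command, so the failure is reported immediately.

```shell
#!/usr/bin/env bash
# Stand-ins (assumptions, not the real commands): 'sleep 3' plays the
# long-running journalctl/awk producer, 'false' plays an ncat that fails at once.

# Pipeline: bash waits for every stage before the pipeline returns.
start=$SECONDS
sleep 3 | false
pipeline_status=$?
pipeline_secs=$((SECONDS - start))

# Process substitution: bash waits only for the foreground command.
start=$SECONDS
false < <(sleep 3)
procsub_status=$?
procsub_secs=$((SECONDS - start))

echo "pipeline:             status=$pipeline_status after ${pipeline_secs}s"
echo "process substitution: status=$procsub_status after ${procsub_secs}s"
```

Both forms report the failing exit status, but the pipeline should only do so after the producer has finished, while the process substitution form should return almost immediately. That quick exit is what lets Restart=always take effect.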
Here is the unit file I came up with:
logentries.service
[Unit]
Description=Push journal logs to logentries.com
After=systemd-journald.service
After=systemd-networkd.service
[Service]
Restart=always
RestartSec=30s
ExecStart=/bin/bash -c "ncat --ssl --ssl-verify data.logentries.com 20000 < <(awk -v token=token_here '{ print token, $0; fflush(); }' < <(journalctl -o short -f))"
[Install]
WantedBy=multi-user.target
Answer 2
Have you tried appending || /bin/true to the command, so that it returns a zero exit status and systemd does not detect a failing exit status?
journalctl -o short -f | awk -v token=$token '{ print token, $0; fflush(); }' | ncat --ssl --ssl-verify data.logentries.com 20000 || /bin/true
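As a rough illustration of what that guard changes, the sketch below compares the exit status of a failing pipeline with and without it. The commands are placeholders: printf and cat stand in for journalctl and awk, false stands in for a failing ncat, and the shell builtin true is used in place of /bin/true.

```shell
#!/usr/bin/env bash
# Placeholders (assumptions): printf/cat stand in for journalctl/awk,
# 'false' stands in for a failing ncat, builtin 'true' replaces /bin/true.

# Without the guard, the pipeline reports the status of its last command.
printf 'line\n' | cat | false
plain_status=$?

# With the guard, '|| true' runs because the pipeline failed, and yields 0.
printf 'line\n' | cat | false || true
guarded_status=$?

echo "without guard: $plain_status"
echo "with guard:    $guarded_status"
```

The guard applies to the pipeline as a whole and only runs once the whole pipeline has finished, so it changes the reported status, not how long the pipeline takes to return.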