在 Ubuntu 16.04.1 上运行:
rsyslogd 8.16.0, compiled with:
PLATFORM: x86_64-pc-linux-gnu
PLATFORM (lsb_release -d):
FEATURE_REGEXP: Yes
GSSAPI Kerberos 5 support: Yes
FEATURE_DEBUG (debug build, slow code): No
32bit Atomic operations supported: Yes
64bit Atomic operations supported: Yes
memory allocator: system default
Runtime Instrumentation (slow code): No
uuid support: Yes
Number of Bits in RainerScript integers: 64
和:
#rsyslogd -N1
rsyslogd: version 8.16.0, config validation run (level 1), master config /etc/rsyslog.conf
rsyslogd: command 'KLogPermitNonKernelFacility' is currently not permitted - did you already set it via a RainerScript command (v6+ config)? [v8.16.0 try http://www.rsyslog.com/e/2222 ]
如果以下配置生效,服务停止时会挂起:
$ModLoad imfile
##########
# MySQL slow query log (standard log is autosent to syslog):
$InputFileName /var/log/mysql/mysql-slow.log
$InputFileTag mysqld_slow:
$InputFileStateFile state-mysqld_slow
$InputFileSeverity info
$InputRunFileMonitor
##########
# Apache access:
###
$InputFileName /var/log/virtualmin/mysite.com_access_log
$InputFileTag www-access-mysite.com:
$InputFileStateFile state-www-access-mysite.com
$InputFileSeverity info
$InputRunFileMonitor
##########
# Apache Error:
###
$InputFileName /var/log/virtualmin/mysite.com_error_log
$InputFileTag www-errors-mysite.org:
$InputFileStateFile state-www-errors-mysite.com
$InputFileSeverity error
$InputRunFileMonitor
### Main Error Log
$InputFileName /var/log/apache2/error.log
$InputFileTag www-errors-main:
$InputFileStateFile state-www-errors
$InputFileSeverity error
$InputRunFileMonitor
# Setup disk assisted queues
$ActionQueueFileName fwdRule1 # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
$ActionQueueType LinkedList # run asynchronously
$ActionResumeRetryCount -1 # infinite retries if host is down
#RsyslogGnuTLS
$DefaultNetstreamDriverCAFile /etc/rsyslog.d/keys/ca.d/certs.crt
###
###First BLOCK
###
template(name="access" type="string" string="<%pri%>%protocol-version% %timestamp:::date-rfc3339% Host1 %app-name% %procid% %msgid% [id@40000 tag=\"access\"] %msg%\n")
template(name="errors" type="string" string="<%pri%>%protocol-version% %timestamp:::date-rfc3339% Host1 %app-name% %procid% %msgid% [id@40000 tag=\"errors\"] %msg%\n")
# Send messages over TCP using the template.
:syslogtag, startswith, "www-access" action(type="omfwd" protocol="tcp" target="logs.com" port="6000" template="access" StreamDriver="gtls" StreamDriverMode="1" StreamDriverAuthMode="x509/name" StreamDriverPermittedPeers="*.logs.com")
:syslogtag, startswith, "www-error" action(type="omfwd" protocol="tcp" target="logs.com" port="6000" template="errors" StreamDriver="gtls" StreamDriverMode="1" StreamDriverAuthMode="x509/name" StreamDriverPermittedPeers="*.logs.com")
###
###Second BLOCK
###
template(name="standard" type="string" string="<%pri%>%protocol-version% %timestamp:::date-rfc3339% Host1 %app-name% %procid% %msgid% %msg%\n")
# Send messages over TCP using the template.
:syslogtag, startswith, "www-access" action(type="omfwd" protocol="tcp" target="logs.other.com" port="38549" template="standard" StreamDriver="gtls" StreamDriverMode="1" StreamDriverAuthMode="x509/name" StreamDriverPermittedPeers="*.other.com")
:syslogtag, startswith, "www-errors" action(type="omfwd" protocol="tcp" target="logs.other.com" port="38549" template="standard" StreamDriver="gtls" StreamDriverMode="1" StreamDriverAuthMode="x509/name" StreamDriverPermittedPeers="*.other.com")
:syslogtag, startswith, "mysqld" action(type="omfwd" protocol="tcp" target="logs.other.com" port="38549" template="standard" StreamDriver="gtls" StreamDriverMode="1" StreamDriverAuthMode="x509/name" StreamDriverPermittedPeers="*.other.com")
如果通过运行rsyslogd -dn
你可以看到:
9744.300605710:main thread : main Q: EnqueueMsg advised worker start
9744.300724026:main thread : Terminating main queue... 9744.300734161:main thread : main Q: initiating worker thread shutdown sequence
9744.300739666:main thread : main Q: trying shutdown of regular workers
9744.300766918:main thread : sent SIGTTIN to worker thread 0x9b57a700
9744.300776037:main Q:Reg/w0 : wti 0x16756a0: worker awoke from idle processing
9744.300782296:main Q:Reg/w0 : DeleteProcessedBatch: we deleted 0 objects and enqueued 0 objects
9744.300786960:main Q:Reg/w0 : doDeleteBatch: delete batch from store, new sizes: log 0, phys 0
9744.300791987:main Q:Reg/w0 : regular consumer finished, iret=4, szlog 0 sz phys 0
9744.300797344:main Q:Reg/w0 : wti 0x16756a0: terminating worker terminateRet=5, bInactivityTOOccured=0
9744.300803075:main Q:Reg/w0 : DDDD: wti 0x16756a0: worker cleanup action instances
9744.300807699:main Q:Reg/w0 : wti 0x16756a0, action 0, ptr 0x7f1394004210
9744.300825960:main Q:Reg/w0 : wti 0x16756a0, action 1, ptr 0x7f1394038f80
9744.300835440:main Q:Reg/w0 : wti 0x16756a0, action 2, ptr 0x7f13940022a0
9744.300875784:main Q:Reg/w0 : wti 0x16756a0, action 3, ptr 0x7f1394004470
9744.300885088:main Q:Reg/w0 : wti 0x16756a0, action 4, ptr (nil)
9744.300890048:main Q:Reg/w0 : wti 0x16756a0, action 5, ptr (nil)
9744.300894505:main Q:Reg/w0 : wti 0x16756a0, action 6, ptr (nil)
9744.300898883:main Q:Reg/w0 : wti 0x16756a0, action 7, ptr (nil)
9744.300903665:main Q:Reg/w0 : wti 0x16756a0, action 8, ptr (nil)
9744.300908515:main Q:Reg/w0 : wti 0x16756a0, action 9, ptr (nil)
9744.300912910:main Q:Reg/w0 : wti 0x16756a0, action 10, ptr 0x7f13940023f0
9744.300918720:main Q:Reg/w0 : wti 0x16756a0, action 11, ptr 0x7f13940068f0
9744.300941837:main Q:Reg/w0 : wti 0x16756a0, action 12, ptr 0x7f13940504b0
9744.300951981:main Q:Reg/w0 : wti 0x16756a0, action 13, ptr 0x7f1394462e10
9744.301055305:main thread : main Q:Reg: waiting 1499ms on worker thread termination, 1 still running
9745.800919600:main thread : main Q:Reg: timeout waiting on worker thread termination
9745.800976996:main thread : sent SIGTTIN to worker thread 0x9b57a700
9745.801024045:main thread : main Q: regular shutdown timed out on primary queue (this is OK)
9745.801030518:main thread : main Q: checking to see if we need to cancel any worker threads of the primary queue
9745.801035688:main thread : sent SIGTTIN to worker thread 0x9b57a700, giving it a chance to terminate
9745.811204331:main thread : cooperative worker termination failed, using cancellation...
9745.811246502:main thread : wti 0x16756a0: canceling worker thread
我已经尝试在防火墙关闭的情况下启动/停止,按照支持论坛中的此主题(我没有得到任何帮助)
strace 的尾部显示
6700 14:13:02 write(6, "Oct 6 14:13:02 hostname rsyslogd: [origin software=\"rsyslogd\" swVersion=\"8.16.0\" x-pid=\"6696\" x-info=\"http://www.rsyslog.com\"] exiting on signal 15.\n", 146) = 146
6700 14:13:02 futex(0x1a3516c, FUTEX_WAIT_PRIVATE, 99, NULL <unfinished ...>
6696 14:13:02 <... futex resumed> ) = 1
6696 14:13:02 futex(0x1a3516c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x1a35168, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
6700 14:13:02 <... futex resumed> ) = 0
6700 14:13:02 futex(0x1a34f80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
6696 14:13:02 <... futex resumed> ) = 1
6696 14:13:02 tgkill(6696, 6700, SIGTTIN <unfinished ...>
6700 14:13:02 <... futex resumed> ) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
6700 14:13:02 --- SIGTTIN {si_signo=SIGTTIN, si_code=SI_TKILL, si_pid=6696, si_uid=104} ---
6700 14:13:02 rt_sigreturn({mask=~[KILL STOP TTIN RTMIN RT_1]}) = -1 EINTR (Interrupted system call)
6700 14:13:02 futex(0x1a34f80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
6696 14:13:02 <... tgkill resumed> ) = 0
6696 14:13:02 futex(0x1a34f80, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
6700 14:13:02 <... futex resumed> ) = 0
6700 14:13:02 futex(0x1a34f80, FUTEX_WAKE_PRIVATE, 1) = 0
6700 14:13:02 writev(14, [{"\25\3\3\0\32\0\0\0\0\0\0\1qv\373\273}\275\276\252\207\\~\7\343p\237[\343'\363", 31}], 1) = 31
6700 14:13:02 recvfrom(14, <unfinished ...>
6696 14:13:02 <... futex resumed> ) = 1
6696 14:13:02 futex(0x1a35014, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1475763183, 729652779}, ffffffff <unfinished ...>
6700 14:13:02 <... recvfrom resumed> "\25\3\3\0\32", 5, 0, NULL, NULL) = 5
6700 14:13:02 mprotect(0x7fd3145d8000, 16384, PROT_READ|PROT_WRITE) = 0
6700 14:13:02 recvfrom(14, "w_\345'\310\3473\3\\q\32 \37\276\237A\270a\260\200\24\34p\262G\277", 26, 0, NULL, NULL) = 26
6700 14:13:02 close(14) = 0
6700 14:13:02 writev(25, [{"\25\3\3\0\32\0\0\0\0\0\0\1\20\3535\fh\5\324bZ\211U\203AX.\37\205`m", 31}], 1) = 31
6700 14:13:02 recvfrom(25, "\25\3\3\0\32", 5, 0, NULL, NULL) = 5
6700 14:13:02 recvfrom(25, "\307y\343\27z#\31\373\232\333\371\263\275R\203\31\332?)>\201\245\301z\335;", 26, 0, NULL, NULL) = 26
6700 14:13:02 close(25) = 0
6700 14:13:02 writev(24, [{"\25\3\3\0000\26f\310\217\23\247d?\4\3\301\252\277+\207\250q[9L\306\274\2414}\306\237\253X\221\35\27:\376\206\203\36PP\177\363\362I\321\223\372\203W", 53}], 1) = 53
6700 14:13:02 recvfrom(24, <unfinished ...>
6696 14:13:03 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
6696 14:13:03 tgkill(6696, 6700, SIGTTIN) = 0
6700 14:13:03 <... recvfrom resumed> 0x7fd31449c17b, 5, 0, NULL, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
6700 14:13:03 --- SIGTTIN {si_signo=SIGTTIN, si_code=SI_TKILL, si_pid=6696, si_uid=104} ---
6700 14:13:03 rt_sigreturn({mask=~[KILL STOP TTIN RTMIN RT_1]}) = -1 EINTR (Interrupted system call)
6700 14:13:03 recvfrom(24, <unfinished ...>
6696 14:13:03 futex(0x1a34fe8, FUTEX_WAKE_PRIVATE, 1) = 0
6696 14:13:03 tgkill(6696, 6700, SIGTTIN) = 0
6700 14:13:03 <... recvfrom resumed> 0x7fd31449c17b, 5, 0, NULL, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
6700 14:13:03 --- SIGTTIN {si_signo=SIGTTIN, si_code=SI_TKILL, si_pid=6696, si_uid=104} ---
6700 14:13:03 rt_sigreturn({mask=~[KILL STOP TTIN RTMIN RT_1]}) = -1 EINTR (Interrupted system call)
6700 14:13:03 recvfrom(24, <unfinished ...>
6696 14:13:03 select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout.)
6696 14:13:03 select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout.)
6696 14:13:03 select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout.)
6696 14:13:03 select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout.)
.
.
.
and it repeats until
6696 14:14:32 select(0, NULL, NULL, NULL, 0x7ffe5e77b200 <ptrace(SYSCALL):No such process>
6700 14:14:32 +++ killed by SIGKILL +++
6696 14:14:32 +++ killed by SIGKILL +++
答案1
事实证明其中一个日志服务的证书存在问题。有趣的是,它在启动或运行期间根本没有抱怨证书,只有在退出时才会抱怨。
确切的问题是以下一个或多个:
- 与另一服务的证书位于同一文件中
- 将其重命名为 .CRT 文件(原本是 .PEM 文件)。
- 它不喜欢此文件中相关证书的顺序
因此,我不确定是否可以在同一个文件中拥有多个证书,尽管在这个答案中说。
总而言之,如果您要运送到多个服务,我建议使用多个配置文件。