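I've spent the past few days searching Google for a solid example of how to apply a regular expression to a log entry, pull out the data I need, and insert it into a database, but apparently my Google-fu isn't good enough.
What I want to do is track when an email is sent and then track the remote MTA's response, specifically the DSN code. So far I have set up two templates, one for each case: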
# /etc/rsyslog.conf
...
$Template tpl_custom_header, "MPurcell: CUSTOM HEADER Template: %msg%\n"
$Template tpl_response_dsn, "MPurcell: RESPONSE DSN Template: %msg%\n"
# /etc/rsyslog.d/mail
if $programname == 'mail-myapp' then /var/log/mail/myapp.log
if ($programname == 'mail-myapp') and ($msg contains 'X-custom_header') then /var/log/mail/test.log;tpl_custom_header
if ($programname == 'mail-myapp') and ($msg contains 'dsn=') then /var/log/mail/test.log;tpl_response_dsn
& ~
Example log entries:
MPurcell: CUSTOM HEADER Template: D921940A1A: prepend: header X-custom_header: 101 from localhost[127.0.0.1]; from=<[email protected]> to=<[email protected]> proto=ESMTP helo=<localhost>: headername: message-id
MPurcell: RESPONSE DSN Template: D921940A1A: to=<[email protected]>, relay=gmail-smtp-in.l.google.com[2607:f8b0:400e:c02::1a]:25, delay=2, delays=0.12/0.01/0.82/1.1, dsn=2.0.0, status=sent (250 2.0.0 OK 1372378600 o4si2828280pac.279 - gsmtp)
From the CUSTOM HEADER template I want to extract: D921940A1A and the X-custom_header value, 101
From the RESPONSE DSN template I want to extract: D921940A1A and "dsn=2.0.0"
Answer 1
In case anyone else ends up in the same situation, here is what I ended up doing:
# /etc/rsyslog.conf
# R = the text after the colon is a regular expression, terminated by --end
# ERE = extended regex
# 0 = which submatch to return (0 is the whole match)
# DFLT = what to return when nothing matches (DFLT is the default behaviour)
$Template tpl_custom_header, "%msg:R,ERE,0,DFLT:[^:]+--end% | %msg:R,ERE,2,DFLT:X-custom_header:( )([0-9]*)--end%\n"
$Template tpl_response_dsn, "%msg:R,ERE,0,DFLT:[^:]+--end% | %msg:R,ERE,1,DFLT:dsn=([0-9][.][0-9][.][0-9])--end% \n"
To test your regex you should use http://www.rsyslog.com/regex/. It's a bit cheesy, but it gets the job done.
Sample raw log entries, slightly different from the OP's:
Jun 29 05:40:28 service1 mail-myapp/cleanup[22200]: 6F67240A1A: prepend: header X-custom_header: 136 from localhost[127.0.0.1]; from=<[email protected]> to=<[email protected]> proto=ESMTP helo=<localhost>: headername: message-id
Jun 29 05:40:30 service1 mail-myapp/smtp[22201]: 6F67240A1A: to=<[email protected]>, relay=gmail-smtp-in.l.google.com[2607:f8b0:400e:c01::1a]:25, delay=2, delays=0.09/0/0.82/1, dsn=2.0.0, status=sent (250 2.0.0 OK 1372485254 rs6si5760686pbc.32 - gsmtp)
What they look like after the templates are applied:
6F67240A1A | 136
6F67240A1A | 2.0.0
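If you want to sanity-check the patterns away from rsyslog, here is a rough Python sketch that applies the same three expressions to the message part of the two raw entries above. Python's `re` is not POSIX ERE, but these particular patterns should behave the same in both. The helper names (`submatch`, `render_custom_header`, `render_response_dsn`) are made up for this sketch, and returning `None` on a miss is my own choice, not what rsyslog does in DFLT mode. I'm also assuming `%msg%` starts at the queue ID, so the sketch strips any leading whitespace.

```python
import re

# Message parts (what rsyslog exposes as %msg%) copied from the raw entries above.
HEADER_MSG = (
    "6F67240A1A: prepend: header X-custom_header: 136 from localhost[127.0.0.1]; "
    "from=<[email protected]> to=<[email protected]> proto=ESMTP helo=<localhost>: "
    "headername: message-id"
)
DSN_MSG = (
    "6F67240A1A: to=<[email protected]>, "
    "relay=gmail-smtp-in.l.google.com[2607:f8b0:400e:c01::1a]:25, delay=2, "
    "delays=0.09/0/0.82/1, dsn=2.0.0, status=sent "
    "(250 2.0.0 OK 1372485254 rs6si5760686pbc.32 - gsmtp)"
)


def submatch(pattern, text, index):
    """Return submatch `index` of the first match, or None if nothing matches.

    Hypothetical helper that loosely mirrors %msg:R,ERE,<index>,...:<pattern>--end%.
    """
    found = re.search(pattern, text.lstrip())
    return found.group(index) if found else None


def render_custom_header(msg):
    """Approximate the tpl_custom_header template: queue ID, then header value."""
    queue_id = submatch(r"[^:]+", msg, 0)
    value = submatch(r"X-custom_header:( )([0-9]*)", msg, 2)
    return f"{queue_id} | {value}"


def render_response_dsn(msg):
    """Approximate the tpl_response_dsn template: queue ID, then DSN code."""
    queue_id = submatch(r"[^:]+", msg, 0)
    dsn = submatch(r"dsn=([0-9][.][0-9][.][0-9])", msg, 1)
    return f"{queue_id} | {dsn}"


if __name__ == "__main__":
    print(render_custom_header(HEADER_MSG))
    print(render_response_dsn(DSN_MSG))
```

This only checks the regexes themselves. The rsyslog tester linked above is still the place to confirm how the property replacer treats them.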
When I insert into MySQL, I'll store the dsn as an int rather than a string for better performance, so I'm considering using this:
insert into response_log_dsn set mail_id = '6F67240A1A', dsn = (select cast(replace('2.0.0', '.', '') as unsigned));
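One caveat with stripping the dots, sketched in Python below: RFC 3463 allows the second and third parts of a status code to have up to three digits, so two different codes can collapse to the same integer. The function names here are hypothetical. `dsn_digits` mirrors the `replace`/`cast` in the SQL above, and `dsn_packed` is just one possible alternative encoding, not something the setup above uses.

```python
def dsn_digits(dsn: str) -> int:
    """Mirror of the SQL above: drop the dots and read the rest as an integer."""
    return int(dsn.replace(".", ""))


def dsn_packed(dsn: str) -> int:
    """Alternative encoding: reserve three decimal places for each of the
    second and third parts, so distinct codes stay distinct."""
    cls, subject, detail = (int(part) for part in dsn.split("."))
    return cls * 1_000_000 + subject * 1_000 + detail


if __name__ == "__main__":
    for code in ("2.0.0", "5.1.10", "5.11.0"):
        print(code, dsn_digits(code), dsn_packed(code))
```

With the dot-stripping approach, "5.1.10" and "5.11.0" both become 5110, while the packed form keeps them apart. Note also that the `[0-9][.][0-9][.][0-9]` pattern in the template only captures one digit per part, so it would need widening (for example `[0-9][.][0-9]+[.][0-9]+`) before multi-digit codes could reach the database at all.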