Extracting fields from logs

I want to grep the following information out of the raw logs below:

2016-05-23 11:01:40 [1005583] 1b4ivg-004DZf-GX ** [email protected] F=<abbas@DomainName> P=<abbas@DomainName> R=dkim_lookuphost T=dkim_remote_smtp H=mx2.hotmail.com [65.54.188.72]:25 I=[IP Address]:56910 X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=yes DN="/CN=*.hotmail.com": SMTP error from remote mail server after MAIL FROM:<abbas@DomainName> SIZE=275286: 421 RP-001 (BAY004-MC1F14) Unfortunately, messages from 16.23.21.111 weren't sent. Please contact your Internet service provider since part of their network is on our block list. You can also refer your provider to http://mail.live.com/mail/troubleshooting.aspx#errors.
2016-05-23 11:12:53 [1015989] 1b4j6h-004GIq-Ob ** [email protected] F=<corporate-kbl@DomainName> P=<corporate-kbl@DomainName> R=lookuphost T=remote_smtp H=mx3.hotmail.com [65.55.37.120]:25 I=[IP Address]:51605 X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=yes DN="/CN=*.hotmail.com": SMTP error from remote mail server after MAIL FROM:<corporate-kbl@DomainName> SIZE=17484: 550 SC-001 (COL004-MC4F44) Unfortunately, messages from 16.23.21.111 weren't sent. Please contact your Internet service provider since part of their network is on our block list. You can also refer your provider to http://mail.live.com/mail/troubleshooting.aspx#errors.
2016-05-23 11:13:19 [1020551] 1b4j76-004HUH-Nr ** [email protected] (muhammad.yousuf@DomainName) <muhammad.yousuf@DomainName> F=<saeed.ahmed@DomainName> P=<saeed.ahmed@DomainName> R=dkim_lookuphost T=dkim_remote_smtp H=mx3.hotmail.com [134.170.2.199]:25 I=[IP Address]:55971 X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=yes DN="/CN=*.hotmail.com": SMTP error from remote mail server after MAIL FROM:<saeed.ahmed@DomainName> SIZE=24006: 550 DY-001 (BLU004-MC1F21) Unfortunately, messages from 16.23.21.111 weren't sent. Please contact your Internet service provider since part of their network is on our block list. You can also refer your provider to http://mail.live.com/mail/troubleshooting.aspx#errors.

The error field can contain any of the following error codes, and whichever one occurs should be shown in the Error Codes column:

421 RP-001
421 RP-002
421 RP-003
550 SC-001
550 SC-002
550 SC-003
550 SC-004
550 DY-001
550 DY-002
550 DY-001
550 OU-001
550 OU-002

I got output for the first three fields with the following command:

  echo "Timestamp            emailto:                  emailfrom:" && awk 'NF>6 { d=6 ; while ( ! ($d ~ /^F=/ ) ) d++ ; printf "%s\t%s\t%s\n",$1,$6,substr($d,4,length($d)-4) ;} ' logs | column -t

What I want to get:

  Timestamp:    Email To:            Email From:                 Messages From:    Error Codes:
  2016-05-23    [email protected]    abbas@DomainName            16.23.21.111      421 RP-001
  2016-05-23    [email protected]    corporate-kbl@DomainName    16.23.21.111      550 SC-001
  2016-05-23    [email protected]    saeed.ahmed@DomainName      16.23.21.111      550 DY-001

Answer 1

You wouldn't use grep for this. You could use awk, but I prefer regular expressions with sed.

  <logs sed -nE 's,^([-0-9]{10})[^@]* ([^@]*@[^[:space:]]*)[^=]*F=<([^@]*@[^[:space:]]*)>.*SIZE=[^[:space:]]* (... ..-...) .* ([[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+).*,\1 \2 \3 \5 \4,p'

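It looks scary, but it captures the bits in parentheses as groups (\1, \2, and so on). The first group is the date (10 characters, each a digit or -). It then skips ahead to the next @ sign ([^@] means anything that is not @), groups the email address, skips to the next =, backs up to the F, groups the F address, skips to SIZE, grabs the error code (any three characters, a space, any two, a hyphen, any three), and then skips to the IP address (an exercise for the reader). The 'p' flag makes sed print any line on which a substitution was made.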
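To see the groups in action, here is a minimal sketch that wraps the same kind of expression in a shell function and runs it on one shortened sample line. The function name `extract_fields` and the recipient `user@example.com` are made up for illustration (the sample logs do not show a usable recipient address). The expression requires a literal space before the IP group so that the greedy `.*` in front of it cannot swallow the leading digits of the first octet.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints
# "date recipient sender blocked-IP error-code" for each line that matches.
extract_fields() {
  sed -nE 's,^([-0-9]{10})[^@]* ([^@]*@[^[:space:]]*)[^=]*F=<([^@]*@[^[:space:]]*)>.*SIZE=[^[:space:]]* (... ..-...) .* ([[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+).*,\1 \2 \3 \5 \4,p'
}

# Shortened sample line; user@example.com is an invented recipient.
extract_fields <<'EOF'
2016-05-23 11:01:40 [1005583] 1b4ivg-004DZf-GX ** user@example.com F=<abbas@DomainName> P=<abbas@DomainName> R=dkim_lookuphost T=dkim_remote_smtp H=mx2.hotmail.com [65.54.188.72]:25 SMTP error from remote mail server after MAIL FROM:<abbas@DomainName> SIZE=275286: 421 RP-001 (BAY004-MC1F14) Unfortunately, messages from 16.23.21.111 weren't sent.
EOF
```

Lines that do not match the pattern produce no output at all, because `-n` suppresses the default printing and only substituted lines are printed.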

It doesn't do much validation: 9.9.99.999 is a valid IP as far as it is concerned, for example, but that is beyond the scope of the task.
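If stricter checking is ever wanted, one possible approach is a second pass over the extracted rows. The sketch below assumes the space-separated layout produced by the sed command (date, recipient, sender, IP, error code) and keeps only rows whose fourth field has four numeric octets in the range 0 to 255. The function name `valid_ipv4_only` and the addresses in the demo rows are illustrative, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical filter: keep only rows whose 4th field is a plausible IPv4 address.
valid_ipv4_only() {
  awk '{
    n = split($4, octet, ".")          # break the address into its parts
    ok = (n == 4)                      # exactly four parts are required
    for (i = 1; i <= n && ok; i++)
      if ((octet[i] !~ /^[0-9]+$/) || (octet[i] + 0 > 255))
        ok = 0                         # non-numeric or out-of-range octet
    if (ok) print
  }'
}

# Demo rows in the layout the sed command emits; the second row is rejected.
printf '%s\n' \
  '2016-05-23 user@example.com abbas@DomainName 16.23.21.111 421 RP-001' \
  '2016-05-23 user@example.com abbas@DomainName 9.9.99.999 421 RP-001' |
  valid_ipv4_only
```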

Does this help?

You can use tabs instead of spaces in the last part (the replacement) to align the columns.
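As a sketch of the tab idea that does not rely on `\t` being understood inside a sed replacement (not every sed supports that), the extracted rows can be passed through awk to emit tab-separated columns under the headings the question asks for. The function name `to_tsv` is made up, and the input is assumed to be the space-separated output of the sed command.

```shell
#!/bin/sh
# Hypothetical formatter: turn "date to from ip code1 code2" rows into
# tab-separated rows with a header line.
to_tsv() {
  awk 'BEGIN {
         OFS = "\t"
         print "Timestamp:", "Email To:", "Email From:", "Messages From:", "Error Codes:"
       }
       { print $1, $2, $3, $4, $5 " " $6 }   # rejoin the two-word error code
      '
}

# Demo row in the layout the sed command emits; the recipient is invented.
printf '%s\n' '2016-05-23 user@example.com abbas@DomainName 16.23.21.111 421 RP-001' | to_tsv
```

Where `column` is available, as in the question's own pipeline, the result can be piped through `column -t -s "$(printf '\t')"` so that the two-word error code stays in a single column.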
