Using a regular expression to add line breaks between timestamps in a log

I have some log data:

2017-12-03 01:35:58 [Notice] syslog: local  IP address 
2017-12-03 01:35:58 [Notice] syslog: remote IP address 
2017-12-03 01:35:58 [Notice] syslog: primary   DNS address 
2017-12-03 01:35:58 [Notice] syslog: secondary DNS address 
2017-12-03 01:35:59 [Warning] kernel: Link State: PVC_8_0 logistic interface up.
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: started, version 2.52 cachesize 150
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP
2017-12-03 01:35:59 [Warning] dnsmasq[10463]: ignoring nameserver 127.0.0.1 - local interface
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: using nameserver 87.216.1.66#53(via ppp80)
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: using nameserver 87.216.1.65#53(via ppp80)
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: read /etc/hosts - 6 addresses
2017-12-03 01:36:00 [Informational] dnsmasq[10532]: started, version 2.52 cachesize 150
2017-12-03 01:36:00 [Informational] dnsmasq[10532]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP
2017-12-03 01:36:00 [Informational] dnsmasq[10532]: using nameserver 87.216.1.66#53(via ppp80)
2017-12-03 01:36:01 [Warning] kernel: ^M
2017-12-03 01:36:01 [Warning] kernel: Send DNS Query : domain=ntp2.jazztel.com qType=A dnsServer=87.216.1.65
2017-12-03 01:36:01 [Warning] kernel: domain: ntp2.jazztel.com , IP: 87.216.1.241
2017-12-03 01:36:01 [Warning] kernel: sntp server=ntp2.jazztel.com: 0x5 ntpServerIP=87.216.1.241

I want to add a line break every time the timestamp changes, so that it looks like this:

2017-12-03 01:35:58 [Notice] syslog: local  IP address 
2017-12-03 01:35:58 [Notice] syslog: remote IP address 
2017-12-03 01:35:58 [Notice] syslog: primary   DNS address 
2017-12-03 01:35:58 [Notice] syslog: secondary DNS address 

2017-12-03 01:35:59 [Warning] kernel: Link State: PVC_8_0 logistic interface up.
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: started, version 2.52 cachesize 150
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP
2017-12-03 01:35:59 [Warning] dnsmasq[10463]: ignoring nameserver 127.0.0.1 - local interface
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: using nameserver 87.216.1.66#53(via ppp80)
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: using nameserver 87.216.1.65#53(via ppp80)
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: read /etc/hosts - 6 addresses

2017-12-03 01:36:00 [Informational] dnsmasq[10532]: started, version 2.52 cachesize 150
2017-12-03 01:36:00 [Informational] dnsmasq[10532]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP
2017-12-03 01:36:00 [Informational] dnsmasq[10532]: using nameserver 87.216.1.66#53(via ppp80)

2017-12-03 01:36:01 [Warning] kernel: ^M
2017-12-03 01:36:01 [Warning] kernel: Send DNS Query : domain=ntp2.jazztel.com qType=A dnsServer=87.216.1.65
2017-12-03 01:36:01 [Warning] kernel: domain: ntp2.jazztel.com , IP: 87.216.1.241
2017-12-03 01:36:01 [Warning] kernel: sntp server=ntp2.jazztel.com: 0x5 ntpServerIP=87.216.1.241

This works on https://regexr.com:

s/(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ).*\n(?!\1)/$0\n/g

But when I try it in a terminal (OSX), it does nothing:

curl -s http://192.168.1.1/cgi-bin/status_log2.cgi | grep 2017 | tail -n 30 | perl -pe 's/(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ).*\n(?!\1)/$0\n/g'

I've also tried gsed, but had no success with sed either.

(If there is a way to remove all the redundant timestamps entirely, that would be even better!)

Answer 1

To add a line break before each new time:

awk '!a[$1,$2]++ && NR>1{print ""} 1'

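How it works: in awk, $1 and $2 are the first two fields which, in this case, are the date and the time. a[$1,$2] is an associative array that counts how many times this pair of fields has appeared. If we have not seen this date and time before, !a[$1,$2]++, and we are not on the first line, NR>1, then we print an empty line as a separator, print "". The final 1 is just shorthand for print-the-current-line.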

Example

With the sample log saved in a file named logfile:

$ awk '!a[$1,$2]++ && NR>1{print ""} 1' logfile
2017-12-03 01:35:58 [Notice] syslog: local  IP address 
2017-12-03 01:35:58 [Notice] syslog: remote IP address 
2017-12-03 01:35:58 [Notice] syslog: primary   DNS address 
2017-12-03 01:35:58 [Notice] syslog: secondary DNS address 

2017-12-03 01:35:59 [Warning] kernel: Link State: PVC_8_0 logistic interface up.
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: started, version 2.52 cachesize 150
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP
2017-12-03 01:35:59 [Warning] dnsmasq[10463]: ignoring nameserver 127.0.0.1 - local interface
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: using nameserver 87.216.1.66#53(via ppp80)
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: using nameserver 87.216.1.65#53(via ppp80)
2017-12-03 01:35:59 [Informational] dnsmasq[10463]: read /etc/hosts - 6 addresses

2017-12-03 01:36:00 [Informational] dnsmasq[10532]: started, version 2.52 cachesize 150
2017-12-03 01:36:00 [Informational] dnsmasq[10532]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP
2017-12-03 01:36:00 [Informational] dnsmasq[10532]: using nameserver 87.216.1.66#53(via ppp80)

2017-12-03 01:36:01 [Warning] kernel: ^M
2017-12-03 01:36:01 [Warning] kernel: Send DNS Query : domain=ntp2.jazztel.com qType=A dnsServer=87.216.1.65
2017-12-03 01:36:01 [Warning] kernel: domain: ntp2.jazztel.com , IP: 87.216.1.241
2017-12-03 01:36:01 [Warning] kernel: sntp server=ntp2.jazztel.com: 0x5 ntpServerIP=87.216.1.241
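
The array-based test treats a timestamp as new only the first time it ever appears in the input. As a rough sketch of an alternative, assuming the goal is a blank line whenever the timestamp differs from the one on the previous line, the comparison can be made against the previous line only, which also avoids keeping every timestamp in memory. The function name blank_on_timestamp_change is made up for illustration:

```shell
# Hypothetical helper: print a blank line whenever the date/time pair
# differs from the one on the previous line.
blank_on_timestamp_change() {
  awk '
    { key = $1 " " $2 }                  # date and time of the current line
    NR > 1 && key != prev { print "" }   # timestamp changed: emit a separator
    { prev = key; print }                # remember it and print the line
  ' "$@"
}

# Reads the named file, or standard input when no file is given.
printf '%s\n' \
  '2017-12-03 01:35:58 [Notice] syslog: local  IP address' \
  '2017-12-03 01:35:58 [Notice] syslog: remote IP address' \
  '2017-12-03 01:35:59 [Warning] kernel: Link State: PVC_8_0 logistic interface up.' |
  blank_on_timestamp_change
```

For a log that is already in time order, such as the sample, this should give the same result as the one-liner above; the two differ only when an earlier timestamp shows up again later.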

Removing repeated timestamps

$ awk '{if(a[$1,$2]++){gsub(/./," ",$1); gsub(/./," ",$2)} else if (NR>1) print""} 1' logfile
2017-12-03 01:35:58 [Notice] syslog: local  IP address 
                    [Notice] syslog: remote IP address
                    [Notice] syslog: primary DNS address
                    [Notice] syslog: secondary DNS address

2017-12-03 01:35:59 [Warning] kernel: Link State: PVC_8_0 logistic interface up.
                    [Informational] dnsmasq[10463]: started, version 2.52 cachesize 150
                    [Informational] dnsmasq[10463]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP
                    [Warning] dnsmasq[10463]: ignoring nameserver 127.0.0.1 - local interface
                    [Informational] dnsmasq[10463]: using nameserver 87.216.1.66#53(via ppp80)
                    [Informational] dnsmasq[10463]: using nameserver 87.216.1.65#53(via ppp80)
                    [Informational] dnsmasq[10463]: read /etc/hosts - 6 addresses

2017-12-03 01:36:00 [Informational] dnsmasq[10532]: started, version 2.52 cachesize 150
                    [Informational] dnsmasq[10532]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP
                    [Informational] dnsmasq[10532]: using nameserver 87.216.1.66#53(via ppp80)

2017-12-03 01:36:01 [Warning] kernel: ^M
                    [Warning] kernel: Send DNS Query : domain=ntp2.jazztel.com qType=A dnsServer=87.216.1.65
                    [Warning] kernel: domain: ntp2.jazztel.com , IP: 87.216.1.241
                    [Warning] kernel: sntp server=ntp2.jazztel.com: 0x5 ntpServerIP=87.216.1.241

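In this case, if we have seen the date, $1, and the time, $2, before, then we replace their contents with blanks: gsub(/./," ",$1); gsub(/./," ",$2). If we have not, and we are not on the first line, then we print a blank line as a separator.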
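
Assigning to $1 or $2 makes awk rebuild the line with single spaces between fields, which is why "primary   DNS" becomes "primary DNS" in the output above. A possible variant, assuming every line starts with the date and the time separated by exactly one space, blanks the leading characters of the whole line instead and leaves the rest untouched. The name blank_repeated_timestamps is hypothetical:

```shell
# Hypothetical helper: blank out repeated timestamps without touching
# the spacing in the rest of the line.
# Assumption: each line begins with "<date> <time>" and no leading whitespace.
blank_repeated_timestamps() {
  awk '
    {
      key = $1 " " $2
      if (key in seen) {
        # Replace the leading timestamp with the same number of spaces.
        pad = sprintf("%" length(key) "s", "")
        $0 = pad substr($0, length(key) + 1)
      } else {
        if (NR > 1) print ""   # separator before each new timestamp
        seen[key] = 1
      }
      print
    }
  ' "$@"
}

printf '%s\n' \
  '2017-12-03 01:35:58 [Notice] syslog: primary   DNS address' \
  '2017-12-03 01:35:58 [Notice] syslog: secondary DNS address' \
  '2017-12-03 01:35:59 [Warning] kernel: Link State: PVC_8_0 logistic interface up.' |
  blank_repeated_timestamps
```

Because the fields themselves are never modified, runs of spaces and any trailing space in the message part are kept as they were.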

Answer 2

You need to tell Perl to load the whole input at once, because it normally reads input line by line. Use -0777 for that.

Also, $0 in Perl is the name of the script (-e for a one-liner). Capture the whole line and refer to it as $1, and use \2 for the date:

perl -0777 -pe 's/((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ).*\n)(?!\2)/$1\n/g'

Answer 3

I'd like to post a (GNU) sed solution.

sed -nr 'h;s/^([^[]+) \[.*/\1/;x;p;:a;g;N;s/^([^\n]+)\n\1(.*)/\1\2/;Tb;:c;s/[0-9:-]([0-9 :-]+\[)/ \1/;tc;p;ba;:b;s/^[^\n]+//;P;D' logfile
2017-12-03 01:35:58 [Notice] syslog: local  IP address                                                                                      
                    [Notice] syslog: remote IP address                                                                                      
                    [Notice] syslog: primary   DNS address                                                                                  
                    [Notice] syslog: secondary DNS address                                                                                  

2017-12-03 01:35:59 [Warning] kernel: Link State: PVC_8_0 logistic interface up.                                                            
                    [Informational] dnsmasq[10463]: started, version 2.52 cachesize 150                                                     
                    [Informational] dnsmasq[10463]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP         
                    [Warning] dnsmasq[10463]: ignoring nameserver 127.0.0.1 - local interface                                               
                    [Informational] dnsmasq[10463]: using nameserver 87.216.1.66#53(via ppp80)                                              
                    [Informational] dnsmasq[10463]: using nameserver 87.216.1.65#53(via ppp80)                                              
                    [Informational] dnsmasq[10463]: read /etc/hosts - 6 addresses                                                           

2017-12-03 01:36:00 [Informational] dnsmasq[10532]: started, version 2.52 cachesize 150                                                     
                    [Informational] dnsmasq[10532]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N no-DHCP no-TFTP         
                    [Informational] dnsmasq[10532]: using nameserver 87.216.1.66#53(via ppp80)                                              

2017-12-03 01:36:01 [Warning] kernel: ^M                                                                                                    
                    [Warning] kernel: Send DNS Query : domain=ntp2.jazztel.com qType=A dnsServer=87.216.1.65
                    [Warning] kernel: domain: ntp2.jazztel.com , IP: 87.216.1.241
                    [Warning] kernel: sntp server=ntp2.jazztel.com: 0x5 ntpServerIP=87.216.1.241
