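I need to match two patterns in a log file, grab the line that follows the match of the second pattern, and finally print the three resulting values on a single line.
Sample log file: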
2013/09/05 04:26:00 Processing Batch /fbc/dev/cebi/dod/9739867262
2013/09/05 04:26:02 Batch 9739867262 was successful
2013/09/05 04:26:02 Total Time = 3.13 Secs
2013/09/05 04:26:02 Repository API Time = 2.96 Secs
2013/09/05 04:26:02 File System Io Time = 0.06 Secs
2013/09/05 04:26:02 Doc Validation Time = 0.03 Secs
2013/09/05 04:26:02 Ending @ Thu Sep 05 04:26:02 EDT 2013
2013/09/05 08:18:10 Starting @ Thu Sep 05 08:18:10 EDT 2013
2013/09/05 08:18:10 Starting @ Thu Sep 05 08:18:10 EDT 2013
2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9844867675
2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9886743777
2013/09/05 08:18:16 Batch 9844867675 was successful
2013/09/05 08:18:16 Total Time = 6.00 Secs
2013/09/05 08:18:16 Repository API Time = 5.63 Secs
2013/09/05 08:18:16 File System Io Time = 0.05 Secs
2013/09/05 08:18:16 Doc Validation Time = 0.19 Secs
2013/09/05 08:18:16 Ending @ Thu Sep 05 08:18:16 EDT 2013
2013/09/05 08:18:18 Batch 9886743777 was successful
2013/09/05 08:18:18 Total Time = 8.27 Secs
2013/09/05 08:18:18 Repository API Time = 8.52 Secs
2013/09/05 08:18:18 File System Io Time = 0.08 Secs
2013/09/05 08:18:18 Doc Validation Time = 0.47 Secs
2013/09/05 08:18:18 Ending @ Thu Sep 05 08:18:18 EDT 2013
I keep the numbers in a separate file named cust_no.txt:
9739867262
9844867675
9886743777
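Using these numbers as input, I need to match the following two patterns in the log file:
- Processing Batch /fbc/dev/cebi/dod/
- Batch was successful
The output needs the following:
-> On a match of the first pattern (i.e. Processing Batch /fbc/dev/cebi/dod/<numbers in the cust_no.txt>), I need the second word, i.e. $2.
-> On a match of the second pattern (i.e. Batch <numbers in the cust_no.txt> was successful), I need the second word, i.e. $2.
-> And from the line immediately after the second pattern's match (i.e. the Total Time line), I need the 6th word ($6).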
Expected output:
9739867262,04:26:00,04:26:02,3.13 Secs
9844867675,08:18:10,08:18:16,6.00 Secs
9886743777,08:18:10,08:18:18,8.27 Secs
To get this, I tried the following, but it does not seem to work:
awk -v cn=$cust_no '{{if ($0 ~ "Processing.*" cn) st=$2 && if ($0 ~ "Customer cn was successful" et=$2; getline; tt=$4} ; print st,et,tt}
Answer 1
How about this:
while read number;do
start=$(grep "Processing Batch /fbc/dev/cebi/dod/$number" log_file\
|head -n 1|awk '{print $2}')
end=$(grep -A 1 "Batch $number was successful" log_file\
|head -n 2|tail -n 1|awk -v OFS=',' '{print $2,$6}')
echo "$number,$start,$end Secs"
done <cust_no.txt
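The loop above runs grep twice per number, so the log is reread for every entry in cust_no.txt. As a rough alternative, here is a sketch of a single awk pass that collects the same three values. The function name extract_times and the array names are made up for illustration, and the field positions assume the log layout shown in the question (date, time, then the message).

```shell
#!/bin/sh
# Sketch only: one awk pass instead of two greps per number.
# extract_times is a hypothetical helper; $1 = file of batch numbers, $2 = log file.
extract_times() {
    awk -v OFS=',' '
        # First file: remember the wanted numbers and their order.
        NR == FNR { want[$1] = 1; order[++n] = $1; next }

        # This is the line right after a "was successful" line.
        pending != "" {
            if ($3 == "Total" && $4 == "Time") total[pending] = $6 " " $7
            pending = ""
        }

        # "Processing Batch <path>/<number>": keep the first start time seen.
        $3 == "Processing" && $4 == "Batch" {
            id = $5
            sub(/.*\//, "", id)          # strip everything up to the last slash
            if ((id in want) && !(id in start)) start[id] = $2
        }

        # "Batch <number> was successful": record the end time.
        $3 == "Batch" && $5 == "was" && $6 == "successful" && ($4 in want) {
            stop[$4] = $2
            pending = $4
        }

        END {
            for (i = 1; i <= n; i++) {
                id = order[i]
                if (id in stop) print id, start[id], stop[id], total[id]
            }
        }
    ' "$1" "$2"
}
```

Called as extract_times cust_no.txt log_file, this prints one comma-separated line per number that has a "was successful" entry, in the order the numbers appear in cust_no.txt. Numbers that never complete are skipped, which is a design choice of this sketch rather than something the question specifies.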
Answer 2
If you don't mind using Perl and grep, here is one way to solve your problem. This is the script, named cmd.pl:
#!/usr/bin/env perl
use feature 'say';
#use Data::Dumper;
@file = `grep -f cust_no.txt -A 1 sample.log`;
my (%info, $secLineSeen, $time, $custno);
$secLineSeen = 0;
foreach my $line (@file) {
if ($secLineSeen == 1) {
#2013/09/05 08:18:18 Total Time = 8.27 Secs
(my $totTime) = ($line =~ m!\S+ \S+\s+Total Time\s+=\s+(\S+ Secs)!);
$info{$custno}{totTime} = $totTime;
$secLineSeen = 0;
} elsif ($line =~ m/Processing Batch/) {
#2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9844867675
($time, $custno) = ($line =~ m!\S+ (\S+)\s+Processing Batch.*/(\S+)!);
$info{$custno}{onetwo} = $time;
} elsif ($line =~ m/Batch.*successful/) {
#2013/09/05 08:18:18 Batch 9886743777 was successful
($time, $custno) = ($line =~ m!\S+ (\S+)\s+Batch (\S+) was.*!);
$info{$custno}{twotwo} = $time;
$secLineSeen = 1;
}
}
#print Dumper(\%info);
#9739867262,04:26:00,04:26:02,3.13 Secs
foreach my $key (sort keys %info) {
say "$key,$info{$key}{onetwo},$info{$key}{twotwo},$info{$key}{totTime}";
}
Example
$ ./cmd.pl
9739867262,04:26:00,04:26:02,3.13 Secs
9844867675,08:18:10,08:18:16,6.00 Secs
9886743777,08:18:10,08:18:18,8.27 Secs
Details
This Perl script first creates an array, @file, containing the output of this command:
$ grep -f cust_no.txt -A 1 sample.log
This command takes the log file, sample.log, and selects every line that contains one of the customer numbers listed in cust_no.txt, like so:
2013/09/05 04:26:00 Processing Batch /fbc/dev/cebi/dod/9739867262
2013/09/05 04:26:02 Batch 9739867262 was successful
2013/09/05 04:26:02 Total Time = 3.13 Secs
--
2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9844867675
2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9886743777
2013/09/05 08:18:16 Batch 9844867675 was successful
2013/09/05 08:18:16 Total Time = 6.00 Secs
--
2013/09/05 08:18:18 Batch 9886743777 was successful
2013/09/05 08:18:18 Total Time = 8.27 Secs
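The grep command does one extra thing worth mentioning: with -A 1 it keeps one additional line after every match. That is what lets us catch the line containing "Total Time".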
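To see the effect of -A 1 in isolation, here is a small sketch on made-up input (the batch numbers 111 and 222 and the helper name demo_context are invented for this illustration). It assumes a grep that supports -A, such as GNU or BSD grep; the option is not part of POSIX.

```shell
#!/bin/sh
# Sketch: feed a tiny invented log to grep and keep one line after each match.
demo_context() {
    printf '%s\n' \
        'Batch 111 was successful' \
        'Total Time = 1.00 Secs' \
        'Ending' \
        'Starting' \
        'Batch 222 was successful' \
        'Total Time = 2.00 Secs' |
    grep -A 1 'was successful'
}

demo_context
```

Each match is printed together with the line after it, and grep puts a -- line between groups that are not adjacent in the input. That is the same separator visible in the output above.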
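Once this data has been extracted, the Perl script uses a multidimensional hash to store the key pieces of data from this output, following the requirements stated in the question.
When we have finished processing the contents of @file, the hash looks like this: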
$VAR1 = {
'9739867262' => {
'twotwo' => '04:26:02',
'totTime' => '3.13 Secs',
'onetwo' => '04:26:00'
},
'9886743777' => {
'twotwo' => '08:18:18',
'totTime' => '8.27 Secs',
'onetwo' => '08:18:10'
},
'9844867675' => {
'twotwo' => '08:18:16',
'totTime' => '6.00 Secs',
'onetwo' => '08:18:10'
}
};
Finally, we loop over this hash and print what we collected in the format specified in the question.
Answer 3
I would try grep:
grep -EA 1 'pattern1|pattern2' file.log
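The -E option enables extended regular expressions, and -A sets the number of lines to print after each match. To get this printed on a single line, I can think of a rather hacky way using sed: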
grep -EA 1 'pattern1|pattern2' file.log | grep -v ^-- | sed 'N ; s+\n+|+g'
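Passing the N command (read the next line) to sed lets you process two lines of input at a time. The s+\n+|+g command then replaces the newline between those two lines with a separator of your choice, or deletes it if the replacement is empty, so that only the newline at the end of the second line remains.
The grep -v ^-- is needed to remove the -- separators from the output of the first grep (see the illustrative example below).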
Line 1
Line 2
--
Line X
Line Y
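As a sketch of how the second half of that pipeline behaves on exactly this kind of input, the helper below (join_pairs is a made-up name) drops the separator lines and joins each remaining pair with a | character. It assumes every match is followed by exactly one context line, so the input always arrives in pairs; an unpaired final line is handled differently by different sed implementations.

```shell
#!/bin/sh
# Sketch: read "grep -A 1" style output on stdin and join each
# match with the line after it, using "|" as the separator.
join_pairs() {
    grep -v '^--$' | sed 'N; s/\n/|/'
}

printf '%s\n' 'Line 1' 'Line 2' '--' 'Line X' 'Line Y' | join_pairs
```

Note that with the sample log from the question, adjacent matches (a Processing Batch line directly followed by a was successful line) produce groups of three or four lines, so this pairing would need adjusting before it yields the requested output.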