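The question is: how do I grep for multiple patterns across multiple lines?
Here is the sample text. Lines containing "reqId: regexpat" or "reqCompleted: regexpat" should be matched in pairs, where "regexpat" is unique; in practice it would probably be a UUID.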
2016-09-27 GET /some/uri - reqId: 000-pat1-bgr, more text
2016-09-27 GET /some/uri - reqId: 0.215487, your favourite song
2016-09-27 irrelevant message
2016-09-27 ignored record
2016-09-27 reqCompleted: 999-xxx-vvv, ignore this
2016-09-27 reqCompleted: 0.215487, more characters
2016-09-27 reqCompleted: 000-pat1-bgr, more characters
2016-09-27 another lost message
The expected result is:
2016-09-27 GET /some/uri - reqId: 000-pat1-bgr, more text
2016-09-27 GET /some/uri - reqId: 0.215487, your favourite song
2016-09-27 reqCompleted: 0.215487, more characters
2016-09-27 reqCompleted: 000-pat1-bgr, more characters
000-pat1-bgr and 0.215487 are the unique identifiers. I tried grep with Perl regex support:
grep --null-data --only-matching --perl-regex '(?s)^\N+ RequestId:\1, \N+$\n(?:.*)^\N+ reqCompleted: ([a-z0-9\.-]+), .\N+$\n'
but this is what I get:
2016-09-27 GET /some/uri - reqId: 000-pat1-bgr, more text
2016-09-27 GET /some/uri - reqId: 0.215487, your favourite song
2016-09-27 irrelevant message
2016-09-27 ignored record
2016-09-27 reqCompleted: 999-xxx-vvv, ignore this
2016-09-27 reqCompleted: 0.215487, more characters
2016-09-27 reqCompleted: 000-pat1-bgr, more characters
Is it possible to achieve this with a one-line grep command?
Answer 1
Since grep is already in use, another awk approach that preserves the original line order is:
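awk -F"[:,]" '
# With ":" and "," as separators, the identifier is the second-to-last field
# (assuming the trailing text contains no further ":" or ",").
/reqCompleted/ || /reqId/ {
    dupsIDs[$(NF-1)]++
}
END {
    # An identifier counted exactly twice is treated as a matched pair.
    for (x in dupsIDs)
        if (dupsIDs[x] == 2) print x
}' infile | grep -F -f - infile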
2016-09-27 GET /some/uri - reqId: 000-pat1-bgr, more text
2016-09-27 GET /some/uri - reqId: 0.215487, your favourite song
2016-09-27 reqCompleted: 0.215487, more characters
2016-09-27 reqCompleted: 000-pat1-bgr, more characters
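As a variation on the pipeline above, the same pairing can be sketched in awk alone by reading the file twice (the `NR == FNR` idiom): the first pass records which identifiers appear after "reqId" and which after "reqCompleted", and the second pass prints, in file order, only the lines whose identifier was seen on both kinds of line. The names `paired_lines`, `field_after`, and `sample.log` are hypothetical and exist only for this illustration; the sketch also assumes each identifier is followed by a comma, as in the sample text.

```shell
#!/bin/sh
# Hypothetical sample file; the contents mirror the example log from the question.
cat > sample.log <<'EOF'
2016-09-27 GET /some/uri - reqId: 000-pat1-bgr, more text
2016-09-27 GET /some/uri - reqId: 0.215487, your favourite song
2016-09-27 irrelevant message
2016-09-27 ignored record
2016-09-27 reqCompleted: 999-xxx-vvv, ignore this
2016-09-27 reqCompleted: 0.215487, more characters
2016-09-27 reqCompleted: 000-pat1-bgr, more characters
2016-09-27 another lost message
EOF

# paired_lines FILE: print lines whose id has both a reqId and a reqCompleted line.
# FILE is read twice, so it must be a regular file rather than a pipe.
paired_lines() {
    awk '
    # Return the text between "<label>: " and the next comma, or "" if absent.
    function field_after(line, label,    pos, rest) {
        pos = index(line, label ": ")
        if (pos == 0) return ""
        rest = substr(line, pos + length(label) + 2)
        sub(/,.*/, "", rest)
        return rest
    }
    {
        sid = field_after($0, "reqId")
        cid = field_after($0, "reqCompleted")
    }
    NR == FNR {
        # First pass: remember which ids were started and which were completed.
        if (sid != "") started[sid] = 1
        if (cid != "") completed[cid] = 1
        next
    }
    # Second pass: keep a line only if its id occurs on both kinds of line.
    (sid != "" && (sid in completed)) || (cid != "" && (cid in started))
    ' "$1" "$1"
}

paired_lines sample.log
```

Run against the sample text, this should print the two reqId lines followed by the two matching reqCompleted lines, leaving out the unpaired 999-xxx-vvv record. Unlike counting occurrences, it requires one line of each kind, so an identifier that merely shows up twice as reqId would not be reported.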
Answer 2
Using awk, I found a solution:
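awk 'BEGIN { FS = ":|," }
/reqCompleted/ || /reqId/ {
    # $2 is the identifier between the first ":" and the following ","
    id[NR] = $2
    lines[NR] = $0
    count[$2]++
}
END {
    # Print every stored line whose identifier was seen exactly twice
    for (key in lines)
        if (count[id[key]] == 2)
            print lines[key]
}' input.txt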
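The output is as follows (awk does not guarantee the iteration order of a for-in loop, so the lines may appear in a different order):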
2016-09-27 reqCompleted: 000-pat1-bgr, more characters
2016-09-27 GET /some/uri - reqId: 0.215487, your favourite song
2016-09-27 reqCompleted: 0.215487, more characters
2016-09-27 GET /some/uri - reqId: 000-pat1-bgr, more text
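If the input order matters and the log arrives on a pipe, one possible single-pass sketch is to buffer only the matching lines in a numerically indexed array and print them with a counting loop instead of for-in. The function name `pair_in_order` is hypothetical, and the sketch assumes the identifier runs from the label up to the next comma, as in the sample text.

```shell
#!/bin/sh
# pair_in_order: read a log on standard input and print, in input order, the
# reqId/reqCompleted lines whose identifier appears on both kinds of line.
pair_in_order() {
    awk '
    /reqId: / || /reqCompleted: / {
        kind = ($0 ~ /reqCompleted: /) ? "done" : "start"
        id = $0
        sub(/.*(reqId|reqCompleted): /, "", id)   # drop everything up to the label
        sub(/,.*/, "", id)                         # drop everything from the next comma
        n++
        line[n] = $0                               # buffer matching lines only
        ident[n] = id
        seen[kind, id] = 1
    }
    END {
        # A numeric loop keeps the original input order.
        for (i = 1; i <= n; i++)
            if ((("start", ident[i]) in seen) && (("done", ident[i]) in seen))
                print line[i]
    }'
}

pair_in_order <<'EOF'
2016-09-27 GET /some/uri - reqId: 000-pat1-bgr, more text
2016-09-27 GET /some/uri - reqId: 0.215487, your favourite song
2016-09-27 irrelevant message
2016-09-27 ignored record
2016-09-27 reqCompleted: 999-xxx-vvv, ignore this
2016-09-27 reqCompleted: 0.215487, more characters
2016-09-27 reqCompleted: 000-pat1-bgr, more characters
2016-09-27 another lost message
EOF
```

The trade-off is memory: every reqId and reqCompleted line is held until the end of input, which is fine for modest logs but may not suit very large ones.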