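I have a huge file containing the logs of some process. The log contains lines of the form "REQUEST (always) / RESPONSE (sometimes)", but the RESPONSE is not necessarily the line right after its REQUEST. A REQUEST header may appear several times before the RESPONSE shows up. I want to join each request with its response (if one exists) and print the resulting line. This is what I have tried so far, but the output is missing some lines: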
awk 'BEGIN {filename = "log1.etb"}
{line_num++; print "FNR: " FNR " NR: " NR " Counter: " line_num;
if ($0 ~ /REQUEST.*RPCLIB/)
{seqid = $0; sub(/^.*@SeqID/,"SeqID",seqid);
line_req = $0; line_resp = ""; ref_resp = 0;
ref_req = line_num; tot_req++;
print "REQUEST: " $0;
for(i=1;i<=line_num+99999;i++1) {getline < "log.etb"; if ($0 ~ /RESPONSE/ && $0 ~ seqid) {ref_resp = +i; line_resp = $0; break;}};
print "FNR: " FNR " NR: " NR " REQUEST: " ref_req " RESPONSE: " ref_resp " " seqid;
print line_req"+"line_resp > filename;
FNR = line_num-1; NR = FNR;
}
}
END {print "Total REQUEST: " tot_req}
' ../EXX/log.etb
Input:
REQUEST 2019-01-16 00:32:07.809@{fields}@SeqID = 9517
RESPONSE 2019-01-16 00:32:07.809@{fields}@SeqID = 9517 , Partner SeqID = 3393
REQUEST 2019-01-16 00:32:07.809@{fields}@SeqID = 9515
REQUEST 2019-01-16 00:32:07.810@{fields}@SeqID = 9520
RESPONSE 2019-01-16 00:32:07.810@{fields}@SeqID = 9520 , Partner SeqID = 3395
Desired output:
REQUEST 2019-01-16 00:32:07.809@{fields}@SeqID = 9517+W02/RESPONSE 2019-01-16 00:32:07.809@{fields}@SeqID = 9517 , Partner SeqID = 3393
REQUEST 2019-01-16 00:32:07.809@{fields}@SeqID = 9515+
REQUEST 2019-01-16 00:32:07.810@{fields}@SeqID = 9520+W02/RESPONSE 2019-01-16 00:32:07.810@{fields}@SeqID = 9520 , Partner SeqID = 3395
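The SeqID number links a request to its response, but it may reappear in the log at some point. Also, a REQUEST can occur several times before its RESPONSE, and the RESPONSE may or may not occur.
Answer 1
I can't comment, so I'm posting this comment as an answer; sorry about that.
You want to join a REQUEST and a RESPONSE if their SeqIDs match, right? Why not sort the data by SeqID first? That would ensure that a response always follows its request.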
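The sorting idea can be sketched as a decorate, sort, undecorate pipeline. This is only an illustration: `sort_by_seqid` is a made-up name, and it assumes the key is the first `@SeqID = N` on each line. Because the question says a SeqID may reappear later in the log, sorting can place an old request next to an unrelated later response, so it is probably safe only when SeqIDs are unique within the file.

```shell
#!/bin/bash
# sort_by_seqid is a hypothetical helper: it groups log lines by SeqID while
# keeping the original file order inside each group.
sort_by_seqid() {
    # Decorate: prefix each line with its SeqID and a tab.
    # Only "@SeqID = N" is matched, so "Partner SeqID" is ignored.
    # Lines without a SeqID are dropped.
    awk 'match($0, /@SeqID *= *[0-9]+/) {
             id = substr($0, RSTART, RLENGTH)
             sub(/^[^0-9]*/, "", id)
             print id "\t" $0
         }' "$@" |
    sort -s -t $'\t' -k1,1n |   # stable numeric sort on the key only
    cut -f2-                    # undecorate: drop the key again
}
```

It reads stdin or the files given as arguments, for example `sort_by_seqid logfile`.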
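Answer 2
I can't help you with awk, but I made this Bash script that does the job.
It requires at least Bash v4, but that should be fairly common.
It takes its input from stdin, which means you invoke it as: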
cat logfile | script.sh
or also:
script.sh < logfile
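I did this on purpose, thinking it might be desirable, but it is easy to embed the file name in the script: just add it to the cat -n command, before the |.
It handles:
- missing responses
- repeated requests, joining the response to the latest request found with that id
- the SeqID field following an @ sign (see the regex in the first sed command)
- the literal regexes REQUEST and RESPONSE as the criterion for telling records apart (see the if-elif-else-fi block)
- joining each matching req/resp pair with +
HTH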
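#!/bin/bash
# Pending requests, keyed by SeqID (associative arrays need Bash 4+).
declare -A reqs=()
{
    while IFS= read -r line ; do
        # The first word is the SeqID that sed put in front of the numbered line.
        read -r seqid rest <<<"${line}"
        line="${line#"${seqid}" }"
        if [[ "${line}" =~ REQUEST ]] ; then
            # A repeated request flushes the older one without a response.
            [ "${reqs[$seqid]}" ] && printf '%s+\n' "${reqs[$seqid]}"
            reqs[$seqid]="${line}"
        elif [[ "${line}" =~ RESPONSE ]] && [ "${reqs[$seqid]}" ] ; then
            printf '%s+%s\n' "${reqs[$seqid]}" "${line}"
            unset "reqs[$seqid]"
        else
            printf 'strange record at line no. %s\n' "${line}" >&2
        fi
    done < <(cat -n | sed -e 's/\(.*@\)SeqID *= *\([0-9]\+\)\(.*\)/\2 &/')
    # Requests that never got a response (skipped when there are none,
    # otherwise printf would emit a lone "+").
    (( ${#reqs[@]} )) && printf '%s+\n' "${reqs[@]}"
# sort -n orders by the cat -n line number regardless of locale or padding;
# the final sed strips those line numbers again.
} | sort -n | sed -e 's/\(^\|+\)[[:blank:]]\+[0-9]\+[[:blank:]]\+/\1/g'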
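To see what the first stage of that pipeline feeds into the `while` loop, the tagging step can be run on its own. This is a small sketch: the function name `tag_with_seqid` is made up for illustration, and the `\+` in the pattern assumes GNU sed, as the script above already does.

```shell
#!/bin/bash
# tag_with_seqid is a hypothetical wrapper around the first stage of the
# script above: cat -n numbers every line, then sed copies the number that
# follows "@SeqID =" to the front of the line.
tag_with_seqid() {
    cat -n | sed -e 's/\(.*@\)SeqID *= *\([0-9]\+\)\(.*\)/\2 &/'
}
```

Running `tag_with_seqid < logfile | head` shows each record as the SeqID, then the line number added by `cat -n`, then the original text. Because the pattern requires an `@` directly before `SeqID`, the `Partner SeqID` value on RESPONSE lines is not picked up as the key.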
Answer 3
I tried the approach below and it worked fine:
for i in `cat k.txt| awk -F "=" '{print $2}'| awk -F "," '{print $1}'| sed -r "s/\s+//g"| sort| uniq`; do sed -n '/'$i'/p' k.txt|sed '1s/$/\+/g'| sed "N;s/\n/W02\//g";done
Output:
REQUEST 2019-01-16 00:32:07.809@{fields}@SeqID = 9515+
REQUEST 2019-01-16 00:32:07.809@{fields}@SeqID = 9517+W02/RESPONSE 2019-01-16 00:32:07.809@{fields}@SeqID = 9517 , Partner SeqID = 3393
REQUEST 2019-01-16 00:32:07.810@{fields}@SeqID = 9520+W02/RESPONSE 2019-01-16 00:32:07.810@{fields}@SeqID = 9520 , Partner SeqID = 3395
Answer 4
I tried the following code, but it takes too long on large files:
awk '{
if ($0 ~ /REQUEST/ && $0 ~ /RPCLIB/)
{seqid = $0; sub(/^.*@SeqID/,"SeqID",seqid);
line_req = $0; line_resp = "";
rng_s = NR; rng_e = NR + 99999;
cmd = "awk '\''/RESPONSE.*" seqid "/ && NR >= " rng_s " && NR <= " rng_e " {print $0;exit}'\'' logfile"
cmd | getline line_resp;
close(cmd);
print line_req"+"line_resp;
}
}
' logfile
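The code above starts a new awk process and rescans logfile for every REQUEST, which is presumably where the time goes. A single-pass alternative might look like the sketch below. It is not taken from any of the answers: `join_req_resp` and `seqid_of` are made-up names, records are told apart by the literal strings REQUEST and RESPONSE, the key is the first `@SeqID = N` on the line, and the two halves are joined with a plain `+`. Only requests still waiting for a response are kept in memory; `sort` then restores request order.

```shell
#!/bin/bash
# join_req_resp is a hypothetical helper name. It reads the log once,
# remembers only the requests still waiting for a response, and restores
# request order afterwards with sort.
join_req_resp() {
    awk '
    # Return the number after the first "@SeqID =" on the line, or "".
    function seqid_of(s,    id) {
        if (!match(s, /@SeqID *= *[0-9]+/)) return ""
        id = substr(s, RSTART, RLENGTH)
        sub(/^[^0-9]*/, "", id)
        return id
    }
    /REQUEST/ {
        id = seqid_of($0)
        if (id == "") next
        # Same SeqID again before any RESPONSE: emit the older request alone.
        if (id in pending) print at[id] "\t" pending[id] "+"
        pending[id] = $0
        at[id] = NR           # line number of the request, used as sort key
        next
    }
    /RESPONSE/ {
        id = seqid_of($0)
        if (id != "" && (id in pending)) {
            print at[id] "\t" pending[id] "+" $0
            delete pending[id]
        }
        next
    }
    END {
        # Requests that never received a response.
        for (id in pending) print at[id] "\t" pending[id] "+"
    }' "$@" |
    sort -t $'\t' -k1,1n |    # back into request order
    cut -f2-                  # drop the line-number key
}
```

With the file names from the question it could be called as `join_req_resp ../EXX/log.etb > log1.etb`. A RESPONSE whose SeqID has no pending request is silently skipped in this sketch.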