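Here is my task:
I have a live log output stream from a messaging process. Much of the output is irrelevant to me, but there are some parts I want to collect and evaluate separately. These blocks start with "---BEGIN Request---" at the end of a line of its own that begins with date/time, hostname, and process[pid]. Correspondingly, a block ends with "---END Request---" at the end of another line. Everything between those two markers is what I want to capture.
I tried processing a log excerpt file with sed, but failed. I tried to delete everything outside my focus, but I still got every line. Maybe someone can spot my mistake: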
sed -r '/---END Request---$/{
$!{ N
s/---END Request---.?\n([^:]+: )---BEGIN Request---$/---END Request---\n\1---BEGIN Request---/
t sub-hit
:sub-miss
P
D
:sub-hit
}
}' sample.log
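I think awk could be an alternative tool here, but I have not yet looked into how well it handles a live log stream.
Someone can always solve the problem in Python or another language. I am open to that, but I plan to use this on a log stream, not on a static text file.
Here is a log sample excerpt that I simplified for testing. I have anonymized it and removed some content.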
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: PUT /token/connect HTTP/2.0
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept: */*
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept-Encoding: gzip, deflate, br
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept-Language: de-de
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Cache-Control: no-cache
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Content-Length: 306
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Content-Type: text/xml
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: User-Agent: TokenHandler/3.2
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: <?xml version="1.0" encoding="UTF-8"?>
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: <!DOCTYPE and so on. Intentionally cut short here for askubuntu
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: transport=http method=PUT status=200 proto=HTTP/2.0 host=10.17.17.240 user_agent=TokenHandler/3.2 path=/token/connect
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: POST /v3/token/033aaed70bdce765ace3223a5dc5 HTTP/1.1
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Authorization: Basic bWljcm9tZG06MjVuWjdWV3BjMkZaalRkZlRNVTNzaWdyS2xwZlRsVQ==
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Length: 0
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: level=info component=tknzr method=add udid=033aaed70bdce765ace3223a5dc5 err=null took=145.419185ms
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Type: application/json; charset=utf-8
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: {
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: "status": "success",
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: "notification_id": "FC88CDE8-D3AD-4607-602F-6005E70E83E2"
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: }
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: transport=http method=POST status=200 proto=HTTP/1.1 host=10.17.17.230 user_agent= path=/v3/token/033aaed70bdce765ace3223a5dc5
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: PUT /token/connect HTTP/2.0
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept: */*
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept-Encoding: gzip, deflate, br
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept-Language: en-US,en;q=0.9
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Cache-Control: no-cache
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Length: 306
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Type: text/xml
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: User-Agent: TokenHandler/3.2
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: <?xml version="1.0" encoding="UTF-8"?>
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: <!DOCTYPE and so on. Intentionally cut short here for askubuntu
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Answer 1
On a stream (live log), you can use sed's -u (unbuffered) option.
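As a minimal sketch of how -u could be combined with a range address to keep only the request blocks (this assumes GNU sed; the function name extract_requests is made up for illustration and is not from the original post):

```shell
# Hypothetical helper: print only the lines from a line ending in
# "---BEGIN Request---" through the next line ending in
# "---END Request---" (marker lines included).
# -n suppresses the default output, so only lines hit by "p" are printed.
# -u (GNU sed) reads minimally and flushes output more often, which matters
#    when the input is a live stream and the output goes into a pipe.
extract_requests() {
    sed -u -n '/---BEGIN Request---$/,/---END Request---$/p'
}

# Possible usage on a growing file (sample.log is the file from the question):
# tail -F sample.log | extract_requests
```

The range address replaces the N/P/D logic from the question: sed prints while it is inside a BEGIN/END pair and stays silent otherwise, so Response blocks and the single-line status messages are dropped.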
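You can also append |cut -c55-10000 to the end of the command to cut off the date, hostname, etc. (in the sample log, the actual message starts at column 55).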
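The fixed column 55 only fits as long as the hostname and pid keep the same length. A possible variant, sketched here under the assumption that every line has the "process[pid]: " prefix seen in the sample, strips the prefix inside the same sed call instead (GNU sed assumed; extract_request_text is a made-up name):

```shell
# Hypothetical helper: keep only the Request blocks and remove the
# "date time host process[pid]: " prefix from every printed line.
# The substitution deletes everything up to and including the first "]"
# plus the following ": ", so it does not depend on a fixed column width.
extract_request_text() {
    sed -u -n -E '/---BEGIN Request---$/,/---END Request---$/{
        s/^[^]]*\]: ?//
        p
    }'
}

# Possible usage on a growing file (sample.log is the file from the question):
# tail -F sample.log | extract_request_text
```

The range addresses are tested against the unmodified line before the substitution runs, so removing the prefix does not interfere with detecting the BEGIN and END markers.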