Filtering log output blocks from a log stream

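This is my task:

I have a live log output stream from a messaging process. Much of the output is irrelevant to me, but there are some parts that I want to collect and evaluate separately. These blocks start with "---BEGIN Request---" at the end of a separate line that begins with date/time, hostname, and process[pid]. Correspondingly, a block ends with "---END Request---" at the end of another line. What lies between the two is what I want to capture.

I tried to process a log excerpt file with sed, but failed. I tried to delete everything outside my focus, but I still got every line. Maybe someone can spot my mistake: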

sed -r '/---END Request---$/{
   $!{ N 
     s/---END Request---.?\n([^:]+: )---BEGIN Request---$/---END Request---\n\1---BEGIN Request---/
     t sub-hit
     :sub-miss
     P          
     D          
     :sub-hit
   }    
 }' sample.log

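I think awk could be an alternative tool here, but I have not yet looked into how it performs on a live log stream.

Someone can always solve a problem with Python or another language. I am open to that, but I plan to use this on a log stream, not on static text files.

Here is an excerpt of my log sample, simplified for testing. I have anonymized it and removed some content.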
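
For what it's worth, the awk idea could be sketched roughly as follows. This is only a minimal sketch, not a tested solution for the real stream: the function name filter_requests_awk is made up for illustration, and it assumes an awk that supports fflush() (gawk and mawk do), which is called after every line so output is not held back in a buffer.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): read log lines on stdin and
# print only the lines from "---BEGIN Request---" through "---END Request---".
filter_requests_awk() {
    awk '
        /---BEGIN Request---$/ { inside = 1 }   # a BEGIN marker opens a block
        inside { print; fflush() }              # print and flush each line of the block
        /---END Request---$/   { inside = 0 }   # an END marker closes the block
    '
}
```

It could be tried on the excerpt with `filter_requests_awk < sample.log`, or on a growing file with `tail -f sample.log | filter_requests_awk`.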

Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: PUT /token/connect HTTP/2.0
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept: */*
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept-Encoding: gzip, deflate, br
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept-Language: de-de
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Cache-Control: no-cache
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Content-Length: 306
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Content-Type: text/xml
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: User-Agent: TokenHandler/3.2
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: <?xml version="1.0" encoding="UTF-8"?>
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: <!DOCTYPE and so on. Intentionally cut short here for askubuntu
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: transport=http method=PUT status=200 proto=HTTP/2.0 host=10.17.17.240 user_agent=TokenHandler/3.2 path=/token/connect
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: POST /v3/token/033aaed70bdce765ace3223a5dc5 HTTP/1.1
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Authorization: Basic bWljcm9tZG06MjVuWjdWV3BjMkZaalRkZlRNVTNzaWdyS2xwZlRsVQ==
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Length: 0
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: level=info component=tknzr method=add udid=033aaed70bdce765ace3223a5dc5 err=null took=145.419185ms
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Type: application/json; charset=utf-8
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: {
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]:   "status": "success",
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]:   "notification_id": "FC88CDE8-D3AD-4607-602F-6005E70E83E2"
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: }
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: transport=http method=POST status=200 proto=HTTP/1.1 host=10.17.17.230 user_agent= path=/v3/token/033aaed70bdce765ace3223a5dc5
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: PUT /token/connect HTTP/2.0
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept: */*
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept-Encoding: gzip, deflate, br
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept-Language: en-US,en;q=0.9
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Cache-Control: no-cache
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Length: 306
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Type: text/xml
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: User-Agent: TokenHandler/3.2
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: <?xml version="1.0" encoding="UTF-8"?>
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: <!DOCTYPE and so on. Intentionally cut short here for askubuntu
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Response---

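Answer 1

On a stream (a live log), you can use sed's -u (unbuffered) option.

You can also append |cut -c55-10000 to the end of the command to cut off the date, hostname, and so on.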
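
As a minimal sketch of what that could look like, a plain range address is combined with -u and -n below. This is not the asker's script: the function name filter_requests_sed is made up for illustration, and GNU sed is assumed, since -u is not available in every sed.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): keep only the Request blocks
# from a log stream read on stdin. Assumes GNU sed for the -u option.
filter_requests_sed() {
    # -n turns off automatic printing; the range address then prints every
    # line from one ending in the BEGIN marker through one ending in the END marker.
    sed -u -n '/---BEGIN Request---$/,/---END Request---$/p'
}
```

On the excerpt this would be run as `filter_requests_sed < sample.log`. For a live source, the stream would be piped into the function instead.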
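
A small sketch of the prefix stripping, under a few assumptions: column 55 matches the sample above, where the "date host process[pid]: " prefix is 54 characters long, but the offset would shift if the hostname or pid had a different length. The second function is a possible alternative that removes everything up to the first "]: " instead of counting columns. Both function names are made up for illustration. Note also that cut may buffer its output when writing to a pipe, so on a live stream lines could show up with some delay.

```shell
#!/bin/sh
# Hypothetical helpers (names are assumptions) that strip the
# "date host process[pid]: " prefix from lines read on stdin.

# Fixed-column variant: keeps everything from column 55 onward,
# which fits the 54-character prefix in the sample log.
strip_prefix_cut() {
    cut -c55-
}

# Pattern variant: deletes everything up to and including the first "]: ",
# so it does not depend on the prefix length. Assumes GNU sed for -u.
strip_prefix_sed() {
    sed -u 's/^[^]]*]: //'
}
```

Either one would go at the end of the pipeline, for example `sed -u -n '/---BEGIN Request---$/,/---END Request---$/p' sample.log | strip_prefix_sed`.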
