Grabbing blocks of a log file that match multiple conditions when the conditions are not on the same line


Code:

grep -rI --exclude=*_*.log -B20 -A1 "Status:   Error" > /var/log/alertError.log

Sample input log:

[06/07/20 20:38:53.911]:loopback ST:                  token-src-name() 

[06/07/20 20:38:53.914]:loopback ST:                    Token Value: "DVADER". 

[06/07/20 20:38:53.916]:loopback ST:                  token-text(",OU=users,O=data") 

[06/07/20 20:38:53.919]:loopback ST:    Arg Value: "CN=DVADER,OU=users,O=data". 

[06/07/20 20:38:53.922]:loopback ST:                description("Removed by Termination Process") 

[06/07/20 20:38:53.926]:loopback ST:             token-text("Removed by Termination Process") 

[06/07/20 20:38:53.929]:loopback ST:                  Arg Value: "Removed by Termination Process". 

[06/07/20 20:38:53.943]:loopback ST: DirXML Log Event -------------------

     Driver:   \StarWars\system\Driver Set\User Processor

     Channel:  Subscriber

     Status:   Error

     Message:  Code(-9217) Error in

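This input log shows part of the log I want to capture. However, I also want to match on the date of the entries I am searching for. I use grep because it lets me search recursively through all the log files in the directory structure. My grep returns all the data I want, but I now want to exclude the older blocks. The problem is that the date is not on every line. grep returns a block of lines thanks to the -B and -A switches, which I need: if you look at the whole block, the Message and Status lines belong together with the several operations that precede them. I can use this grep command to get any kind of Message or Status value, but I don't know how to narrow the results down to only the blocks that fall within a given date range.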

Answer 1

Here is a Python solution:

with open("log.txt") as f:                  # open the file log.txt
    lines = f.readlines()                   # load all lines

newlines = []                               # list for the result lines
for l in lines:                             # iterate over the lines
    l = l.rstrip()                          # cut '\n' and trailing spaces from the right
    if not l:                               # skip empty lines
        continue
    if l.startswith(" ") and newlines:      # indented lines continue the previous entry
        newlines[-1] += l                   # append them to the last "normal" log line
    else:                                   # these are "normal" (timestamped) lines
        newlines.append(l)                  # add them to the list

print("\n".join(newlines))                  # create the output and print it to stdout

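In this output the "Status" and the date end up on the same line, so you can grep for both at once.

Save the script (e.g. as normalize.py) next to your log file (e.g. log.txt) and run python3 normalize.py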
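Once the continuation lines are joined, picking out the entries for one date is a plain string test. The following is a minimal sketch of that idea, not part of the original answer: the function names `normalize` and `select` are made up for illustration, and the `06/08/20` entry in the sample is a hypothetical line added only to show that other dates are dropped.

```python
import re

def normalize(lines):
    """Join indented continuation lines onto the preceding timestamped line."""
    records = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(" ") and records:
            records[-1] += line           # continuation of the previous entry
        else:
            records.append(line)          # a new timestamped entry
    return records

def select(records, date, status="Error"):
    """Keep entries that start with '[<date> ' and contain 'Status: <status>'."""
    status_re = re.compile(r"Status:\s+" + re.escape(status))
    return [r for r in records
            if r.startswith("[" + date + " ") and status_re.search(r)]

sample = [
    '[06/07/20 20:38:53.929]:loopback ST:                  Arg Value: "Removed by Termination Process". \n',
    '\n',
    '[06/07/20 20:38:53.943]:loopback ST: DirXML Log Event -------------------\n',
    '     Driver:   \\StarWars\\system\\Driver Set\\User Processor\n',
    '     Status:   Error\n',
    # Hypothetical entry from another day, not taken from the question's log:
    '[06/08/20 01:00:00.000]:loopback ST: DirXML Log Event -------------------\n',
    '     Status:   Error\n',
]

matches = select(normalize(sample), "06/07/20")
for m in matches:
    print(m)
```

The date test is anchored to the start of the line on purpose, so a date that happens to appear inside a message does not produce a false match.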

Answer 2

Your question isn't clear, but is this what you're trying to do?

$ awk -v tgt='06/07/20' '
    /^\[/ { prt() }
    NF { rec = rec $0 ORS }
    END { prt() }

    function prt() {
        if ( index(rec,tgt) == 2 ) {
            printf "%s", rec
        }
        rec = ""
    }
' file
[06/07/20 20:38:53.911]:loopback ST:                  token-src-name()
[06/07/20 20:38:53.914]:loopback ST:                    Token Value: "DVADER".
[06/07/20 20:38:53.916]:loopback ST:                  token-text(",OU=users,O=data")
[06/07/20 20:38:53.919]:loopback ST:    Arg Value: "CN=DVADER,OU=users,O=data".
[06/07/20 20:38:53.922]:loopback ST:                description("Removed by Termination Process")
[06/07/20 20:38:53.926]:loopback ST:             token-text("Removed by Termination Process")
[06/07/20 20:38:53.929]:loopback ST:                  Arg Value: "Removed by Termination Process".
[06/07/20 20:38:53.943]:loopback ST: DirXML Log Event -------------------
     Driver:   \StarWars\system\Driver Set\User Processor
     Channel:  Subscriber
     Status:   Error
     Message:  Code(-9217) Error in

Or maybe this?

$ awk -v tgt='06/07/20' '
    /^\[/ { prt() }
    NF { rec = rec $0 ORS }
    END { prt() }

    function prt() {
        if ( (index(rec,tgt) == 2) && (rec ~ /Status:[[:space:]]+Error/) ) {
            printf "%s", rec
        }
        rec = ""
    }
' file
[06/07/20 20:38:53.943]:loopback ST: DirXML Log Event -------------------
     Driver:   \StarWars\system\Driver Set\User Processor
     Channel:  Subscriber
     Status:   Error
     Message:  Code(-9217) Error in
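The awk scripts above match a single date, while the question asks about a date range. As a rough sketch of how the same record grouping could be extended to a range, here is a Python version. It assumes the timestamps are in MM/DD/YY order (the log does not say whether `06/07/20` is June 7 or July 6), and the names `records`, `record_date`, `errors_in_range` and `day`, as well as the `06/06/20` and `06/09/20` sample entries, are hypothetical.

```python
import re
from datetime import datetime

STATUS_RE = re.compile(r"Status:\s+Error")

def records(lines):
    """Yield records: a line starting with '[' plus the non-blank lines after it."""
    rec = []
    for line in lines:
        if line.startswith("[") and rec:
            yield rec                     # a new timestamp closes the previous record
            rec = []
        if line.strip():
            rec.append(line.rstrip("\n"))
    if rec:
        yield rec

def record_date(rec, fmt="%m/%d/%y"):
    """Parse the date right after the leading '['; return None if it does not parse."""
    try:
        return datetime.strptime(rec[0][1:9], fmt).date()
    except ValueError:
        return None

def errors_in_range(lines, start, end):
    """Return the records dated start..end (inclusive) that contain 'Status: Error'."""
    hits = []
    for rec in records(lines):
        d = record_date(rec)
        if d is not None and start <= d <= end and any(STATUS_RE.search(l) for l in rec):
            hits.append(rec)
    return hits

def day(text):
    """Helper: turn 'MM/DD/YY' into a date object (assumed format)."""
    return datetime.strptime(text, "%m/%d/%y").date()

# The 06/06/20 and 06/09/20 entries are invented to exercise the range check.
sample = """\
[06/06/20 10:00:00.000]:loopback ST: DirXML Log Event -------------------
     Status:   Error

[06/07/20 20:38:53.929]:loopback ST:                  Arg Value: "Removed by Termination Process".

[06/07/20 20:38:53.943]:loopback ST: DirXML Log Event -------------------

     Driver:   \\StarWars\\system\\Driver Set\\User Processor

     Status:   Error

[06/09/20 08:00:00.000]:loopback ST: DirXML Log Event -------------------
     Status:   Error
""".splitlines()

hits = errors_in_range(sample, day("06/07/20"), day("06/08/20"))
for rec in hits:
    print("\n".join(rec))
```

Comparing parsed dates rather than strings matters here: a string comparison on MM/DD/YY would sort incorrectly across year boundaries.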

You can easily call awk from find, e.g.:

find . -type f -name '*.log' -exec awk '....' {} +
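If you would rather stay in Python for the whole pipeline, the file selection can be sketched with `pathlib`. This is only an approximation of the original grep's `-r --exclude=*_*.log` behaviour under the assumption that the logs end in `.log`; it does not replicate `-I` (skipping binary files). The function name `log_files` and the file names in the demo are hypothetical.

```python
import tempfile
from fnmatch import fnmatch
from pathlib import Path

def log_files(root, include="*.log", exclude="*_*.log"):
    """Return files under root whose name matches include but not exclude."""
    return sorted(
        p for p in Path(root).rglob(include)
        if p.is_file() and not fnmatch(p.name, exclude)
    )

# Demo on a throwaway directory tree (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for name in ("trace.log", "trace_1.log", "sub/driver.log"):
        (root / name).write_text("[06/07/20 20:38:53.943]:loopback ST: demo\n")
    found = [p.relative_to(root).as_posix() for p in log_files(root)]

print(found)
```

Each path returned by `log_files` can then be opened and fed line by line to whatever record filter you use.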
