I have a text file whose output looks like this:
file_0108.json
2023-02-22T01:15:05.531+0000 connected to: mongodb://[**REDACTED**]@localhost
2023-02-22T01:15:08.531+0000 [######..................] db.coll 64.7MB/255MB (25.4%)
2023-02-22T01:15:11.531+0000 [############............] db.coll 128MB/255MB (50.3%)
2023-02-22T01:15:14.531+0000 [##################......] db.coll 196MB/255MB (76.9%)
2023-02-22T01:15:17.286+0000 [########################] db.coll 255MB/255MB (100.0%)
2023-02-22T01:15:17.286+0000 380757 document(s) imported successfully. 0 document(s) failed to import.
The file numbers (at the start of each block) run from 0000 to 1000. Not all of the files were imported successfully. How can I find every block of text in the file that starts with the filename and ends with:
xxxxx document(s) imported successfully. 0 document(s) failed to import
and then delete them, leaving only the errors?
There can be a varying number of lines between a block's filename and the end of the block.
Some of the blocks have errors, but the errors may differ, so I think it would be easier to delete the blocks that have no errors.
Example of a block with an error:
file_0293.json
2023-02-22T01:52:15.303+0000 connected to: mongodb://[**REDACTED**]@localhost
2023-02-22T01:52:16.836+0000 Failed: error processing document #46401: invalid character ',' after object key
2023-02-22T01:52:16.836+0000 46000 document(s) imported successfully. 0 document(s) failed to import.
Answer 1
If there are no blank lines within each block, then you can use sed to insert a blank line after every line containing imported successfully, and then process the file in "paragraphs" (blocks of text separated by one or more blank lines). For example:
sed -e $'/imported successfully/a\\\n' filename |
perl -00 -n -e 'print if /Failed:/'
Also, you mentioned in a comment that your input file is generated by a bash for loop that runs echo <filename> && mongoimport. I suggest you change it to run echo <filename> && mongoimport ; echo so that future runs already have their output split into paragraphs. sed is then no longer needed to insert the blank lines, so you can just run:
perl -00 -n -e 'print if /Failed:/' filename
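For reference, the loop could then look something like the sketch below; the mongoimport options, database, collection and log file names here are placeholders, not taken from the question:
for f in file_*.json; do
    echo "$f"
    mongoimport --db db --collection coll --file "$f"
    echo    # blank line after each block, so the log is already split into paragraphs
done > import.log 2>&1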
Answer 2
I tried it with a text file containing the following output:
file_0108.json
2023-02-22T01:15:05.531+0000 connected to: mongodb://[**REDACTED**]@localhost
2023-02-22T01:15:08.531+0000 [######..................] db.coll 64.7MB/255MB (25.4%)
2023-02-22T01:15:11.531+0000 [############............] db.coll 128MB/255MB (50.3%)
2023-02-22T01:15:14.531+0000 [##################......] db.coll 196MB/255MB (76.9%)
2023-02-22T01:15:17.286+0000 [########################] db.coll 255MB/255MB (100.0%)
2023-02-22T01:15:17.286+0000 380757 document(s) imported successfully. 0 document(s) failed to import.
file_0293.json
2023-02-22T01:52:15.303+0000 connected to: mongodb://[**REDACTED**]@localhost
2023-02-22T01:52:16.836+0000 Failed: error processing document #46401: invalid character ',' after object key
2023-02-22T01:52:16.836+0000 Failed: error processing document #46427: invalid character ',' after object key
2023-02-22T01:52:16.836+0000 46000 document(s) imported successfully. 0 document(s) failed to import.
The command line below produced output on the terminal that I think is useful:
$ grep -e 'file_.*\.json' -e 'Failed:' file.txt | sed 's/json/json:/'|grep -B1 'Failed:'
file_0293.json:
2023-02-22T01:52:16.836+0000 Failed: error processing document #46401: invalid character ',' after object key
2023-02-22T01:52:16.836+0000 Failed: error processing document #46427: invalid character ',' after object key
If you like, you can redirect that to a file, appending ... > errors.txt 2>&1 so that both standard output and standard error are captured:
grep -e 'file_.*\.json' -e 'Failed:' file.txt | sed 's/json/json:/'|grep -B1 'Failed:' > errors.txt 2>&1
Answer 3
Using awk:
awk -v startblock='^file_[0-9][0-9][0-9][0-9]\\.json$' \
    -v endblock='document\\(s\\) failed to import\\.$' '
  $0 ~ startblock {
    error = 0
    s = ""
  }
  {
    s = (s == "" ? "" : s ORS) $0
  }
  $0 ~ endblock && (error || $0 !~ " 0 " endblock) {
    print s
    next
  }
  tolower($0) ~ /failed|error|invalid/ {
    error = 1
  }
' file
This prints every block that contains a case-insensitive match of failed, error or invalid between the start and the end of the block, or whose closing n document(s) failed to import. line has a non-zero n.
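As a quick sanity check (my addition, assuming the awk output above was redirected to a file such as errors.txt), counting the filename lines shows how many error blocks were kept:
grep -c '^file_[0-9][0-9][0-9][0-9]\.json$' errors.txt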
Answer 4
To do what you asked for, using any awk:
awk '
  /^file_[0-9]+\.json$/ {
    printf "%s", rec
    rec = ""
  }
  { rec = rec $0 ORS }
  /document\(s\) imported successfully\. 0 document\(s\) failed to import/ {
    rec = ""
  }
  END { printf "%s", rec }
' file
But the sample input you posted doesn't match your stated requirements. I think what you might really want is this (again using any awk):
awk '
  /^file_[0-9]+\.json$/ {            # start of a new block: print the previous one if it had an error
    if ( bad ) printf "%s", rec
    rec = bad = ""
  }
  /Failed:/ { bad = 1 }              # mark the current block as containing an error
  { rec = rec $0 ORS }
  END { if ( bad ) printf "%s", rec }
' file
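If the goal is to replace the original file so that only the error blocks remain, one option (a sketch, not part of the answer above; file and file.errors are placeholder names) is to write the filtered output to a temporary file and move it back over the original:
awk '
  /^file_[0-9]+\.json$/ {
    if ( bad ) printf "%s", rec
    rec = bad = ""
  }
  /Failed:/ { bad = 1 }
  { rec = rec $0 ORS }
  END { if ( bad ) printf "%s", rec }
' file > file.errors && mv file.errors file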