Print all lines before pattern2 from the closest match of pattern1

Question 1

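Regarding "keeping a pattern space comprising all text since the last match "Compiling", but there can be thousands of lines without an error. Would it be very inefficient?": it is unlikely to be less efficient than any alternative, such as making two passes over the input file to identify the matching delimiter pairs before starting to print, and it has the advantage of working whether the input is stored in a file or comes from a pipe.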
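For comparison, here is a rough sketch of what the two-pass alternative mentioned above might look like. It is not taken from the question or from any answer; the function name `two_pass`, the array names, and the sample log are all hypothetical. Because awk has to read the input twice, this only works on a real file, not on a pipe.

```shell
# Hypothetical sample log; the contents are illustrative only.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
Compiling File1
... commands ...
Compiling File2
... commands ...
error: could not find A
Compiling File3
... commands ...
EOF

# Pass 1 (NR == FNR): remember where the current block starts and, for each
# error line, record the line range [beg, fin] of the block that contains it.
# Pass 2: print only the lines that fall inside a recorded range.
two_pass() {
    awk '
        NR == FNR {
            if (/^Compiling/) start = FNR
            if (/^error:/)    { n++; beg[n] = start; fin[n] = FNR }
            next
        }
        FNR == 1 { i = 1 }
        i <= n && FNR >= beg[i] { print }
        i <= n && FNR == fin[i] { print "---separator---"; i++ }
    ' "$1" "$1"
}

two_pass "$log"
```

On the sample log this prints only the File2 block followed by the separator line. It keeps nothing but line numbers in memory, at the cost of reading the file twice.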

If you're on a system that has tac, then the most efficient approach is probably to use 2 calls to tac with awk in the middle:

$ tac file |
    awk '/^error:/{f=1; print "---separator---"} f; /^Compiling/{f=0}' |
        tac
Compiling File3
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find A
---separator---
Compiling File5
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find B
---separator---

Otherwise, using any awk in any shell on every Unix box:

$ awk '
    /^Compiling/ { buf="" }
    { buf = buf $0 "\n" }
    /^error:/ { print buf "---separator---" }
' file
Compiling File3
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find A
---separator---
Compiling File5
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find B
---separator---
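To illustrate the point that this approach does not care whether the input is a file or a pipe, the sketch below wraps the same buffering logic in a function that reads standard input. The function name `errors_only` and the `printf` stand-in for a build command are assumptions made for the example.

```shell
# Same buffering logic as the script above, reading from stdin.
errors_only() {
    awk '
        /^Compiling/ { buf = "" }                  # new block: drop the old buffer
        { buf = buf $0 "\n" }                      # append the current line
        /^error:/ { print buf "---separator---" }  # buf already ends in a newline
    '
}

# Hypothetical stand-in for a build command writing to a pipe.
printf '%s\n' 'Compiling File1' '... commands ...' \
              'Compiling File2' 'error: could not find A' | errors_only
```

At any moment the buffer holds only the lines seen since the most recent Compiling line, so memory use is bounded by the largest single block rather than by the whole input.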

Alternatively, using GNU awk for multi-char RS and RT:

$ awk -v RS='\nerror:[^\n]+' -v ORS='\n---separator---\n' '
    sub(/(^|.*\n)Compiling/,"Compiling") { print $0 RT }
' file
Compiling File3
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find A
---separator---
Compiling File5
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find B
---separator---

Question 2

With perl it's pretty simple, since it has a paragraph mode, -00:

perl -00 -ne 'print if /\nerror:/' file

Output:

Compiling File3
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find A

Compiling File5
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find B

If you append | sed 's/^$/----separator----/', you can also add your own separator in place of the blank lines, as desired.
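The same paragraph-mode idea is also available in POSIX awk, where an empty RS splits the input on blank lines. The sketch below is an awk substitute for the perl command, not part of the original answer; the function name `paragraphs_with_errors` and the sample input are hypothetical. Like the perl version, it assumes the blocks are separated by blank lines.

```shell
paragraphs_with_errors() {
    # An empty RS switches awk to paragraph mode (records are separated by
    # blank lines); ORS writes the separator after every printed record.
    awk -v RS= -v ORS='\n---separator---\n' '/\nerror:/'
}

printf '%s\n' 'Compiling File1' '... commands ...' '' \
              'Compiling File3' '... commands ...' 'error: could not find A' '' \
              'Compiling File5' 'error: could not find B' | paragraphs_with_errors
```

Because the separator comes from ORS, it follows every printed block, including the last one, and no extra sed step is needed.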

Question 3

Using Raku (formerly known as Perl_6)

raku -e 'my @array; for slurp.split("\n\n") {@array.push($_)}; for @array {.put if /^Compiling .* \n error/};'

Sample input:

Compiling File1
... commands ...

Compiling File2
... commands ...

Compiling File3
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find A

Compiling File4
... commands ...

Compiling File5
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find B

Sample output (1):

Compiling File3
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find A
Compiling File5
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find B

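Briefly, the "Compiling..." sections are broken on the \n\n separator, and each element is pushed onto @array (via the $_ "topic" variable). The resulting @array elements are printed only if they start with Compiling ... and have a later line starting with error ... (the last line, in the sample).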

It's unclear why the OP is asking for a ---separator--- line (since the starting and ending lines are both clearly specified), but it's easy enough to add:

raku -e 'my @array; for slurp.split("\n\n") {@array.push($_)}; for @array {put($_,"\n---separator---") if /^Compiling .* \n error/};'

Sample output (2):

Compiling File3
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find A
---separator---
Compiling File5
... commands ...
In file included from ...
In file included from ...
In file included from ...
error: could not find B
---separator---

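Addendum: the OP mentions in the comments that memory efficiency is key. In Raku, the lines routine is lazy, so here is a rough approach (for now, each "Compiling ... error" block is returned on a single line):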

raku -e 'for lines.split( "Compiling ") {say "ERROR Compiling "~$_ if m/error/};'

or

raku -e 'say "ERROR Compiling $_" if m/error/ for lines.split( "Compiling ");'

Sample output (3):

ERROR Compiling File3 ... commands ... In file included from ... In file included from ... In file included from ... error: could not find A  
ERROR Compiling File5 ... commands ... In file included from ... In file included from ... In file included from ... error: could not find B

https://speakerdeck.com/util/reading-files-cant-be-this-simple
https://raku.org
