Print the content between the first matching braces

Question 1

A simple brute-force approach that will work using any awk in any shell on every Unix box:

$ cat tst.awk
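# On a line containing START, drop everything before START and begin collecting.
s=index($0,"START") { $0=substr($0,s); f=1 }
# Once collecting, append every line (plus the record separator) to one string.
f { rec = rec $0 RS }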
END {
    len = length(rec)
    for (i=1; i<=len; i++) {
        char = substr(rec,i,1)
        if ( char == "{" ) {
            ++cnt
        }
        else if ( char == "}" ) {
            if ( --cnt == 0 ) {
                print substr(rec,1,i)
                exit
            }
        }
    }
}

$ awk -f tst.awk file
START{
    some text

    {
      more text}
almost there
}
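
To try this end to end, here is a minimal sketch that builds a sample input and runs the same brace-counting logic inline. The input file is an assumption (the post does not show it); it is made up so that it contains text before START and a stray closing brace after the block.

```shell
# Hypothetical sample input; the original post does not show the file's contents.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file" <<'EOF'
junk before
foo START{
    some text

    {
      more text}
almost there
}
trailing } text
EOF

# Same logic as tst.awk above, inlined so the sketch is self-contained.
result=$(awk '
s=index($0,"START") { $0=substr($0,s); f=1 }
f { rec = rec $0 RS }
END {
    len = length(rec)
    for (i=1; i<=len; i++) {
        char = substr(rec,i,1)
        if ( char == "{" ) {
            ++cnt                      # one level deeper
        }
        else if ( char == "}" ) {
            if ( --cnt == 0 ) {        # back at depth 0: the block is closed
                print substr(rec,1,i)
                exit
            }
        }
    }
}' "$tmpdir/file")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With that input, the text before START and the trailing line with the extra } are both left out, because the scan stops as soon as the brace depth returns to zero.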

Question 2

With pcregrep:

start_word='START'
pcregrep -Mo "(?s)\Q$start_word\E\h*(\{(?:[^{}]++|(?1))*+\})" < your-file

With zsh builtins:

set -o rematchpcre
start_word='START'
[[ $(<your-file) =~ "(?s)\Q$start_word\E\h*(\{(?:[^{}]++|(?1))*+\})" ]] &&
  print -r -- $MATCH

Both use PCRE's recursive regexp feature: in the patterns above, (?1) recalls the regexp inside the first (...) pair.

If you have neither pcregrep nor zsh, you can always fall back on the real thing (perl, the P in PCRE):

perl -l -0777 -sne '
    print $& if /\Q$start_word\E\h*(\{(?:[^{}]++|(?1))*+\})/s
  ' -- -start_word='START' < your-file

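(Note that every variant except the perl one assumes that $start_word does not contain \E: there the shell expands the variable into the pattern text, so a \E inside it would end the \Q...\E quoting early.)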

Answer