sed - How (not) to match unbalanced brackets

Question 1

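Personally, if my regex were approaching this level of complexity, I would switch the whole operation to Perl. This one handles an arbitrary number of opening brackets, parentheses, and braces: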

$ perl -ne '@open=/[\[({]/g; @close=/[)\]}]/g; 
             if($#close == $#open){s/(.+?)\.is/($1).is/} print' file

Or, more compactly:

$ perl -pe 's/(.+?)\.is/($1).is/ if $#{[/[\[({]/g]} == $#{[/[)\]}]/g]}' file

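Or, more thoroughly, this one counts each bracket type separately, so it handles cases like [} (but it still fails on cases like )( because it ignores order):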

  $ perl -pne '@osqb=/\[/g; @csqb=/\]/g; 
               @ocb=/\{/g; @ccb=/\}/g; 
               @op=/\(/g; @cp=/\)/g;
               if($#osqb == $#csqb && $#ocb==$#ccb && $#op == $#cp){
                    s/(.+?)\.is/($1).is/
               }' file

Run on your example, this prints:

(this_thing).is 24
(that).is 50      
(a[23]).is == 10 
(a).is true      
(this_thing).is 24
this_thing.is (24
((that).is 50
(a[23].is == 10
a.is ( true
(this_thing.is 24
a{.is true
this_thing{.is 24
a[.is true
this_thing[.is 24

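Explanation

perl -ne: process the input file line by line (-n) and run the script given by -e.
@open=/[\[({]/g;: find all opening glyphs and save the result in an array called @open.
@close=/[)\]}]/g;: as above, but for closing glyphs.
if($#close == $#open): if the number of opening glyphs equals the number of closing glyphs (in other words, if there are no dangling brackets etc.)... Note that $#array is the last index rather than the length, but two arrays have equal last indices exactly when they have equal lengths.
s/(.+?)\.is/($1).is/: ...then wrap the shortest string that precedes .is in parentheses.
The final print is outside the braces, so it is executed whether or not a substitution was made.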

Question 2

Expanding on terdon's answer, you can use Perl to actually parse nested bracket structures. Here is a regular expression that should do it:

my $balanced_parens_grammar = qr/
  (?(DEFINE)                         # Define a grammar
    (?<BALANCED_PARENS>
       \(                            # Opening paren
          (?:                        # Group without capturing
              (?&BALANCED_PARENS)    # Nested balanced parens
             |(?&BALANCED_BRACKETS)  # Nested balanced brackets
             |(?&BALANCED_CURLIES)   # Nested balanced curlies
             |[^()\[\]{}]++          # Run of non-bracket characters
           )*                        # Repeat any of the above
        \)                           # Closing paren
    )
    (?<BALANCED_BRACKETS>
       \[                            # Opening bracket
          (?:                        # Group without capturing
              (?&BALANCED_PARENS)    # Nested balanced parens
             |(?&BALANCED_BRACKETS)  # Nested balanced brackets
             |(?&BALANCED_CURLIES)   # Nested balanced curlies
             |[^()\[\]{}]++          # Run of non-bracket characters
           )*                        # Repeat any of the above
        \]                           # Closing bracket
    )
    (?<BALANCED_CURLIES>
       \{                            # Opening curly
          (?:                        # Group without capturing
              (?&BALANCED_PARENS)    # Nested balanced parens
             |(?&BALANCED_BRACKETS)  # Nested balanced brackets
             |(?&BALANCED_CURLIES)   # Nested balanced curlies
             |[^()\[\]{}]++          # Run of non-bracket characters
           )*                        # Repeat any of the above
        \}                           # Closing curly
    )
    (?<BALANCED_ANY>                 # One balanced group of any type
         (?&BALANCED_PARENS)
        |(?&BALANCED_BRACKETS)
        |(?&BALANCED_CURLIES)
    )
  )
/x;

Use it like this:

if( $line =~ m/
        ^
          (?:
              [^()\[\]{}]++      # Run of non-bracket characters
             |(?&BALANCED_ANY)   # Or one balanced group of any type
          )*
        $
        $balanced_parens_grammar/x){
    # Do your magic here
}

Disclaimer

This code is completely untested and may contain errors.
