Lazy regex using Bash

Answer 1

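There is a reason the internet is full of alternative approaches. I really can't imagine a situation in which you would be forced to use bash for this. Why not use one of the tools designed for the job?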

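Anyway, as far as I know, there is no way to do non-greedy matching with the =~ operator. That is because bash does not use a regex engine of its own for it; it uses your system's C library implementation, as described in man 3 regex, and POSIX extended regular expressions have no lazy quantifiers. This is explained in man bash: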

   An additional binary operator, =~, is available, with the same
   precedence as == and !=.  When it is used, the string to the right of
   the operator is considered an extended regular expression and matched
   accordingly (as in regex(3)).
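To make the problem concrete, here is a minimal sketch of what a greedy capture does with the sample string used in this answer. The variable names `greedy_regex` and `greedy_capture` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sample input taken from this answer.
string='<span class="circle"> </span>foo</span></span>'

# (.+) is greedy: it grabs as much as it can while still letting
# the trailing </span> match, so it swallows the first </span> too.
greedy_regex='<span class="circle"> </span>(.+)</span>'

if [[ $string =~ $greedy_regex ]]; then
    greedy_capture=${BASH_REMATCH[1]}
    printf 'captured: %s\n' "$greedy_capture"
fi
```

This should print `captured: foo</span>` rather than `captured: foo`, which is exactly the behaviour a lazy quantifier would normally be used to avoid.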

However, you can more or less do what you want (bearing in mind that this really is not a good way to parse HTML files) with a slightly different regex:

string='<span class="circle"> </span>foo</span></span>'
# [^<]+ cannot cross a "<", so the capture stops before the first </span>
regex='<span class="circle"> </span>([^<]+)</span>'
[[ $string =~ $regex ]]
echo "${BASH_REMATCH[1]}"

The above returns foo, as expected.
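If the text between the tags could itself contain a `<`, the negated character class no longer works. One possible alternative, sketched here as an assumption rather than as part of the original answer, is to skip regexes and use bash parameter expansion, whose shortest-match (`#`) and longest-match (`%%`) operators can imitate a lazy match. The names `prefix`, `suffix`, `rest` and `lazy_capture` are hypothetical.

```shell
#!/usr/bin/env bash
# Sample input taken from this answer.
string='<span class="circle"> </span>foo</span></span>'
prefix='<span class="circle"> </span>'
suffix='</span>'

# Strip the shortest leading part that ends with the prefix.
# Quoting "$prefix" makes bash treat it literally, not as a glob.
# Note: if the prefix is absent, rest is simply the whole string.
rest=${string#*"$prefix"}

# Strip the longest trailing part that starts with the suffix,
# which leaves only the text before the FIRST </span>.
lazy_capture=${rest%%"$suffix"*}

printf '%s\n' "$lazy_capture"
```

With the sample string this should print `foo`. It is still string slicing, not HTML parsing, so the same caveat about using a proper tool applies.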

Answer 2

I don't know whether bash's regex can match non-greedily the way Perl's does, so use the Perl regex engine (GNU grep's -P option) instead:

$ grep -oP '<span class="circle"> </span>\K.+?(?=</span>)' <<<"$string"
foo
