Printing a backreference from a regular expression

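Answer 1

I don't know sed well enough to answer, but if you're flexible and can use grep:

grep --only-matching "complex_regex" file

or

grep -o "complex_regex" file

The --only-matching flag (-o for short) tells grep to print only the matched part of each line instead of the whole line.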
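
As a rough illustration of what -o does, here is a small sketch. The sample phrase and pattern are borrowed from the other answers on this page, and the variable names are made up for the example. Note that -o prints the whole match rather than a capture group, so the trailing "day" still has to be stripped separately, here with a shell parameter expansion. -o is not part of POSIX grep, though GNU and BSD grep both support it.

```shell
# Sample input borrowed from the other answers (an assumption for this sketch).
phrase='it is Saturday tomorrow'

# -o prints only the text matched by the whole pattern, not the whole line.
matched=$(printf '%s\n' "$phrase" | grep -o '[[:alpha:]]*day')
printf '%s\n' "$matched"

# Strip the literal "day" suffix to leave the part a capture group would hold.
stem=${matched%day}
printf '%s\n' "$stem"
```

This only works because the suffix is a fixed string; for a more complicated regex you would need one of the sed approaches instead.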

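Answer 2

Your first .* is greedy and matches everything up to "day", which leaves your backreference empty. You need something definite to match before the [[:alpha:]] in the backreference, for example a space: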

$ echo $regex
\([[:alpha:]]*\)day

$ echo $phrase
it is Saturday tomorrow

$ echo $phrase | sed "s/.* $regex.*/\1/"
Satur
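
To see the difference side by side, here is a minimal sketch that runs the substitution with and without the leading space. It reuses the $regex and $phrase values shown above; the variable names greedy and anchored are just for illustration.

```shell
phrase='it is Saturday tomorrow'
regex='\([[:alpha:]]*\)day'

# Without the space, the leading .* takes "it is Satur" and the group is empty.
greedy=$(printf '%s\n' "$phrase" | sed "s/.*$regex.*/\1/")

# With the space, .* must stop at a space, so the group gets "Satur".
anchored=$(printf '%s\n' "$phrase" | sed "s/.* $regex.*/\1/")

printf 'greedy=[%s] anchored=[%s]\n' "$greedy" "$anchored"
```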

I have a love/hate relationship with regular expressions.

Edit:

The non-POSIX word-boundary extension (\b) seems to catch both cases:

$ regex="\b\([[:alpha:]]\+\)day\b"

I'm not sure what to do when the pattern occurs more than once on a line, or when the pattern spans multiple words.

$ cat phrase.txt
it is Saturday tomorrow
it is   Saturday tomorrow
Saturday is the date tomorrow
        Saturday is the date tomorrow
Saturday is the day tomorrow
        Saturday is the day tomorrow
Saturday is the day in dayton tomorrow
        Saturday is the day in dayton tomorrow
Saturday is the day after Friday
The last day of the week is Friday

$ cat phrase.txt | sed -e "s/.*$regex.*/\1/"
Satur
Satur
Satur
Satur
Satur
Satur
Satur
Satur
Satur
Fri
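
For lines where the pattern occurs more than once, one option is to let grep -o list each match on its own line and then strip the suffix with sed. This is only a sketch under my own assumptions, not something from the original question: the sample line is taken from phrase.txt above, \{1,\} is used as the portable spelling of \+, and the variable name stems is made up.

```shell
line='Saturday is the day after Friday'

# grep -o emits every match on its own line; sed then drops the "day" suffix.
# A bare "day" is skipped because at least one letter must come before it.
stems=$(printf '%s\n' "$line" | grep -o '[[:alpha:]]\{1,\}day' | sed 's/day$//')

printf '%s\n' "$stems"
```

Without a word boundary this would also pick up the front of a word such as "holidays", so it may need tightening depending on the input.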

I'm curious whether someone with better sed-fu can come up with a better answer. :-)

Answer 3

This is close to mgjk's answer, but it takes a slightly different approach to matching the boundary.

echo $phrase | sed 's/.*[^[:alpha:]]\([[:alpha:]]*\)day.*/\1/'
Satur

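Because .* will swallow anything, you have to match "a character that is not what I want" first, followed by "the characters I do want". So in $regex you could store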

[^[:alpha:]]\([[:alpha:]]*\)day

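It is not without flaws (in its current form it does not work if "Saturday" is at the start of the line), but if you intend to use just sed rather than a more powerful tool, it may be good enough for you. You could also solve the start-of-line problem with a two-part regex, but then it starts to get more complicated, which you don't want. If your criteria change, there are plenty of other solutions.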
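
For what it's worth, one possible reading of the two-part idea is to try a start-of-line substitution first and fall back to the boundary version only if that did not match. This is a sketch of my own, not a command from the answer; the helper name extract_stem is hypothetical.

```shell
# Hypothetical helper: reads lines on stdin, prints the text before "day".
extract_stem() {
    # Part 1 handles a match at the start of the line; "t" skips part 2
    # when part 1 succeeded. Part 2 is the boundary version from above.
    sed -e 's/^\([[:alpha:]]*\)day.*/\1/' -e t \
        -e 's/.*[^[:alpha:]]\([[:alpha:]]*\)day.*/\1/'
}

at_start=$(printf '%s\n' 'Saturday is tomorrow' | extract_stem)
in_middle=$(printf '%s\n' 'it is Saturday tomorrow' | extract_stem)
printf '%s %s\n' "$at_start" "$in_middle"
```

It does illustrate the point that the extra case makes the command noticeably longer.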

Answer 4

I asked this question on SO (Stack Overflow) as well, and got this answer from potong, which is exactly what I wanted.

sed '/'"$regex"'/!b;s//\n\1\n/;s/.*\n\(.*\)\n.*/\1/' file

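Note that it does not rely on any knowledge of the contents of $regex. It uses newlines as sentinel values so that the whole line can afterwards be replaced with just the backreference.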
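
Here is a small sketch of that command in action, with comments on each step. The $regex value and the two input lines are my own sample data, and it assumes GNU sed, since \n in the replacement part of s is not portable to every sed.

```shell
regex='\([[:alpha:]]*\)day'

# /regex/!b     : lines that do not match are printed unchanged
# s//\n\1\n/    : empty pattern reuses $regex; wrap the group in newlines
# s/.*\n\(.*\)\n.*/\1/ : keep only what sits between the two newlines
out=$(printf '%s\n' 'it is Saturday tomorrow' 'no match here' |
    sed '/'"$regex"'/!b;s//\n\1\n/;s/.*\n\(.*\)\n.*/\1/')

printf '%s\n' "$out"
```

A line read by sed never contains a newline of its own, which is why the newline is a safe marker here.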
