Inconsistent behavior in awk

Question 1

From the (g)awk man page:

~ !~        Regular expression match, negated match.  NOTE: Do not use a constant regular  expression  (/foo/)
            on  the left-hand side of a ~ or !~.  Only use one on the right-hand side.  The expression /foo/ ~
            exp has the same meaning as (($0 ~ /foo/) ~ exp).  This is usually not what you want.

What do you expect to happen if you use it in exactly the way the manual tells you not to?

Answer


Question 2

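Actually, this is an interesting question. @tink pointed out why your code does not work as expected, but that is not the question. The question is "why does 0 sometimes match".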

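If (/foo/ ~ $1) really means (($0 ~ /foo/) ~ $1), then ($0 ~ /foo/) evaluates to 1 if the line contains foo and to 0 otherwise. So you are (mostly) evaluating 0 ~ $1. If the input line is empty, then $1 == "", and the empty regular expression always matches. If the input line is exactly 0, then $1 is 0 as well, and 0 ~ 0 is true. If the input line is, for example, 000, then $1 is 000 too, and 0 ~ 000 should not be true. It may well be, however, that 000 is converted to 0 before the match is checked.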
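The reading above can be checked by writing the comparison out explicitly. This is a minimal sketch, not the code from the question: explicit_match is a hypothetical helper name, and it assumes an awk that follows the documented semantics, where the number 0 is converted to the string "0" and $1 is used as a dynamic regular expression.

```shell
# Hypothetical helper: evaluate the explicit form 0 ~ $1 for every input line.
# The left operand 0 becomes the string "0"; $1 is used as the regex.
explicit_match() {
  awk '{ print ((0 ~ $1) ? "match" : "no match") }'
}

# $1 is "0", "a" and "00" respectively.
printf '0\na\n00 00\n' | explicit_match
```

On an awk that behaves as documented, only the first line should report a match, because the string "0" contains neither a nor 00. This is the baseline against which the cases below look odd.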

Unfortunately, this explanation does not cover all cases.

Case 1

0 <-- found match
a
0 <-- found match
0 <-- found match

This is exactly as expected.

Case 2

0 <-- found match
00 00 <-- found match
0 <-- found match

This is also expected, provided that any number of zeros is interpreted as 0. But now, this:

Case 3

0 <-- found match
a
00 0
0

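This cannot be explained away so easily. After a failed match, the conversion to zero no longer seems to happen, and the later line that should match does not match either.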

Case 4

0 <-- found match
a
00 00
a
0 <-- found match

Whatever is going on, another failed match seems to reset awk's behavior to normal, and matching works as expected again.

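In summary, either the explanation in the GNU awk man page (which, by the way, is not part of the info pages) is incorrect (or at least incomplete), or the program contains a bug.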
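To see which of the two possibilities applies to a particular awk, the form the man page warns against can be printed next to the expansion the man page gives for it. This is only a sketch under assumptions: compare_forms is a hypothetical helper, /foo/ stands in for whatever pattern the original program used, and the input is the one from case 3. gawk may print a warning about the regular expression on the left of ~ to stderr.

```shell
# Hypothetical helper: print each input line, then the result of the
# discouraged form, then the result of the documented expansion.
compare_forms() {
  awk '{
    literal  = (/foo/ ~ $1)          # constant regex on the left-hand side
    explicit = (($0 ~ /foo/) ~ $1)   # what the man page says it means
    print $0 "\t" literal "\t" explicit
  }'
}

printf '0\na\n00 0\n0\n' | compare_forms
```

If the man page is right and the implementation is correct, the last two columns should agree on every line. Any line where they differ points to the kind of inconsistency described in cases 3 and 4.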

Answer