Issue a command that prints the lines of the file aa.txt containing the word "Unix" or "unix". Try it with grep, awk and sed (three different commands).

Question 1

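The regular expression [Uu]nix matches both the string Unix and the string unix; the bracket expression [Uu] matches either U or u.

You can extract all lines matching this expression with any of the following three tools: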

awk '/[Uu]nix/' file

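This is an awk program in "short form": it relies on the fact that awk prints the current record (line) when the given condition matches. The "long-hand" variant, with all the unnecessary code spelled out, looks like awk '$0 ~ /[Uu]nix/ { print $0 }' file.

grep '[Uu]nix' file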

The grep utility simply extracts the lines that match the given expression.

sed -n '/[Uu]nix/p' file

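This sed command turns off (with -n) the default printing of every line, then explicitly prints only the lines that match the given expression.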

With sed you can also choose to delete the lines you don't want to see and let the default printing output whatever is left: sed '/[Uu]nix/!d' file
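
As a quick sanity check, the four commands can be run against the same input and their output compared. This is only a sketch: the four-line sample text and the variable names are made up for illustration and are not part of the original question.

```shell
# Build a small throwaway input file (hypothetical content).
tmp=$(mktemp)
printf '%s\n' 'Unix is old' 'Linux is newer' 'I like unix' 'nothing here' > "$tmp"

# Capture the output of each variant.
out_awk=$(awk '/[Uu]nix/' "$tmp")
out_grep=$(grep '[Uu]nix' "$tmp")
out_sed_p=$(sed -n '/[Uu]nix/p' "$tmp")
out_sed_d=$(sed '/[Uu]nix/!d' "$tmp")

# All four should print the same two lines; show one of them.
printf '%s\n' "$out_grep"

rm -f "$tmp"
```

Note that "Linux is newer" is not selected: the expression looks for the letters u-n-i-x in that order, which "Linux" does not contain.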

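Of awk, grep and sed, the grep utility is the one best suited to the task of extracting lines that match a particular expression. You would mostly use awk for tasks that need more processing or summarising, and sed for single-line modifications that need little or no state (although all three tools overlap in what they can do).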
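
To make that division of labour a little more concrete, here is a minimal sketch on a made-up three-line input: awk summarises (counts the matching lines), sed modifies each line in passing. The sample text, the variable names and the choice of "UNIX" as replacement are all assumptions for illustration.

```shell
# Hypothetical sample input, held in a variable.
sample=$(printf '%s\n' 'Unix is old' 'Linux is newer' 'I like unix')

# awk: summarise, i.e. count the matching lines instead of printing them.
# "n + 0" forces a numeric 0 when nothing matched.
match_count=$(printf '%s\n' "$sample" | awk '/[Uu]nix/ { n++ } END { print n + 0 }')

# sed: stateless per-line edit, here normalising the spelling to "UNIX".
normalised=$(printf '%s\n' "$sample" | sed 's/[Uu]nix/UNIX/g')

printf 'matching lines: %s\n' "$match_count"
printf '%s\n' "$normalised"
```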

Question 2

The trickier part of this question is "match the word Unix or unix".

Take this input file:

$ cat -n file
     1  how do I pick them? both "Unix" and 'unix'
     2  Could be just Unix
     3  or just
     4  unix at the start of line
     5  do not match unixy or munix

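We should match lines 1, 2 and 4, but not line 5, because "unix" does not appear there "as a word".

The examples below also show how these tools can do case-insensitive matching with their built-in facilities.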
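
To follow along, the sample file can be recreated like this. The scratch directory is an assumption added here so that an existing ./file is not overwritten; the five lines themselves are the ones shown above.

```shell
# Work in a scratch directory so nothing in the current directory is clobbered.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The quoted 'EOF' keeps the shell from interpreting the quotes inside the text.
cat > file <<'EOF'
how do I pick them? both "Unix" and 'unix'
Could be just Unix
or just
unix at the start of line
do not match unixy or munix
EOF

# Show the file with line numbers, as in the listing above.
cat -n file
```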

grep

$ grep -i unix file
how do I pick them? both "Unix" and 'unix'
Could be just Unix
unix at the start of line
do not match unixy or munix

Now add the -w ("whole word") option:

$ grep -i -w unix file
how do I pick them? both "Unix" and 'unix'
Could be just Unix
unix at the start of line
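
One caveat worth a small sketch: -i accepts any capitalisation, which is broader than the "Unix or unix" the question literally asks for. If only those two spellings should match, -w can be combined with the bracket expression instead. The four input lines and the variable names below are made up for illustration; -w is widely supported (GNU and BSD grep) but is assumed here rather than guaranteed everywhere.

```shell
# Hypothetical input with three capitalisations and one non-word occurrence.
input=$(printf '%s\n' 'plain unix' 'Capital Unix' 'shouting UNIX' 'unixy suffix')

# Case-insensitive whole-word match: accepts unix, Unix and UNIX.
loose=$(printf '%s\n' "$input" | grep -i -w unix)

# Whole-word match restricted to exactly "Unix" or "unix".
strict=$(printf '%s\n' "$input" | grep -w '[Uu]nix')

printf 'with -i:\n%s\n\n' "$loose"
printf 'with [Uu]nix:\n%s\n' "$strict"
```

Neither variant selects "unixy suffix", because -w requires a non-word character (or a line boundary) on both sides of the match.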

GNU sed

$ gsed -n '/unix/I p' file
how do I pick them? both "Unix" and 'unix'
Could be just Unix
unix at the start of line
do not match unixy or munix

Now add the GNU regexp word-boundary markers:

$ gsed -n '/\<unix\>/I p' file
how do I pick them? both "Unix" and 'unix'
Could be just Unix
unix at the start of line

(I'm on a Mac, where GNU sed is installed as gsed via Homebrew.)

GNU awk

$ gawk -v IGNORECASE=1 '/unix/' file
how do I pick them? both "Unix" and 'unix'
Could be just Unix
unix at the start of line
do not match unixy or munix

$ gawk -v IGNORECASE=1 '/\<unix\>/' file
how do I pick them? both "Unix" and 'unix'
Could be just Unix
unix at the start of line

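Non-GNU tools: for example, the default awk and sed on a Mac

These tools do not use GNU regular expressions, so the convenient \< and \> word boundaries are not available. Built-in case-insensitive matching is not available either. The results are not as pretty.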

/usr/bin/sed -En '/(^|[^_[:alnum:]])[Uu]nix($|[^_[:alnum:]])/ p' file
/usr/bin/awk 'tolower($0) ~ /(^|[^_[:alnum:]])unix($|[^_[:alnum:]])/' file
/usr/bin/awk -F'[^[:alpha:]]+' '{for (i=1; i<=NF; i++) if (tolower($i) == "unix") {print; next}}' file
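
The pattern in the first two commands spells out a word boundary by hand: (^|[^_[:alnum:]]) means "start of line, or a character that is not a letter, digit or underscore", and ($|[^_[:alnum:]]) is the same idea for the end. A minimal sketch of how each boundary case behaves, using the sed variant wrapped in a helper function; the function name and the test lines are hypothetical, and sed -E is assumed to be available (it is in both GNU and BSD sed).

```shell
# Hypothetical helper (not from the answer): the portable sed command,
# reading the named files, or standard input if none are given.
unix_word_lines() {
    sed -En '/(^|[^_[:alnum:]])[Uu]nix($|[^_[:alnum:]])/p' "$@"
}

# One made-up line per boundary case.
kept=$(printf '%s\n' \
    'unix' \
    'say "Unix" aloud' \
    'unixy' \
    'munix' \
    'unix_like' \
    'UNIX' | unix_word_lines)

# Only the first two lines survive: the bare word and the quoted word.
printf '%s\n' "$kept"
```

The underscore is listed explicitly so that unix_like is rejected, mirroring how -w and \< \> treat the underscore as a word character. UNIX is rejected because [Uu]nix only allows the first letter to vary.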
