How to grep for a regular expression that occurs multiple times in a line

Answer 1

With perl:

perl -lne '
  if (@ips = /\[(\d{1,3}(?:\.\d{1,3}){3})\]/g) {
    print join ",", @ips;
  } else {
    print "n.a.";
  }'

Or, using the regular expression for dotted-quad IPv4 addresses from Regexp::Common (the libregexp-common-perl package on Debian-based systems such as Ubuntu):

perl -MRegexp::Common=net -lne '
  if (@ips = /\[($RE{net}{IPv4})\]/g) {
    print join ",", @ips;
  } else {
    print "n.a.";
  }'

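With -n, the input can be given on stdin, read from files whose paths are passed as extra arguments, or read from the output of some commands if you pass an argument like "some commands|". By default, perl's print writes to stdout, which you can redirect from the shell to a file with redirection operators such as >, >> (append) or 1<> (like >, except that the file is not truncated first and is opened in read+write mode), and possibly more depending on your shell.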
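As a small illustration of how the three redirection operators mentioned above differ, here is a sketch that uses printf instead of perl. The scratch file created by mktemp is only an assumption for the demonstration.

```shell
# Hypothetical scratch file, used only for this demonstration.
f=$(mktemp)

printf 'AAAAAAAAAA\n' > "$f"    # > creates or truncates the file, then writes
printf 'BBB\n' >> "$f"          # >> appends to the existing contents
cat "$f"
# AAAAAAAAAA
# BBB

printf 'CC\n' 1<> "$f"          # 1<> opens read+write without truncating,
cat "$f"                        # so only the first 3 bytes are overwritten
# CC
# AAAAAAA
# BBB
```

Because 1<> leaves the old contents in place beyond what is written, it is usually only what you want when the new output is at least as long as the old file.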

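You can also add the -i option so that the output ends up replacing the contents of the input file (whose path must then be given as an argument).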

Here, taking the input from a file named input.txt and overwriting or creating a file named output.csv with the output:

< input.txt perl... > output.csv

Answer 2

Using GNU awk with FPAT:

awk -v FPAT='\\[([0-9]{1,3}[.]){3}[0-9]{1,3}\\]' -v OFS=, '
{
    $1=$1; print (gsub(/[][]/, "")?$0:"N/A")
}' <infile >output
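The command above relies on two awk idioms: assigning $1=$1 forces awk to rebuild the record with OFS between the fields, and gsub() returns the number of substitutions it made, so a result of 0 selects "N/A". The sketch below shows both with any POSIX awk. It assumes default whitespace field splitting instead of FPAT, and the sample input is made up.

```shell
# Default field splitting is used here (an assumption for the demo);
# with FPAT, only the bracketed addresses would become fields.
out=$(printf '[a] [b]\nplain\n' | awk -v OFS=, '
{
    $1 = $1                                   # rebuild $0 using OFS
    print (gsub(/[][]/, "") ? $0 : "N/A")     # 0 brackets removed means no match
}')
printf '%s\n' "$out"
# a,b
# N/A
```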

Or with any POSIX awk (all of which support {x,y} RE intervals):

awk '
{
    bkup=$0;
    gsub(/\[([0-9]{1,3}[.]){3}[0-9]{1,3}\]/, "|")
    gsub(/[][()\\.{}?+*$^]/, "\\\\&")
    n=split(bkup, tmp, $0)
    for(i=1; i<=n; i++){
        if(tmp[i]!=""){
            gsub(/[][]/, "", tmp[i])
            printf ("%s", (sep?",":"") tmp[i])
            sep=","
        }
    }; print (sep?"":"N/A"); sep=""
}' <infile >output

The output is written to the file output.

$ cat output
11.335.2.33,43.22.11.88,55.66.77.88
66.223.44.33
N/A
1.2.33.3,1.32.2.4

Note that for the second approach, your input must not contain the | and & characters.

The same code with inline explanations:

awk '
{
    #back up the current record
    bkup=$0;

    #replace every occurrence of the desired pattern with a "|" character
    #to build a regexp matching everything other than the desired pattern
    gsub(/\[([0-9]{1,3}[.]){3}[0-9]{1,3}\]/, "|")

    #escape all regexp operators except "|"
    gsub(/[][()\\.{}?+*$^]/, "\\\\&")

    #split the original record (from bkup) into tmp, using as the separator
    #the regexp built by the two gsub() calls above
    n=split(bkup, tmp, $0)

    #loop through the split fields in the tmp array
    for(i=1; i<=n; i++){

        #if the current field is not empty
        if(tmp[i]!=""){

            #remove the ] and [ characters from it
            gsub(/[][]/, "", tmp[i])

            #and print it (preceded by a comma from the second field onwards)
            printf ("%s", (sep?",":"") tmp[i])

            #set the comma as the field separator once at least one field was printed
            sep=","
        }

    #after the loop, print "N/A" if no field was printed (so "sep" was never set),
    #or just end the line otherwise; then reset "sep" for the next record
    }; print (sep?"":"N/A"); sep=""

}' <infile >output

Answer 3

An executable awk file, filter.awk:

#! /usr/bin/awk -f
{
    ret = ""
    line = $0
    while (match(line, /\[([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}\]/) > 0) {
        if (ret != "") {
            ret = ret ","
        }
        ret = ret substr(line, RSTART, RLENGTH)
        line = substr(line, RSTART + RLENGTH)
    }
    if (ret != "") {
        print ret
    }
}

Run it like this:

./filter.awk filename
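The script has to be executable first (for example with chmod +x filter.awk). Its core technique is to call match() repeatedly on the remainder of the line, using RSTART and RLENGTH to cut out each hit. The sketch below shows that loop in isolation. The simplified pattern [0-9]+ and the "-" placeholder for lines without a match are assumptions chosen for the demonstration, not part of filter.awk.

```shell
out=$(printf 'a12b345c\nno digits\n' | awk '
{
    line = $0
    ret = ""
    # match() sets RSTART and RLENGTH for the leftmost match in line
    while (match(line, /[0-9]+/) > 0) {
        ret = ret (ret == "" ? "" : ",") substr(line, RSTART, RLENGTH)
        line = substr(line, RSTART + RLENGTH)   # continue after this match
    }
    print (ret == "" ? "-" : ret)
}')
printf '%s\n' "$out"
# 12,345
# -
```

Unlike this sketch, filter.awk prints nothing for lines without a match and keeps the surrounding brackets in its output.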

Answer 4

Using Raku (formerly known as Perl_6)

Without value checking:

raku -ne 'if m:g/ ( \d**1..3 )**4 % "." / { $/.join(",").put } else {"n.a.".say};'

Or:

raku -ne 'm:g/ ( \d**1..3 )**4 % "." / ?? $/.join(",").put !! "n.a.".say;'

Sample input:

blabla [11.335.2.33] xyuoeretrete [43.22.11.88] jfdfjkfbs [55.66.77.88]
blabla [66.223.44.33]
foo bar
blabla [1.2.33.3] xyuoeretrete [42] bla[1.32.2.4]

Sample output (both examples):

11.335.2.33,43.22.11.88,55.66.77.88
66.223.44.33
n.a.
1.2.33.3,1.32.2.4

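At least in the Perl family of languages, what you want is a match, not a grep. So use the m/.../ match operator, made "global" as m:g/.../ so that it returns multiple instances of the match [this differs from grep, which returns the complete element (e.g. line, etc.) containing a match].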
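The difference can be seen with grep itself. This is only a sketch: the -o option is an extension offered by GNU and BSD grep rather than POSIX, and the ERE below is an assumed translation of the bracketed-address pattern used on this page.

```shell
input='blabla [1.2.33.3] xyuoeretrete [42] bla[1.32.2.4]
foo bar'
re='\[([0-9]{1,3}\.){3}[0-9]{1,3}\]'

# Plain grep returns each whole line that contains at least one match.
whole_lines=$(printf '%s\n' "$input" | grep -E "$re")
printf '%s\n' "$whole_lines"
# blabla [1.2.33.3] xyuoeretrete [42] bla[1.32.2.4]

# grep -o returns every match instead, one per output line.
matches=$(printf '%s\n' "$input" | grep -oE "$re")
printf '%s\n' "$matches"
# [1.2.33.3]
# [1.32.2.4]
```

Even with -o, the grouping of matches per input line is lost and nothing is printed for lines without a match, which is why a match operator fits this task better.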

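Briefly: search for clusters of 1 to 3 digits ( \d**1..3 ) that repeat 4 times ( **4 ) with a period between each instance ( % "." ), and search for this regex match globally (m:global or m:g), which means getting all instances of the match per element (line, etc.), not just the first one. First example: if a match is found, put the matches contained in the $/ match variable, else say n.a. Second example: the same match condition used in Raku's ternary operator, i.e. condition ?? True !! False. So if the condition is True (??), put the matches contained in the $/ match variable; if it is False (!!), output n.a.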

With value checking, below:

raku -ne 'if m:g/ ( \d**1..3 <?{ $/ < 256 }> )**4 % "." / { $/.join(",").put } else {"n.a.".say};'

Or:

raku -ne 'm:g/ ( \d**1..3 <?{ $/ < 256 }> )**4 % "." / ?? $/.join(",").put !! "n.a.".say;'

Sample input: same as above

Sample output (both examples):

43.22.11.88,55.66.77.88
66.223.44.33
n.a.
1.2.33.3,1.32.2.4

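The Raku code shown above checks each cluster of 1 to 3 digits to make sure it is an integer less than 256. The additional regex element <?{ $/ < 256 }> is a positive assertion containing a {...} code block that checks whether the $/ match variable is less than 256. See the reference here.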
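The same range check can be approximated outside Raku. The sketch below is a swapped-in awk technique, not the Raku assertion itself: it splits a candidate address on "." and rejects it unless there are exactly four numeric parts of at most 3 digits, each no greater than 255. The helper name check is hypothetical.

```shell
# check is a hypothetical helper for this demonstration.
check() {
    printf '%s\n' "$1" | awk -F. '{
        ok = (NF == 4)                        # exactly four dot-separated parts
        for (i = 1; i <= NF; i++)
            if ($i !~ /^[0-9]+$/ || length($i) > 3 || $i + 0 > 255) ok = 0
        print (ok ? "valid" : "invalid")
    }'
}

check 43.22.11.88    # valid
check 11.335.2.33    # invalid
```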

https://raku.org
