Parsing command output in a Bash script

Answer

Welcome, Phil Coxon,

Method 1

This pure bash script seems to fit your needs:

#!/usr/bin/env bash
declare id
declare name
declare url
declare version

while IFS= read -r line; do
  # Skip the +----+ border lines; the regex below accepts only data rows.
  if [[ ! ${line} =~ ^\+ ]]; then
    if [[ ${line} =~ \|[[:space:]]*([[:digit:]]+)[[:space:]]*\|[[:space:]]+([[:alnum:]\.]+)[[:space:]]+\|[[:space:]]+(https?:\/\/(www\.)?[[:alnum:]]+\.[[:alpha:]]+\/?)[[:space:]]*\|[[:space:]]*([[:digit:]](\.[[:digit:]])?)[[:space:]]*\|  ]]; then
      id="${BASH_REMATCH[1]}"
      name="${BASH_REMATCH[2]}"
      url="${BASH_REMATCH[3]}"
      version="${BASH_REMATCH[5]}"
      echo "${id}:${name}:${url}:${version}"
    fi
  fi
done
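
The version ends up in BASH_REMATCH[5] rather than [4] because capture groups are numbered by the position of their opening parenthesis, and the optional (www\.) group inside the URL takes index 4. A minimal sketch of that numbering follows; the sample row, the show_groups name, and the choice to keep the regex in a variable are assumptions for illustration, not part of the original script:

```shell
#!/usr/bin/env bash
# Hypothetical sample row; the real table comes from your command.
row='| 25 | example.com | http://www.example.com/ | 3.8 |'

# Keeping the regex in a variable avoids quoting surprises inside [[ ]].
re='\|[[:space:]]*([[:digit:]]+)[[:space:]]*\|[[:space:]]+([[:alnum:].]+)[[:space:]]+\|[[:space:]]+(https?://(www\.)?[[:alnum:]]+\.[[:alpha:]]+/?)[[:space:]]*\|[[:space:]]*([[:digit:]](\.[[:digit:]])?)[[:space:]]*\|'

show_groups() {
  local i
  if [[ $1 =~ $re ]]; then
    # Index 0 is the whole match; groups follow in order of their "(".
    for i in "${!BASH_REMATCH[@]}"; do
      printf '%d=%s\n' "$i" "${BASH_REMATCH[i]}"
    done
  fi
}

show_groups "$row"
```

For this row the listing shows index 4 holding "www." and index 5 holding "3.8", which is why the script skips from [3] to [5].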

Method 2

You can also wrap the logic in a bash function and call it from your script, like this:

#!/usr/bin/env bash
parse_result(){
  local id
  local name
  local url
  local version

  while IFS= read -r line; do
    # Skip the +----+ border lines; the regex below accepts only data rows.
    if [[ ! ${line} =~ ^\+ ]]; then
      if [[ ${line} =~ \|[[:space:]]*([[:digit:]]+)[[:space:]]*\|[[:space:]]+([[:alnum:]\.]+)[[:space:]]+\|[[:space:]]+(https?:\/\/(www\.)?[[:alnum:]]+\.[[:alpha:]]+\/?)[[:space:]]*\|[[:space:]]*([[:digit:]](\.[[:digit:]])?)[[:space:]]*\|  ]]; then
        id="${BASH_REMATCH[1]}"
        name="${BASH_REMATCH[2]}"
        url="${BASH_REMATCH[3]}"
        version="${BASH_REMATCH[5]}"
        echo "${id}:${name}:${url}:${version}"
      fi
    fi
  done
}

parse_result < <(cat cmd.out)

Here I use process substitution, but you could use a pipe instead.
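
Either form feeds the function the same data. One practical difference is that bash normally runs each part of a pipeline in a subshell, so variables set inside the loop are lost afterwards, whereas process substitution keeps the loop in the current shell. A small sketch of that difference, where fake_command and count_rows are hypothetical stand-ins for the real command and parser:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the command whose output is parsed.
fake_command() { printf '%s\n' 'row 1' 'row 2' 'row 3'; }

count=0
count_rows() {
  local line
  while IFS= read -r line; do
    count=$((count + 1))
  done
}

# Pipe: count_rows runs in a subshell, so the outer count is unchanged.
fake_command | count_rows
echo "after pipe: ${count}"

# Process substitution: count_rows runs in the current shell.
count_rows < <(fake_command)
echo "after process substitution: ${count}"
```

With bash's default settings (lastpipe not enabled), the count is still 0 after the pipe and 3 after the process substitution. If the loop only prints its results, as in the scripts above, both forms behave the same.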

Results and discussion

Here cmd.out stands for the command output to be parsed. In your case, replace cat cmd.out with your own command.

Result 1:

$ cat cmd.out | ./app.bash
25:example.com:http://www.example.com/:3.8
34:anotherexample.com:https://anotherexample.com/:3.2
62:yetanotherexample.com:https://yetanotherexample.com/:3.9

Result 2:

$ bash app2.bash
25:example.com:http://www.example.com/:3.8
34:anotherexample.com:https://anotherexample.com/:3.2
62:yetanotherexample.com:https://yetanotherexample.com/:3.9

Answer

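Many thanks to @bioinfornatics and @jeff Schaller. I really appreciate the detailed information you both provided.

I used both of your answers in the solution shown below, where list_command produces the table output and process_command is run for each website ID. I have tested it and it works; all that is left is to add logging.

Thank you both!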

#!/usr/bin/env bash
parse_result(){
  local id
  local name
  local url
  local version

  # awk reads the function's stdin directly: skip the three header lines and
  # the +--- border lines, then print the id, name, url and version columns.
  # (An outer "while read" loop here would swallow the first input line.)
  awk -F'|' ' NR > 3 && !/^\+--/ { print $2, $3, $4, $5 } ' | while read -r id name url version
  do
    RESULT="$(process_command "$id")"
    echo "result: $RESULT"
    echo "id: $id | name: $name | url: $url | version: $version"
  done
}
parse_result < <(list_command)
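
To try this flow without the real tools, the two commands can be replaced with stubs. In the sketch below, the table contents, the stub bodies, and the list_ids helper are all assumptions for illustration; only the names list_command and process_command come from the post:

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins so the flow can be tried without the real tools.
list_command() {
  cat <<'EOF'
+----+-------------+-------------------------+---------+
| ID | Name        | URL                     | Version |
+----+-------------+-------------------------+---------+
| 25 | example.com | http://www.example.com/ | 3.8     |
| 34 | example.org | https://example.org/    | 3.2     |
+----+-------------+-------------------------+---------+
EOF
}
process_command() { echo "processed site $1"; }

# Pull just the ID column; "+ 0" turns the padded field into a plain number.
list_ids() {
  awk -F'|' 'NR > 3 && !/^\+--/ { print $2 + 0 }'
}

# Run process_command once per ID.
while read -r id; do
  process_command "$id"
done < <(list_command | list_ids)
```

Once the stubs behave as expected, swapping in the real commands should leave the loop itself unchanged.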

Answer

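While you can carefully parse text with bash, it is sometimes easier to rely on a dedicated text-processing tool such as awk:

awk -F'|' ' NR > 3 && !/^\+--/ { print $2, $3, $4} ' > log.txt

This tells awk to split each line into fields on the | delimiter; the program inside the single quotes breaks down as follows:

NR > 3 && -- if the number of records (lines) processed so far is greater than 3, and ...
!/^\+--/ -- ... the line does not start with +-- ...
... then print fields 2, 3 and 4

... with everything finally redirected to the file log.txt.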
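
To see the filter at work, here is a small sketch that runs it over a sample table. The table contents and the sample_table and extract_columns names are assumptions for illustration; substitute the output of your real command:

```shell
#!/usr/bin/env bash
# Hypothetical sample of the table-style output being parsed.
sample_table() {
  cat <<'EOF'
+----+-------------+-------------------------+---------+
| ID | Name        | URL                     | Version |
+----+-------------+-------------------------+---------+
| 25 | example.com | http://www.example.com/ | 3.8     |
+----+-------------+-------------------------+---------+
EOF
}

# The awk filter from this answer, with the + escaped so it is matched
# literally. The second awk pass only squeezes the padding around each field.
extract_columns() {
  awk -F'|' 'NR > 3 && !/^\+--/ { print $2, $3, $4 }' |
    awk '{ $1 = $1; print }'
}

sample_table | extract_columns
```

Because the line starts with |, field 1 is empty and the ID is field 2, which is why the program prints fields 2 through 4 rather than 1 through 3.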
