How can I split and format data output with a shell script?

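I am trying to convert a YAML file into HTML tables. The conversion involves several conditions. I know it can be done with a shell script, but I ran into problems implementing it, so I am asking the community for help.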

The YAML content looks like this.

- soft1:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft1_beta_ver:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft1_alpha_ver:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft2:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft2_beta_ver:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft2_alpha_ver:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip

< Omit more... >

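It records the version history of several pieces of software. I need to convert it into HTML table code and write the result to separate files.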

For example, soft1, soft1_beta_ver, and soft1_alpha_ver should go to the same file (named soft1), and the soft2 entries should go to another file.

The HTML table should have the following format.

<table>
    <thead>
        <tr>
            <th>type</th>
            <th>ver</th>
            <th>link</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>soft1</td>
            <td>V1.0.1</td>
            <td>http://example.com/v1.0.1.zip</td>
        </tr>
        <tr>
            <td>soft1</td>
            <td>V1.0.2</td>
            <td>http://example.com/v1.0.2.zip</td>
        </tr>
        <tr>
            <td>soft1</td>
            <td>V1.0.3</td>
            <td>http://example.com/v1.0.3.zip</td>
        </tr>
        <tr>
            <td>soft1_beta_ver</td>
            <td>V1.0.1</td>
            <td>http://example.com/v1.0.1.zip</td>
        </tr>
        <tr>
            <td>soft1_beta_ver</td>
            <td>V1.0.2</td>
            <td>http://example.com/v1.0.2.zip</td>
        </tr>
        <tr>
            <td>soft1_beta_ver</td>
            <td>V1.0.3</td>
            <td>http://example.com/v1.0.3.zip</td>
        </tr>
        <tr>
            <td>soft1_alpha_ver</td>
            <td>V1.0.1</td>
            <td>http://example.com/v1.0.1.zip</td>
        </tr>
        <tr>
            <td>soft1_alpha_ver</td>
            <td>V1.0.2</td>
            <td>http://example.com/v1.0.2.zip</td>
        </tr>
        <tr>
            <td>soft1_alpha_ver</td>
            <td>V1.0.3</td>
            <td>http://example.com/v1.0.3.zip</td>
        </tr>
    </tbody>
</table>

This is the shell script I have so far. I don't know how to split the output into multiple files, or how to get the software type into a variable.

#!/usr/bin/env bash

cat  << EOF
<table>
    <thead>
        <tr>
            <th>type</th>
            <th>ver</th>
            <th>link</th>
        </tr>
    </thead>
    <tbody>
EOF

while IFS=": " read -r softver softlink
do
cat << EOF
        <tr>
            <td>$softver</td>
            <td></td>
            <td><a href="$softlink">download</a></td>
        </tr>
EOF
done

cat << EOF
    </tbody>
</table>
EOF

Any help or suggestions would be greatly appreciated.

Answer 1

Just generate the HTML you want directly from the original input file:

$ cat ../tst.awk
/^-/ {
    sub(/:$/,"")
    out = type = $NF
    sub(/_.*/,"",out)
    close(out)
    if ( !seen[out]++ ) {
        prtBeg()
    }
    next
}
{
    sub(/:$/,"",$1)
    prtElt("<tr>")
    prtElt("<td>" type "</td>")
    prtElt("<td>" $1 "</td>")
    prtElt("<td>" $2 "</td>")
    prtElt("</tr>")
}
END {
    for (out in seen) {
        prtEnd()
    }
}

function prtElt(str) {
    depth[out] += gsub("<[^/<>]+>","&",str)
    printf "%*s%s\n", (depth[out]-1)*4, "", str >> out
    depth[out] -= gsub("</[^<>]+>","&",str)
}

function prtBeg() {
    prtElt("<table>")
    prtElt("<thead>")
    prtElt("<tr>")
    prtElt("<th>type</th>")
    prtElt("<th>ver</th>")
    prtElt("<th>link</th>")
    prtElt("</tr>")
    prtElt("</thead>")
    prtElt("<tbody>")
}

function prtEnd() {
    prtElt("</tbody>")
    prtElt("</table>")
}

$ ls
$
$ awk -f ../tst.awk ../file
$
$ ls
soft1  soft2

$ cat soft1
<table>
    <thead>
        <tr>
            <th>type</th>
            <th>ver</th>
            <th>link</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>soft1</td>
            <td>V1.0.1</td>
            <td>http://example.com/v1.0.1.zip</td>
        </tr>
        <tr>
            <td>soft1</td>
            <td>V1.0.2</td>
            <td>http://example.com/v1.0.2.zip</td>
        </tr>
        <tr>
            <td>soft1</td>
            <td>V1.0.3</td>
            <td>http://example.com/v1.0.3.zip</td>
        </tr>
        <tr>
            <td>soft1_beta_ver</td>
            <td>V1.0.1</td>
            <td>http://example.com/v1.0.1.zip</td>
        </tr>
        <tr>
            <td>soft1_beta_ver</td>
            <td>V1.0.2</td>
            <td>http://example.com/v1.0.2.zip</td>
        </tr>
        <tr>
            <td>soft1_beta_ver</td>
            <td>V1.0.3</td>
            <td>http://example.com/v1.0.3.zip</td>
        </tr>
        <tr>
            <td>soft1_alpha_ver</td>
            <td>V1.0.1</td>
            <td>http://example.com/v1.0.1.zip</td>
        </tr>
        <tr>
            <td>soft1_alpha_ver</td>
            <td>V1.0.2</td>
            <td>http://example.com/v1.0.2.zip</td>
        </tr>
        <tr>
            <td>soft1_alpha_ver</td>
            <td>V1.0.3</td>
            <td>http://example.com/v1.0.3.zip</td>
        </tr>
    </tbody>
</table>

$ cat soft2
<table>
    <thead>
        <tr>
            <th>type</th>
            <th>ver</th>
            <th>link</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>soft2</td>
            <td>V1.0.1</td>
            <td>http://example.com/v1.0.1.zip</td>
        </tr>
        <tr>
            <td>soft2</td>
            <td>V1.0.2</td>
            <td>http://example.com/v1.0.2.zip</td>
        </tr>
        <tr>
            <td>soft2</td>
            <td>V1.0.3</td>
            <td>http://example.com/v1.0.3.zip</td>
        </tr>
        <tr>
            <td>soft2_beta_ver</td>
            <td>V1.0.1</td>
            <td>http://example.com/v1.0.1.zip</td>
        </tr>
        <tr>
            <td>soft2_beta_ver</td>
            <td>V1.0.2</td>
            <td>http://example.com/v1.0.2.zip</td>
        </tr>
        <tr>
            <td>soft2_beta_ver</td>
            <td>V1.0.3</td>
            <td>http://example.com/v1.0.3.zip</td>
        </tr>
        <tr>
            <td>soft2_alpha_ver</td>
            <td>V1.0.1</td>
            <td>http://example.com/v1.0.1.zip</td>
        </tr>
        <tr>
            <td>soft2_alpha_ver</td>
            <td>V1.0.2</td>
            <td>http://example.com/v1.0.2.zip</td>
        </tr>
        <tr>
            <td>soft2_alpha_ver</td>
            <td>V1.0.3</td>
            <td>http://example.com/v1.0.3.zip</td>
        </tr>
    </tbody>
</table>

The above was run against this input file:

$ cat ../file
- soft1:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft1_beta_ver:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft1_alpha_ver:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft2:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft2_beta_ver:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip
- soft2_alpha_ver:
    V1.0.1: http://example.com/v1.0.1.zip
    V1.0.2: http://example.com/v1.0.2.zip
    V1.0.3: http://example.com/v1.0.3.zip

Answer 2

Something like this sed script works:

parse.sed

1r header

/^-/ {
  s/- //
  s/://
  h
}

G
s/ *([^:]+): ([^\n]+)\n(.*)/        <tr>\n            <td>\3<\/td>\n            <td>\1<\/td>\n            <td><a href="\2">Download<\/a><\/td>\n        <\/tr>/p

$r footer

Where header and footer contain:

header

<table>
    <thead>
        <tr>
            <th>type</th>
            <th>ver</th>
            <th>link</th>
        </tr>
    </thead>
    <tbody>

footer

    </tbody>
</table>

Run it like this:

sed -Enf parse.sed infile

Output for the first 3 sections of infile:

<table>
    <thead>
        <tr>
            <th>type</th>
            <th>ver</th>
            <th>link</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>soft1</td>
            <td>V1.0.1</td>
            <td><a href="http://example.com/v1.0.1.zip">Download</a></td>
        </tr>
        <tr>
            <td>soft1</td>
            <td>V1.0.2</td>
            <td><a href="http://example.com/v1.0.2.zip">Download</a></td>
        </tr>
        <tr>
            <td>soft1</td>
            <td>V1.0.3</td>
            <td><a href="http://example.com/v1.0.3.zip">Download</a></td>
        </tr>
        <tr>
            <td>soft1_beta_ver</td>
            <td>V1.0.1</td>
            <td><a href="http://example.com/v1.0.1.zip">Download</a></td>
        </tr>
        <tr>
            <td>soft1_beta_ver</td>
            <td>V1.0.2</td>
            <td><a href="http://example.com/v1.0.2.zip">Download</a></td>
        </tr>
        <tr>
            <td>soft1_beta_ver</td>
            <td>V1.0.3</td>
            <td><a href="http://example.com/v1.0.3.zip">Download</a></td>
        </tr>
        <tr>
            <td>soft1_alpha_ver</td>
            <td>V1.0.1</td>
            <td><a href="http://example.com/v1.0.1.zip">Download</a></td>
        </tr>
        <tr>
            <td>soft1_alpha_ver</td>
            <td>V1.0.2</td>
            <td><a href="http://example.com/v1.0.2.zip">Download</a></td>
        </tr>
        <tr>
            <td>soft1_alpha_ver</td>
            <td>V1.0.3</td>
            <td><a href="http://example.com/v1.0.3.zip">Download</a></td>
        </tr>
    </tbody>
</table>

Answer 3

You need to distinguish when a line is a header, for example by reading 3 variables from each line:

while IFS=": " read -r a b c
do
    if [[ "$a" == "-" ]]; then
        t=$b
    else
        cat << EOF
        <tr>
            <td>$t</td>
            <td>$a</td>
            <td><a href="$b:$c">download</a></td>
        </tr>
EOF
    fi
done
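
The loop above prints every row to standard output, so it does not yet split the rows into files. Here is a minimal sketch of one way to extend it, assuming the output file name is the type up to its first underscore (soft1_beta_ver goes to soft1). The function name yaml_to_tables and its outdir argument are illustrative and not from the original answer. It sticks to POSIX sh features, so it does not depend on bash arrays:

```shell
#!/bin/sh
# Sketch: write the rows for each software family to its own file.
# Assumptions (not from the original post): the family name is the type
# up to its first "_"; yaml_to_tables and outdir are hypothetical names.

yaml_to_tables() {
    outdir=$1
    started=""      # space-separated list of families whose header is written
    out=""

    while IFS=": " read -r a b c; do
        if [ "$a" = "-" ]; then
            t=$b                         # e.g. soft1_beta_ver
            family=${t%%_*}              # e.g. soft1
            out=$outdir/$family
            case " $started " in
                *" $family "*) ;;        # header already written
                *)
                    started="$started $family"
                    cat > "$out" << EOF
<table>
    <thead>
        <tr>
            <th>type</th>
            <th>ver</th>
            <th>link</th>
        </tr>
    </thead>
    <tbody>
EOF
                    ;;
            esac
        elif [ -n "$a" ] && [ -n "$out" ]; then
            # The URL was split at its first colon, so rejoin it as $b:$c.
            cat >> "$out" << EOF
        <tr>
            <td>$t</td>
            <td>$a</td>
            <td><a href="$b:$c">download</a></td>
        </tr>
EOF
        fi
    done

    # Close every table that was opened.
    for family in $started; do
        cat >> "$outdir/$family" << EOF
    </tbody>
</table>
EOF
    done
}
```

It could be called as yaml_to_tables . < file, which would create soft1 and soft2 in the current directory for the sample input. An existing file with the same name is overwritten when its family is first seen. Rows are appended as they are read, and the closing tags are added once the input ends.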
