How can I easily convert HTML special entities from a standard input stream in Linux?

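Answer 1

PHP works well for this. This example requires PHP 5. Because the -R option strips the trailing newline from each input line, the newline is echoed back explicitly:

cat file.html | php -R 'echo html_entity_decode($argn), PHP_EOL;'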

Answer 2

Perl is, as always, your friend. I think this should do it:

perl -n -mHTML::Entities -e 'print HTML::Entities::decode_entities($_);'

For example:

echo '"test" &amp; test $test ! test @ # $ % ^ &amp; *' | perl -n -mHTML::Entities -e 'print HTML::Entities::decode_entities($_);'

Output:

[someguy@somehost ~]$ echo '"test" &amp; test $test ! test @ # $ % ^ &amp; *' | perl -n -mHTML::Entities -e 'print HTML::Entities::decode_entities($_);'
"test" & test $test ! test @ # $ % ^ & *

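Answer 3

recode seems to be available in the default package repositories of the major GNU/Linux distributions. For example, to decode HTML entities into UTF-8:

…|recode html..utf8

Edit: the original repository appears to be unmaintained; a more recent fork is available.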

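Answer 4

Taking the text from standard input:

#!/bin/bash
# Decode a few common HTML entities, one input line at a time.
while IFS= read -r lin; do
  newl=${lin//&gt;/>}
  newl=${newl//&lt;/<}
  # Decode &amp; last; the backslash makes bash 5.2+ insert a literal &.
  newl=${newl//&amp;/\&}
  # ...other entities
  printf '%s\n' "$newl"
done

It may require bash version 4 or later.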
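
The same idea can be wrapped in a reusable function. The following is only a minimal sketch: the function name decode_basic_entities is hypothetical, and it assumes that decoding the five predefined entities (&lt;, &gt;, &quot;, &#39;, &amp;) is enough. For named entities such as &nbsp; or for numeric references in general, one of the dedicated tools from the other answers is a better fit.

```shell
#!/bin/bash
# Hypothetical helper (not from the original answer): decodes only the
# five predefined entities, reading stdin and writing stdout.
decode_basic_entities() {
  local line dq='"' sq="'"
  # IFS= and -r keep leading whitespace and backslashes intact; the
  # || clause also handles a final line without a trailing newline.
  while IFS= read -r line || [[ -n $line ]]; do
    line=${line//&lt;/<}
    line=${line//&gt;/>}
    line=${line//&quot;/$dq}
    line=${line//&#39;/$sq}
    # &amp; goes last so that "&amp;lt;" becomes "&lt;" rather than "<".
    # The backslash stops bash 5.2+ from reading & as "the matched text".
    line=${line//&amp;/\&}
    printf '%s\n' "$line"
  done
}

# Usage: some_command | decode_basic_entities
```

Decoding &amp; last is the important design choice: if it ran first, a double-escaped sequence such as &amp;lt; would be decoded twice.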
