tr: converting an apostrophe to ASCII

Answer 1

You can try another tool, such as sed:

$ sed "s/’/'/g" <a
We're not a different species
“All alone?” Jeth mentioned.

Or, since this is a simple character-for-character transliteration, use sed's y command:

$ sed "y/’/'/" <a
We're not a different species
“All alone?” Jeth mentioned.
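
If the curly double quotes and the opening single quote should go as well, the s approach extends naturally. This is only a sketch: the function name straighten_quotes is made up for illustration. It sticks to s rather than y because s matches the literal byte sequence, so it should behave the same whether or not the locale is UTF-8.

```shell
# Hypothetical helper: replace the four common curly quotes with ASCII ones.
# Reads stdin (or the files given as arguments) and writes to stdout.
straighten_quotes() {
    sed -e "s/’/'/g" \
        -e "s/‘/'/g" \
        -e 's/“/"/g' \
        -e 's/”/"/g' "$@"
}
```

$ straighten_quotes <a
We're not a different species
"All alone?" Jeth mentioned.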

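GNU tr does not work, presumably because of this (from the GNU manual):

Currently tr fully supports only single-byte characters. Eventually it will support multibyte characters; when it does, the -C option will cause it to complement the set of characters, whereas -c will cause it to complement the set of values. This distinction will matter only when some of the values are not characters, and this is possible only in locales using multibyte encodings when the input contains encoding errors.

And ’ is a multibyte character: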

$ echo -n \' | wc -c
1
$ echo -n ’ | wc -c
3
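
To see which three bytes those are, you can dump them with od. A small sketch: the function name apostrophe_hex is made up, and the character is written as octal escapes so the result does not depend on how your terminal encodes ’.

```shell
# U+2019 (’) spelled as octal escapes for its three UTF-8 bytes.
# od prints them in hex; tr strips the spaces and newline od adds.
apostrophe_hex() {
    printf '\342\200\231' | od -An -tx1 | tr -d ' \n'
}
```

$ apostrophe_hex; echo
e28099

These are the same bytes that tr sees one at a time.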

Answer 2

If you also want to convert the double quotes and other characters, you can use GNU iconv:

$ iconv -f utf-8 -t ascii//translit < a
We're not a different species
"All alone?" Jeth mentioned.

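The //TRANSLIT suffix tells iconv that, for characters outside the target encoding (here ASCII), it may automatically substitute similar-looking characters or sequences. Without the suffix, iconv gives up as soon as it finds a character it cannot translate.

Note that //TRANSLIT appears to be a GNU extension: POSIX iconv does not support it.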
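
The difference between the two modes can be checked with a pair of wrappers. This is a sketch that assumes GNU iconv (glibc); the function names strict_ascii and translit_ascii are made up, and other iconv implementations may behave differently on both counts.

```shell
# Assumption: GNU iconv; //TRANSLIT is not part of POSIX.

# Strict conversion: exits non-zero when a character has no ASCII equivalent.
strict_ascii() {
    iconv -f utf-8 -t ascii
}

# Transliterating conversion: substitutes look-alike ASCII characters.
translit_ascii() {
    iconv -f utf-8 -t ascii//TRANSLIT
}
```

$ strict_ascii <a >/dev/null 2>&1 || echo "strict conversion failed"
strict conversion failed
$ translit_ascii <a
We're not a different species
"All alone?" Jeth mentioned.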

Answer 3

You can use one of the following awk solutions:

awk '{gsub(/\xE2\x80\x99/, "\x27");print}' file # hex escapes: the UTF-8 bytes of ’, and \x27 for '

awk '{gsub(/’/, "\x27");print}' file

awk '{gsub(/\342\200\231/, "\47");print}' file # octal escapes for the same bytes, and \47 for '

awk '{gsub(/’/, "\47");print}' file

or

awk '{gsub(/’/, "'"'"'");print}' file
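
If the '"'"' quoting is hard to read, one alternative is to pass the apostrophe in as an awk variable with -v, so the awk program itself never has to contain a single quote. A minimal sketch; the function name straighten_apostrophes and the variable name q are made up for illustration.

```shell
# Hypothetical helper: q holds the ASCII apostrophe, set from the shell via -v,
# so the single-quoted awk program needs no quote escaping.
straighten_apostrophes() {
    awk -v q="'" '{ gsub(/’/, q); print }' "$@"
}
```

$ echo "We’re not a different species" | straighten_apostrophes
We're not a different species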

Answer 4

Use tr's -s option:

$ echo "We’re not a different species"|tr -s "’" "'"
We're not a different species

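From man tr:

-s, --squeeze-repeats
          replace each sequence of a repeated character that is listed in the last specified SET, with a single occurrence of that character

This works because GNU tr operates on bytes: each of the three bytes of ’ is translated to ', and -s then squeezes the resulting run into a single '.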
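
It is worth checking what tr does here with and without -s before relying on it. The sketch below assumes GNU tr, which translates byte by byte and pads SET2 by repeating its last character; the function names tr_plain and tr_squeeze are made up, and ’ is written as octal escapes for its UTF-8 bytes.

```shell
# Assumption: GNU tr (byte-oriented; SET2 is padded with its last character).

# Without -s: each of the three bytes of ’ becomes an apostrophe.
tr_plain() {
    LC_ALL=C tr '\342\200\231' "'"
}

# With -s: the run of apostrophes is squeezed down to one.
tr_squeeze() {
    LC_ALL=C tr -s '\342\200\231' "'"
}
```

$ echo "We’re" | tr_plain
We'''re
$ echo "We’re" | tr_squeeze
We're
$ echo "rock ''n'' roll" | tr_squeeze
rock 'n' roll

The last line shows the catch: runs of apostrophes that were already in the input get squeezed too, and any other multibyte character containing one of those three bytes (“ and ”, for example) will be damaged, so sed or iconv is the safer choice for mixed text.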
