Question 1

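The short answer is that utf8x should be avoided if possible. It loads the ucs package, which went unmaintained for a long time (although it now has a new maintainer), and it breaks various other things.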

See also egreg's answer to this question, which outlines how to get additional characters with the [utf8] option of inputenc.

In general, however, the best way to handle Unicode source (especially with non-Latin scripts) is really XeLaTeX or LuaLaTeX.
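
As a rough sketch of what that looks like in practice, a minimal document for the Unicode engines might read as follows. The font name is an assumption (DejaVu Serif happens to cover Latin, Greek and Cyrillic); any installed font that covers the scripts you need should work in its place.

```latex
% Compile with xelatex or lualatex, not pdflatex.
\documentclass{article}

\usepackage{fontspec}
% Assumption: DejaVu Serif is installed on the system or in the TeX tree.
\setmainfont{DejaVu Serif}

\begin{document}

English, Ελληνικά, Русский, Ŵŵ Ŷŷ.

\end{document}
```

Neither inputenc nor fontenc is loaded here: these engines read UTF-8 natively and take their glyphs directly from the font.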

There is an extended discussion of this here: Encoding remarks. See in particular the comments by Philipp Lehman and Philipp Stephani.

Question 2

In fact, utf8 may not be as restrictive as it seems: it only loads the characters that the font encodings can display.

When you type

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}

the font encoding is still OT1 at the time inputenc is loaded, and OT1 has very few characters. By using

\usepackage[T1]{fontenc} 
\usepackage[utf8]{inputenc}

you allow all displayable utf8 characters as input.
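
For illustration, here is a minimal complete document using the second ordering. The sample words are only there to exercise a few characters that T1 provides; they are not taken from the original answer.

```latex
\documentclass{article}

\usepackage[T1]{fontenc}    % declare T1 first ...
\usepackage[utf8]{inputenc} % ... so inputenc sets up the characters T1 can display

\begin{document}

Grüße, Łódź, Þór

\end{document}
```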

Question 3

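Don't use utf8x; with an up-to-date TeX distribution it turns out to be necessary only for its most obscure features (for instance, faking characters with images from the web).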

Problems with Greek were probably the main reason for adopting utf8x instead of utf8, but this has been solved:

\documentclass{article}

\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[polutonikogreek,english]{babel}

\begin{document}

This is english
\textgreek{Τηις ις γρεεκ}
This is english again.

\end{document}

This compiles happily and prints the Greek text as expected.

Occasional missing definitions can be handled in a simple way. If you are able to type in Unicode characters, such as the Welsh letters

Ââ Êê Îî Ôô Ŵŵ Ŷŷ Ïï

or the Latin vowels with prosodic marks

Ăă Ĕĕ Ĭĭ Ŏŏ Ŭŭ Āā Ēē Īī Ōō Ūū Ȳȳ

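(Unicode lacks a y with breve, while a with breve is already defined by utf8 because it is a letter in Romanian), you can simply add the unknown characters to the list of known ones: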

\documentclass{article}

\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{newunicodechar}

% missing Welsh coverage
\newunicodechar{Ŵ}{\^W}
\newunicodechar{ŵ}{\^w}
\newunicodechar{Ŷ}{\^Y}
\newunicodechar{ŷ}{\^y}

% Latin vowels with prosodic marks    
\newunicodechar{Ĕ}{\u{E}}
\newunicodechar{ĕ}{\u{e}}
\newunicodechar{Ĭ}{\u{I}}
\newunicodechar{ĭ}{\u{\i}}
\newunicodechar{Ŏ}{\u{O}}
\newunicodechar{ŏ}{\u{o}}
\newunicodechar{Ŭ}{\u{U}}
\newunicodechar{ŭ}{\u{u}}
\newunicodechar{Ā}{\=A}
\newunicodechar{ā}{\=a}
\newunicodechar{Ē}{\=E}
\newunicodechar{ē}{\=e}
\newunicodechar{Ī}{\=I}
\newunicodechar{ī}{\={\i}}
\newunicodechar{Ō}{\=O}
\newunicodechar{ō}{\=o}
\newunicodechar{Ū}{\=U}
\newunicodechar{ū}{\=u}
\newunicodechar{Ȳ}{\=Y}
\newunicodechar{ȳ}{\=y}

\begin{document}

Ââ Êê Îî Ôô Ŵŵ Ŷŷ Ïï

Ăă Ĕĕ Ĭĭ Ŏŏ Ŭŭ

Āā Ēē Īī Ōō Ūū Ȳȳ

\end{document}

Note that, for example, the line

\newunicodechar{Ŵ}{\^W}

can also be entered as

\DeclareUnicodeCharacter{0174}{\^W}

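without needing the newunicodechar package, because U+0174 is the code point of LATIN CAPITAL LETTER W WITH CIRCUMFLEX; with \newunicodechar, however, there is no need to look anything up in the Unicode tables.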
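
As a small sketch of the \DeclareUnicodeCharacter route, the four Welsh definitions from the example above could be written with code points instead. The code points U+0174 to U+0177 are taken from the Unicode charts; on a current kernel these characters are already defined, so this only illustrates the syntax.

```latex
\documentclass{article}

\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}

% Same effect as the \newunicodechar lines, but keyed by hexadecimal code point
\DeclareUnicodeCharacter{0174}{\^W} % U+0174, capital W with circumflex
\DeclareUnicodeCharacter{0175}{\^w} % U+0175, small w with circumflex
\DeclareUnicodeCharacter{0176}{\^Y} % U+0176, capital Y with circumflex
\DeclareUnicodeCharacter{0177}{\^y} % U+0177, small y with circumflex

\begin{document}

Ŵŵ Ŷŷ

\end{document}
```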

Update, April 2016

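With a recent LaTeX kernel, almost none of the above definitions are necessary any longer, because T1enc.dfu has been updated and enriched. Of the accented letters in the previous example, only Ȳ and ȳ still need a definition (they will probably be included in the next release).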

\documentclass{article}

\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{newunicodechar}

\newunicodechar{Ȳ}{\=Y}
\newunicodechar{ȳ}{\=y}

\begin{document}

Ââ Êê Îî Ôô Ŵŵ Ŷŷ Ïï

Ăă Ĕĕ Ĭĭ Ŏŏ Ŭŭ

Āā Ēē Īī Ōō Ūū Ȳȳ

\end{document}

Update 2021

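Nowadays all of these accented letters are defined in the kernel, so the following works out of the box. Loading inputenc is no longer needed either, since UTF-8 has been the default input encoding since the 2018 LaTeX release. Note that \usepackage[T1]{fontenc} is not strictly required; it is best to load it anyway, because T1 provides many precomposed accented letters.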

\documentclass{article}

\usepackage[T1]{fontenc}

\begin{document}

Ââ Êê Îî Ôô Ŵŵ Ŷŷ Ïï

Ăă Ĕĕ Ĭĭ Ŏŏ Ŭŭ

Āā Ēē Īī Ōō Ūū Ȳȳ

\end{document}

Question 4

In my experience, Hebrew could not be compiled with utf8 but only with utf8x, using pdflatex from MiKTeX (2.9, for example). Many guides on writing Hebrew in LaTeX recommend utf8x.

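This is not meant to contradict what the learned sages above have said; it is just a case where utf8x seems to be unavoidable (unless someone comes up with an approach like the one Ulrike suggested for Greek).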

Note: this answer is relevant only to pdfTeX+Babel, not to XeTeX+Polyglossia.
