lualatex and line breaks after dashes


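I have just installed TeX Live 2012. As mentioned elsewhere on this site, the release history explains that "For both xetex and xelatex, the parameter \XeTeXdashbreakstate is set to 1 by default. This allows line breaks after em-dashes and en-dashes, which has always been the behavior of plain TeX, LaTeX, LuaTeX, etc."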

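I rarely use xetex; since TeX Live 2011 was released I have used luatex almost exclusively. The one problem I run into regularly is that no line break occurs after a dash. It is now 2012 and I still cannot get such breaks unless I stop typing the dash as a Unicode character, which obviously defeats the spirit of luatex.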

Consider this example, in which the same paragraph is entered three times, each time with the dashes typed differently:

\documentclass[11pt,a5paper]{book}
\usepackage{fontspec,microtype}
\setmainfont[Ligatures=TeX,RawFeature={protrusion=default}]{TeX Gyre Termes}
\pdfprotrudechars=2
\usepackage[showframe]{geometry}

% Will Robertson’s macro from https://tex.stackexchange.com/questions/34608/
\DeclareRobustCommand\dash{%
 \unskip\nobreak\thinspace\textemdash\allowbreak\thinspace\ignorespaces}

\pdfpagewidth=\paperwidth
\pdfpageheight=\paperheight
\pdfinfo{/Title (Jeeves Takes Charge) /Author (P.G. Wodehouse)}
\begin{document}
% using ---
Now, touching this business of old Jeeves---my man, you know---how do we
stand? Lots of people think I’m much too dependent on him. My Aunt
Agatha, in fact, has even gone so far as to call him my keeper. Well,
what I say is: Why not? The man’s a genius. From the collar upward he
stands alone.

% typing the em-dash (easy in utf-8 locale with compose key)
Now, touching this business of old Jeeves—my man, you know—how do we
stand? Lots of people think I’m much too dependent on him. My Aunt
Agatha, in fact, has even gone so far as to call him my keeper. Well,
what I say is: Why not? The man’s a genius. From the collar upward he
stands alone.

% using Will Robertson’s macro
Now, touching this business of old Jeeves\dash my man, you know\dash how do we
stand? Lots of people think I’m much too dependent on him. My Aunt
Agatha, in fact, has even gone so far as to call him my keeper. Well,
what I say is: Why not? The man’s a genius. From the collar upward he
stands alone.
\end{document}

The result is as follows:

[Image: result of compiling the example above with lualatex]

I have tried various free and commercial fonts, and the result is always the same.

Is there a luatex equivalent of \XeTeXdashbreakstate that I can set?

Answer 1

I don't know what the correct answer is. The UTF-8 dash is treated as part of the word. You will see the following in the log file (when you leave out microtype):

Overfull \hbox (19.42857pt too wide) in paragraph at lines 28--33
[][]\EU2/TeXGyreTermes(0)/m/n/10.95 Now, touch-ing this busi-ness of old Jeeves
—my man, you know—how

I am not sure whether TeX is able to break a word when it encounters a character such as the emdash. So my answer is "it's not possible".

What you can do, though it is a bit of a hack, is this:

\catcode`\—=13
\protected\def—{\unskip\nobreak\thinspace\textemdash\allowbreak\thinspace\ignorespaces}

This makes the dash character active and has it insert a "breakable" dash.
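To show the hack in context, here is a minimal sketch of a complete document. It assumes lualatex with fontspec; the narrow minipage is only there to force line breaks and is not part of the original answer.

```latex
\documentclass{article}
\usepackage{fontspec}

% Make the Unicode em-dash (U+2014) an active character ...
\catcode`\—=13
% ... and let it expand to a dash followed by a legal break point.
% \protected keeps it from expanding inside \edef or \write.
\protected\def—{\unskip\nobreak\thinspace\textemdash\allowbreak\thinspace\ignorespaces}

\begin{document}
% Narrow box (an assumption for the demo) so that breaks become necessary.
\begin{minipage}{3cm}
Text—Text—Text—Text—Text—Text—Text—Text
\end{minipage}
\end{document}
```

Because the catcode change is global, every em-dash typed after this point goes through the macro, so it is probably worth checking places where a literal dash is expected, such as verbatim material.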

Another possibility would be to filter the emdash in a LuaTeX callback and replace it with something breakable.
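The callback idea could look roughly like the following. This is only a sketch under some assumptions: that luatexbase.add_to_callback is available (it is loaded along with fontspec/luaotfload), that the luacode package is installed, and that the font leaves the em-dash glyph at code point U+2014. The function name allow_break_after_emdash and the description string are made up for illustration. Instead of replacing the dash, it inserts a zero penalty after each em-dash glyph, which is the same break opportunity that \allowbreak provides in the macro version.

```latex
\documentclass{article}
\usepackage{fontspec}
\usepackage{luacode}

\begin{luacode*}
local GLYPH = node.id("glyph")

-- Walk the paragraph's node list and put a penalty of 0
-- (a legal break point) right after every em-dash glyph.
local function allow_break_after_emdash(head)
  for n in node.traverse_id(GLYPH, head) do
    if n.char == 0x2014 then
      local p = node.new("penalty")
      p.penalty = 0
      node.insert_after(head, n, p)
    end
  end
  return head
end

-- Registered after fontspec is loaded, so it runs after font processing.
luatexbase.add_to_callback("pre_linebreak_filter",
  allow_break_after_emdash, "allow break after em-dash")
\end{luacode*}

\begin{document}
% Narrow box (an assumption for the demo) so that breaks become necessary.
\begin{minipage}{3cm}
Text—Text—Text—Text—Text—Text—Text—Text
\end{minipage}
\end{document}
```

Unlike the active-character hack, this leaves the catcode of the dash alone and only touches typeset paragraphs. It adds no thin spaces around the dash.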

Answer 2

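I think that, from a Unicode point of view, there should be no automatic break point after such a dash. When working with Unicode you should probably insert a zero-width space (U+200B) yourself. That does not work out of the box in lualatex; you need something like luaunicodespace (https://github.com/khaledhosny/luaunicodespace):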

\documentclass[]{scrbook}
\usepackage{fontspec}
\directlua {
require('luaunicodespace')
luatexbase.add_to_callback("pre_linebreak_filter", luaunicodespace.handler, "luaotfload", 1)
luatexbase.add_to_callback("hpack_filter",         luaunicodespace.handler, "luaotfload", 1)
}
\textwidth=3cm
\begin{document}
Text—Text—Text—Text—Text—Text—Text—Text

Text—^^^^200bText—^^^^200bText—^^^^200bText—^^^^200bText—^^^^200bText—^^^^200bText—^^^^200bText

\end{document}

Answer 3

See the last paragraph here: http://www.heise.de/newsticker/meldung/TeX-Live-Jahrgang-2012-1634828.html. From TeX Live 2012 on, you have to set the parameter \XeTeXdashbreakstate to zero explicitly.
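For XeTeX users, that setting would look roughly like this. Going by the release note quoted in the question, a value of 1 allows breaks after dashes, so 0 should turn them off again. The \ifdefined guard is an addition here so that the line is skipped under engines such as LuaTeX, which do not define this primitive; it therefore does not address the LuaTeX problem in the question.

```latex
% Only XeTeX defines \XeTeXdashbreakstate; skip the assignment elsewhere.
\ifdefined\XeTeXdashbreakstate
  \XeTeXdashbreakstate=0 % 0 = no automatic break after em- and en-dashes
\fi
```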
