Suppose we want to give the hyphenation points of long German compounds different priorities. Here is what we have tried so far:
\documentclass[ngerman]{article}
\usepackage{iftex}
\ifTUTeX
\else
\usepackage[T1]{fontenc}
\fi
\usepackage[ngerman]{babel}
\babelprovide[hyphenrules=ngerman-x-latest]{ngerman}
\ifluatex
%\hyphenpenalty is 50 by default
\exceptionpenalty=51%%% ad-hoc value greater than \hyphenpenalty.
\exhyphenpenalty=46%%% an ad-hoc value less than \hyphenpenalty (though usually they coincide). We choose the value such that it's one less than the minimum over the hyphenation penalties of the explicitly specified compounds.
% \babelhyphenation[ngerman]{
% In{-}{}{}[3]ter{-}{}{}[1]ak{-}{}{}[2]ti{-}{}{}[4]ons-dia{-}{}{}[2]gram{-}{}{}[3]me
% In{-}{}{}[3]ter{-}{}{}[1]ak{-}{}{}[2]ti{-}{}{}[4]ons-ver{-}{}{}[1]fei{-}{}{}[3]ne{-}{}{}[3]rung
% In{-}{}{}[4]ter{-}{}{}[2]ak{-}{}{}[3]ti{-}{}{}[5]ons{-}{}{}[1]ver{-}{}{}[2]fei{-}{}{}[4]ne{-}{}{}[4]rungs-dia{-}{}{}[3]gramm
% }%%% this way we get too large penalties. Breaking the words specified this way is discouraged (and thus, breaking the unspecified words is implicitly encouraged).
\babelposthyphenation{ngerman}{In|ter|ak|ti|ons|dia|gram|me}%% set potentially different preferences of the hyphenation points in “Interaktionsdiagramme”
{
{}, {}, % In
{pre=-, penalty=51, data=2},
{}, {}, {}, % ter
{pre=-, penalty=49, data=6},
{}, {}, % ak
{pre=-, penalty=50, data=9},
{}, {}, % ti
{pre=-, penalty=52, data=12}, %%% higher penalty because of the pronunciation t͡si̯oː
{}, {}, {}, % ons
{pre=-, penalty=48, data=16},
{}, {}, {}, % dia
{pre=-, penalty=50, data=20},
{}, {}, {}, {}, % gram
{pre=-, penalty=51, data=25},
{}, {} % me
}% the arithmetic mean of the penalties is ≈ 50.14.
\babelposthyphenation{ngerman}{In|ter|ak|ti|ons|ver|fei|ne|rung}%% set potentially different preferences of the hyphenation points in “Interaktionsverfeinerung”
{
{}, {}, % In
{pre=-, penalty=51, data=2},
{}, {}, {}, % ter
{pre=-, penalty=49, data=6},
{}, {}, % ak
{pre=-, penalty=50, data=9},
{}, {}, % ti
{pre=-, penalty=52, data=12}, %%% higher penalty because of the pronunciation t͡si̯oː
{}, {}, {}, % ons
{pre=-, penalty=48, data=16},
{}, {}, {}, % ver
{pre=-, penalty=49, data=20},
{}, {}, {}, % fei
{pre=-, penalty=51, data=24},
{}, {}, % ne
{pre=-, penalty=51, data=27},
{}, {}, {}, {} % rung
}% the arithmetic mean of the penalties is 50.125.
\babelposthyphenation{ngerman}{In|ter|ak|ti|ons|ver|fei|ne|rungs|dia|gramm}%% set potentially different preferences of the hyphenation points in “Interaktionsverfeinerungsdiagramm”
{
{}, {}, % In
{pre=-, penalty=51, data=2},
{}, {}, {}, % ter
{pre=-, penalty=49, data=6},
{}, {}, % ak
{pre=-, penalty=50, data=9},
{}, {}, % ti
{pre=-, penalty=52, data=12}, %%% higher penalty because of the pronunciation t͡si̯oː
{}, {}, {}, % ons
{pre=-, penalty=48, data=16},
{}, {}, {}, % ver
{pre=-, penalty=49, data=20},
{}, {}, {}, % fei
{pre=-, penalty=51, data=24},
{}, {}, % ne
{pre=-, penalty=51, data=27},
{}, {}, {}, {}, {}, % rungs
{pre=-, penalty=47, data=33},
{}, {}, {}, % dia
{pre=-, penalty=50, data=37},
{}, {}, {}, {}, {} % gramm
}%the arithmetic mean of the penalties is 49.8.
\fi
\showoutput
\begin{document}
Interaktionsdiagramme
Interaktionsverfeinerung
Interaktionsverfeinerungsdiagramm
\end{document}
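This example is for demonstration only and is therefore small. In the full real-world document we currently have 77 \babelposthyphenation[…]{…}{…} entries, one entry per word. The result is great: in the console output the words get hyphenation penalties around the standard \hyphenpenalty, and we have specified meaningful preferences for how to break the compounds.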
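We made a few general choices that we are not sure about.
First, we set \exhyphenpenalty to one less than the minimum of all explicitly specified penalties. The rationale is that the preferred break point is always after a hyphen that is already part of a compound, but there is no need to encourage a line break after such a hyphen more than necessary.
Second, the penalties within each word are, on average, as close as possible to \hyphenpenalty (50 by default). To measure “as close as possible on average”, we arbitrarily chose the arithmetic mean (only because it is simple to compute, for no other reason). Alternatives would be the geometric mean, the logarithmic mean, the quadratic mean, the harmonic mean, the median, or some other kind of average. Our goal is this: for every word explicitly specified via \babelposthyphenation, LuaLaTeX should try to move the position of the break inside the word, rather than encourage or discourage breaking the specified word as a whole, or encourage or discourage breaking other words in a paragraph that contains the specified word (compared with the situation in which nothing is specified and every penalty is the standard \hyphenpenalty). Of course, we know that this is an imprecise, possibly idealistic and unattainable goal.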
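The numbers behind both choices can be rechecked inside LaTeX itself. The following is only a minimal sketch, assuming the `xfp` package (its `\fpeval` is also built into recent LaTeX kernels); the penalty values are copied from the three entries in the example, and the geometric-mean line is added purely for comparison and is not part of our setup.

```latex
\documentclass{article}
\usepackage{xfp}% provides \fpeval; assumed to be harmless on kernels that already define it
\begin{document}
% Arithmetic mean of the explicit penalties, one line per word.
Interaktionsdiagramme:
  \fpeval{round((51+49+50+52+48+50+51)/7, 2)}\par % 351/7, prints 50.14
Interaktionsverfeinerung:
  \fpeval{(51+49+50+52+48+49+51+51)/8}\par % 401/8, prints 50.125
Interaktionsverfeinerungsdiagramm:
  \fpeval{(51+49+50+52+48+49+51+51+47+50)/10}\par % 498/10, prints 49.8

% Value chosen for \exhyphenpenalty: smallest explicit penalty minus one.
exhyphenpenalty:
  \fpeval{min(51,49,50,52,48,47) - 1}\par % prints 46

% Geometric mean of the first word, for comparison only (not used in our setup).
Geometric mean:
  \fpeval{round((51*49*50*52*48*50*51)^(1/7), 2)}
\end{document}
```

With penalties that all lie within a few units of 50, the different kinds of average can be expected to differ only slightly, so the choice of mean probably matters less than the spread of the individual values.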
Do we need to adjust these two choices in general? If so:
Why?
How?
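Of course, these choices are only a guideline, a rule of thumb that we intend to follow in general. (No doubt we may deviate from such a guideline in the future for particular words that we do not know yet, for reasons that we do not know yet.)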