Guidelines for setting hyphenation preferences in German compounds

Suppose we want to assign different priorities to the hyphenation points of long German compounds. Here is what we have tried so far:

\documentclass[ngerman]{article}
\usepackage{iftex}
\ifTUTeX
\else
\usepackage[T1]{fontenc}
\fi
\usepackage[ngerman]{babel}
\babelprovide[hyphenrules=ngerman-x-latest]{ngerman}
\ifluatex
  %\hyphenpenalty is 50 by default
  \exceptionpenalty=51%%% ad-hoc value greater than \hyphenpenalty.
  \exhyphenpenalty=46%%% an ad-hoc value less than \hyphenpenalty (though usually they coincide). We choose the value such that it's one less than the minimum over the hyphenation penalties of the explicitly specified compounds.
  % \babelhyphenation[ngerman]{
  %   In{-}{}{}[3]ter{-}{}{}[1]ak{-}{}{}[2]ti{-}{}{}[4]ons-dia{-}{}{}[2]gram{-}{}{}[3]me
  %   In{-}{}{}[3]ter{-}{}{}[1]ak{-}{}{}[2]ti{-}{}{}[4]ons-ver{-}{}{}[1]fei{-}{}{}[3]ne{-}{}{}[3]rung
  %   In{-}{}{}[4]ter{-}{}{}[2]ak{-}{}{}[3]ti{-}{}{}[5]ons{-}{}{}[1]ver{-}{}{}[2]fei{-}{}{}[4]ne{-}{}{}[4]rungs-dia{-}{}{}[3]gramm
  % }%%% this way we get too large penalties.  Breaking the words specified this way is discouraged (and thus, breaking the unspecified words is implicitly encouraged).
  \babelposthyphenation{ngerman}{In|ter|ak|ti|ons|dia|gram|me}%% set potentially different preferences of the hyphenation points in “Interaktionsdiagramme”
  {
    {}, {},         % In
    {pre=-, penalty=51, data=2},
    {}, {}, {},     % ter
    {pre=-, penalty=49, data=6},
    {}, {},         % ak
    {pre=-, penalty=50, data=9},
    {}, {},         % ti
    {pre=-, penalty=52, data=12}, %%% higher penalty because of the pronunciation t͡si̯oː
    {}, {}, {},     % ons
    {pre=-, penalty=48, data=16},
    {}, {}, {},     % dia
    {pre=-, penalty=50, data=20},
    {}, {}, {}, {}, % gram
    {pre=-, penalty=51, data=25},
    {}, {}          % me
  }% the arithmetic mean of the penalties is ≈ 50.14.
  \babelposthyphenation{ngerman}{In|ter|ak|ti|ons|ver|fei|ne|rung}%% set potentially different preferences of the hyphenation points in “Interaktionsverfeinerung”
  {
    {}, {},        % In
    {pre=-, penalty=51, data=2},
    {}, {}, {},    % ter
    {pre=-, penalty=49, data=6},
    {}, {},        % ak
    {pre=-, penalty=50, data=9},
    {}, {},        % ti
    {pre=-, penalty=52, data=12}, %%% higher penalty because of the pronunciation t͡si̯oː
    {}, {}, {},    % ons
    {pre=-, penalty=48, data=16},
    {}, {}, {},    % ver
    {pre=-, penalty=49, data=20},
    {}, {}, {},    % fei
    {pre=-, penalty=51, data=24},
    {}, {},        % ne
    {pre=-, penalty=51, data=27},
    {}, {}, {}, {} % rung
  }% the arithmetic mean of the penalties is 50.125.
  \babelposthyphenation{ngerman}{In|ter|ak|ti|ons|ver|fei|ne|rungs|dia|gramm}%% set potentially different preferences of the hyphenation points in “Interaktionsverfeinerungsdiagramm”
  {
    {}, {},             % In
    {pre=-, penalty=51, data=2},
    {}, {}, {},         % ter
    {pre=-, penalty=49, data=6},
    {}, {},             % ak
    {pre=-, penalty=50, data=9},
    {}, {},             % ti
    {pre=-, penalty=52, data=12}, %%% higher penalty because of the pronunciation t͡si̯oː
    {}, {}, {},         % ons
    {pre=-, penalty=48, data=16},
    {}, {}, {},         % ver
    {pre=-, penalty=49, data=20},
    {}, {}, {},         % fei
    {pre=-, penalty=51, data=24},
    {}, {},             % ne
    {pre=-, penalty=51, data=27},
    {}, {}, {}, {}, {}, % rungs
    {pre=-, penalty=47, data=33},
    {}, {}, {},         % dia
    {pre=-, penalty=50, data=37},
    {}, {}, {}, {}, {}  % gramm
  }%the arithmetic mean of the penalties is 49.8.
\fi
\showoutput
\begin{document}
Interaktionsdiagramme
Interaktionsverfeinerung
Interaktionsverfeinerungsdiagramm
\end{document}

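This example is for demonstration only and is therefore small. In the full real-world document we currently have 77 \babelposthyphenation[…]{…}{…} entries, one per word. The result is very good: in the console output, the hyphenation points of these words carry the penalties we specified rather than the uniform \hyphenpenalty, and we get to state meaningful preferences for how each compound should be broken.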

We have made two general choices that we are not sure about:

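  1. We set \exhyphenpenalty to the minimum of all explicitly specified penalties, minus 1. The rationale is that the preferred break point should always be after an explicit hyphen in a compound, but there is no need to encourage a break after such a hyphen more than necessary.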

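  2. For each word, the penalties should on average be as close as possible to \hyphenpenalty (50 by default). To measure "as close as possible on average", we arbitrarily chose the arithmetic mean (only because it is easy to compute, for no other reason). Other options would be the geometric mean, logarithmic mean, quadratic mean, harmonic mean, median, or some other kind of average. Our goal is this: for each word explicitly specified via \babelposthyphenation, LuaLaTeX should try to move the position of the break within the word. It should not encourage or discourage breaking the specified word as a whole, nor encourage or discourage breaking other words in the paragraph that contains it (compared with the situation where nothing is specified and every penalty is the standard \hyphenpenalty). Of course, we know that this is an imprecise, possibly idealistic and unattainable goal.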
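
As a rough cross-check of the two choices above, the numbers can be recomputed at compile time. This is only a minimal sketch and not part of the original setup: it assumes the xfp package (or a LaTeX kernel from 2022-06 or later, which provides \fpeval directly), and the penalty lists are copied by hand from the three entries in the example.

```latex
\documentclass{article}
\usepackage{xfp}% provides \fpeval; an assumption, not in the original preamble
\begin{document}
% Choice 2: arithmetic mean of the penalties of each entry, rounded to 3 decimals
Interaktionsdiagramme:
  \fpeval{round((51+49+50+52+48+50+51)/7, 3)}\par
Interaktionsverfeinerung:
  \fpeval{round((51+49+50+52+48+49+51+51)/8, 3)}\par
Interaktionsverfeinerungsdiagramm:
  \fpeval{round((51+49+50+52+48+49+51+51+47+50)/10, 3)}\par
% Choice 1: one less than the smallest explicitly specified penalty.
% The third entry contains the overall minimum (47), so its list suffices here.
exhyphenpenalty:
  \fpeval{min(51,49,50,52,48,49,51,51,47,50) - 1}
\end{document}
```

The sums are 351, 401 and 498, so the means agree with the values quoted in the comments of the example (about 50.14, 50.125 and 49.8), and the last line gives the 46 used for \exhyphenpenalty. For positive values the harmonic, geometric, arithmetic and quadratic means are always ordered from smallest to largest in that sequence, so with penalties clustered this tightly around 50, switching to one of those other means would shift the result only slightly.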

Do these two choices need to be adjusted in general? If so:

  • Why?

  • How?

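Of course, these choices are only a guideline, a rule of thumb that we intend to follow in the general case. (No doubt we may find ourselves deviating from such a guideline in the future for special, specific words, for reasons we do not yet know.)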
