Avoiding a page break after the first word of a sentence


Is there a way to tell TeX to avoid a page break after the first word of a sentence?

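... he
was already
dead. His neck had apparently been
broken. Lightning flashed a third time,
and his face leapt at me. I sprang up. It
(continued on next page)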

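You then have to turn the page to read the rest of the sentence. The stranded word is not on a line of its own, so it cannot be penalized the way an orphan line can.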

Can TeX be told to resolve such cases by breaking the page before the first word instead?

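Note:

Although the answers below are very useful, the general consensus is that the best practice is to let a proofreader spot the problem and then fix it by hand.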
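
As a rough sketch of what such a manual fix might look like (the choice of command is an assumption for illustration, not something prescribed in the question): once the text is otherwise final and the proofreader has flagged a page, the length of that one page can be adjusted so that the break falls elsewhere.

```latex
% Placed in the text that ends up on the flagged page,
% once the rest of the document has stabilised.
\enlargethispage{\baselineskip}   % let this page run one line long
% Alternatively, make the page one line short, which pushes the
% last line (with the stranded word) onto the next page:
% \enlargethispage{-\baselineskip}
```

Which of the two variants looks better depends on the surrounding pages, so this is a per-case judgment rather than a general rule.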

Answer 1

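Edit: I forgot to mention that, although this answer works in simple cases, relying on it for anything serious is a bad idea, because it can break in many different ways. In general, changing catcodes is a bad idea...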

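Edit: Lev Bishop pointed out that inserting \nopagebreak after the first word of every sentence is too much, because it forbids a page break after every line that contains the first word of a sentence. Here I fix this by using the aux file and comparing the page numbers on either side of the space that follows the first word of the sentence.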

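One can make ., ! and ? active, and have them read the next word and place a \nopagebreak after the first word of each sentence (except the first word of a paragraph).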

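If we still want to be able to use . in dimensions (e.g. width=3.4cm in \includegraphics), things get more complicated. In addition, the last punctuation mark of a paragraph needs special treatment (especially when the paragraph does not end exactly with that punctuation mark, e.g. when a quotation mark follows it).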

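Hopefully the code below works. For now, I have put a * after the \nopagebreak, simply to make it visible where the \nopagebreak is inserted. Remove it for real use, of course.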

\documentclass[a5paper]{article}


\makeatletter
% \begin{macro}{\eos@text}
% The code below inserts "\eos@text" each time a space following the
% first word of the sentence falls on the separation between two pages.
%    \begin{macrocode}
\newcommand{\eos@text}{\nopagebreak[4]*}
%    \end{macrocode}
% \end{macro}
% 
% \begin{macro}{\eos@active,\eos@active@text}
%   
%   "#1" is the character (".", "!", "?") that ended the sentence.
%   We distinguish various cases depending on the following non-space 
%   character, "#2". In every case, we start by putting the
%   punctuation "#1" back.
%   
%   If "#2" is a digit, we assume that we are in the middle of a
%   number such as "width=5.3em" in, say, "\includegraphics".
%   (This is only relevant for ".", though.)
%   
%   If "#2" is "\par", that means that the punctuation is the last 
%   one in the paragraph, so we can safely do nothing.
%   
%   If "#2" is a quote, we need to treat things differently. (Here
%   we actually pretend that the quote is in fact the end-of-sentence.)
%   
%   Finally, in every other case, we grab the first word and place
%   a non-breakable space afterwards.
%   
%   In each case, we put back what directly followed the punctuation
%   right after our test.
% 
%   \begin{macrocode}
\newcommand{\eos@active}[2]{%
  #1%
  \ifnum9<1#2\space 
  \else
    \ifx\par#2%
    \else
      \ifx'#2%
        \expandafter\expandafter\expandafter\expandafter
        \expandafter\expandafter\expandafter\eos@active
      \else
        \expandafter\expandafter\expandafter\expandafter
        \expandafter\expandafter\expandafter\eos@active@text
      \fi
    \fi
  \fi
  #2%
}
%    \end{macrocode}
%    Grabbing the following word: the first "\newcommand" checks that
%    the command is not already defined. Then we define it through "\def"
%    because its argument is a bit more complicated than usual, delimited
%    by a space. Also to note is the initial space (before "#1"): that 
%    was lost in our test, and we put it back.
%    
%    Earlier, we were putting a "\nopagebreak" after that first word,
%    but now, we do something more tricky, only putting a "\nopagebreak"
%    if at the previous run of LaTeX there was a page break there.
%    
%    \begin{macrocode}
\newcommand{\eos@active@text}{}
\def\eos@active@text#1 { #1\eos@space}
%    \end{macrocode}
% \end{macro}

% \begin{macro}{\eos@space}
%   As Lev Bishop mentions, putting "\nopagebreak" forbids a page break
%   after the current line. So we don't want to always insert a
%   "\nopagebreak"! The test \emph{is} crazy\ldots Too lazy to explain the 
%   details. "\count0" is the page number, "\write" rather than 
%   "\immediate\write" in order to get the page number when typeset
%   rather than when read. "\csname eos@mark@\the\count0\endcsname"
%   creates a control sequence (equal to relax) corresponding to the
%   page number. And the test "\eos@pagetest" checks whether the
%   control sequence corresponding to the page \emph{after} the space
%   is already defined. If it is not, we write something to the aux file.
%   
%   
%   \begin{macrocode}
\newcount\eos@current
\newcount\eos@pageno
\newcommand{\eos@space}{%
  \advance\eos@current by\@ne
  \write\@mainaux{\relax
    \expandafter\@gobble\csname eos@mark@\the\count0\endcsname}%
  \csname eos@\romannumeral\eos@current\endcsname
  \space
  \write\expandafter\@mainaux\expandafter{%
    \expandafter\eos@pagetest\expandafter{\romannumeral\eos@current}\relax}%
}
%   \end{macrocode}
% \end{macro}
% 
% \begin{macro}{\eos@pagetest}
%   If the page number is a brand new page number (i.e. if 
%   "\csname eos@mark@\the\count0\endcsname" is not yet defined),
%   we write something to the aux file. Otherwise, we don't do anything.
%    \begin{macrocode}
\newcommand{\eos@pagetest}[1]{%
  \unless\ifcsname eos@mark@\the\count0\endcsname
  \noexpand\eos@rewrite{\gdef\csname eos@#1\endcsname{\noexpand\eos@text}}%
  \fi
}%
\newcommand{\eos@rewrite}[1]{#1%
  \ifx\usepackage\documentclass
  \expandafter\@gobble
  \else
  \expandafter\AtBeginDocument
  \fi
  {\immediate\write\@mainaux{\unexpanded{\eos@rewrite{#1}}}}%
}
%    \end{macrocode}
%    "\eos@rewrite" is meant for use in the aux file, and 
%    rewrites itself to the aux file. The test is very bad, 
%    checks whether we are reading the aux file at the start
%    or the end of the document (any better test?).
%
%    If we didn't rewrite, a space that changes page would lead
%    to inserting "\nopagebreak[4]", but at the next run that would
%    prevent a page break. Then the space would no longer be at the
%    change of a page. So it would not insert "\nopagebreak[4]" for
%    the next run. Thus, in the next run, the space would (probably)
%    be at the change of pages again, etc. 
%    
%    So we make that "\nopagebreak" resilient. If you need to reset 
%    all of this, just delete the .aux file.
% \end{macro}
%  

% \begin{macro}{\activate@eos}
%     It's better to make ".", "!", "?" active at "\begin{document}".
%     For that we define "\activate@eos" which makes its 
%     argument active, and defines it to be an end-of-sentence ("eos").
%     \begin{macrocode}
\newcommand{\activate@eos}[1]{%
  \begingroup
  \lccode`\~`#1\space
  \lowercase{%
    \endgroup
    \catcode`#1=13\relax
    \newcommand{~}{\eos@active{#1}}%
  }%
}
\AtBeginDocument{%
  \activate@eos{.}%
  \activate@eos{!}%
  \activate@eos{?}%
}
%     \end{macrocode}
%   Am I missing a possible end-of-sentence marker?
% \end{macro}
\makeatother


% ==========================================================
% Just for demonstration
\usepackage[text={5cm,36pt}]{geometry}


\begin{document}

% We repeat the text until it fills 10 pages.
\loop\ifnum\count0<10\relax 
Greetings. He will. Will he? No, he won't. Maybe not, anyways. Although, perhaps. And that changes. Constantly. Is it worth? It really is not. Short sentences, why? To test better. Make sure it works!

I'm lazy. So copy. And paste. Repeating the same. Many times. Of course! Just a bit more.

\repeat

\end{document}
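
The digit test used in \eos@active above can be tried in isolation. The following is a minimal sketch; the macro name \isdigit is made up for this illustration and is not part of the answer's code. If the argument is a digit, TeX reads it as part of the number that starts with 1, giving a value of at least 10, so the comparison with 9 succeeds; any other token ends the number at 1 and the comparison fails.

```latex
\documentclass{article}

% \isdigit is a hypothetical helper name used only for this demo.
% "\ifnum9<1#1\space": if #1 is a digit d, the number read is "1d"
% (between 10 and 19), so 9 < 1d holds; \space ends the number.
% Otherwise the number read is just 1, and 9 < 1 is false.
\newcommand{\isdigit}[1]{%
  \ifnum9<1#1\space
    digit%
  \else
    not a digit%
  \fi
}

\begin{document}
\isdigit{5}, \isdigit{x}
\end{document}
```

This is why "width=5.3em" survives once "." is active: the character after the period is a digit, so \eos@active simply puts the period and the digit back without grabbing a word.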

Answer 2

(This is not really an answer, but it is too long for a comment.)

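Several answers have been given that use various methods to insert \nopagebreak after the first word of a sentence. Unfortunately, none of them works very well, because they all forbid too many page breaks and, at the same time, fail to forbid all the breaks that should be forbidden.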

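To see that the former is true, note that \nopagebreak is effectively equivalent to \vadjust{\penalty10000}. In other words, it tries to forbid a page break after any line that contains the first word of a sentence (not only after lines on which that first word is the last word).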
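
A minimal sketch of this point (the sample sentences are made up for illustration): inside a paragraph, both markings below put a penalty of 10000 after whichever line the marked word lands on, regardless of where on that line the word sits.

```latex
\documentclass{article}
\begin{document}
% LaTeX form: inside a paragraph, \nopagebreak (same as
% \nopagebreak[4]) takes effect after the line containing this point.
He was dead. His\nopagebreak\ neck had obviously been broken.

% Roughly the same thing in primitive terms: \vadjust material is
% appended to the vertical list right after the current line.
He was dead. His\vadjust{\penalty10000} neck had obviously been broken.
\end{document}
```

Neither form knows whether "His" is the last word on its line, which is exactly why too many page breaks end up forbidden.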

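To see that the latter is true, note that TeX inserts various penalties into the vertical list by itself in certain situations. The following list is probably incomplete: \interlinepenalty, \clubpenalty, \widowpenalty, \brokenpenalty, \displaywidowpenalty, \predisplaypenalty, \postdisplaypenalty. If one of these situations arises and the corresponding penalty is less than 10000, a break can still occur (when there are two or more consecutive penalties, TeX can break at whichever of them it finds convenient).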
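
To check which of these penalties could still permit a break in a given setup, their current values can be written to the log. This is only a diagnostic sketch; the values depend on the document class and on any packages loaded, so none are quoted here.

```latex
\documentclass{article}
\begin{document}
% \typeout expands \the<parameter>, so the current values appear
% on the terminal and in the .log file.
\typeout{interlinepenalty = \the\interlinepenalty}
\typeout{clubpenalty = \the\clubpenalty}
\typeout{widowpenalty = \the\widowpenalty}
\typeout{brokenpenalty = \the\brokenpenalty}
\typeout{displaywidowpenalty = \the\displaywidowpenalty}
\typeout{predisplaypenalty = \the\predisplaypenalty}
\typeout{postdisplaypenalty = \the\postdisplaypenalty}
Some text, so that a page is produced.
\end{document}
```

Any value reported below 10000 marks a place where TeX may still break the page even though a \nopagebreak sits next to it.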

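It is most likely possible to get what you want by using a special sentinel value as the penalty, which the output routine then treats specially (similar to how plain TeX's \supereject sets a penalty of -20000 that its output routine then handles specially). It may also be possible with LuaTeX, but the solutions given so far do not achieve what you want (at least, not reliably).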

Answer 3

A workable, though non-TeX, solution might be to run sed over the source text like this:

sed 's/\([\!?\."'"'"'] \+\)\([^ ]\+\) \+/\1\2\\nopagebreak\\ /g' file.tex > file.sed.tex

It is a somewhat unsafe trick, but it turns

"I was walking through the roads to clear my brain," he said. "And suddenly--fire, earthquake, death!" He relapsed into silence, with his chin now sunken almost to his knees. Presently he began waving his hand. "All the work--all the Sunday schools--What have we done--what has Weybridge done?

into:

"I was walking through the roads to clear my brain," he\nopagebreak\ said. "And\nopagebreak\ suddenly--fire, earthquake, death!" He\nopagebreak\ relapsed into silence, with his chin now sunken almost to his knees. Presently\nopagebreak\ he began waving his hand. "All\nopagebreak\ the work--all the Sunday schools--What have we done--what has Weybridge done?

(It does not change the first "I was ..." because the line, and the paragraph, starts there.)

Answer 4

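I have never heard of an automatic feature that does this. Of course, you can use ~ (a non-breakable space) between the first and second words of particular sentences, or of every sentence. (La)TeX will then never break the line there.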

Better still would be to add \nopagebreak between those words. However, doing that in every sentence would be even more tedious than using ~.
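
The two manual markings can be compared side by side in a minimal sketch (the sample sentence is made up for illustration):

```latex
\documentclass{article}
\begin{document}
% Tie: no line break, and hence no page break, between the two words.
I sprang to my feet. It~leaped upon me.

% \nopagebreak followed by an explicit space: a line break is still
% allowed here, but a page break after that line is discouraged.
I sprang to my feet. It\nopagebreak\ leaped upon me.
\end{document}
```

The tie changes line breaking for every marked sentence, while \nopagebreak leaves line breaking alone and only affects page breaking, which is presumably why it is described as the better of the two.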
