Does using lstinputlisting with tex4ht hide all ASCII characters in the caption?


I am using tex4ht to build my LaTeX sources into an EPUB. I have run into a problem with \lstinputlisting: the caption (in Vietnamese) is not displayed correctly.

\lstinputlisting[float,language=C++,caption={Một chú thích viết bằng tiếng Việt.}]{data.cpp}


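I checked the intermediate XHTML file generated by tex4ht, and it seems that every ASCII character in the caption has been removed, leaving only the Vietnamese characters. The last visible non-ASCII character is left "dangling" in the next span: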

... [redacted]
<body>
<!-- l. 8 --><p>Đạ ộúíếằế</p><pre class='lstinputlisting' id='listing-1'><span class='label'><a id='x1-4r1'></a><span class='ec-lmr-7'>ệ</span></span><span style='color:#000000'><span class='ec-lmtti-10x-x-90'>//</span></span><span style='color:#000000'> <span class='ec-lmtti-10x-x-90'>file.cpp</span> 
... [redacted]

...so the whole caption "Đoạn code 1: Một chú thích viết bằng tiếng Việt." is stripped down to "Đạ ộúíếằế" and "ệ" (the dangling character).

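To my surprise, if I use the lstlisting environment instead, tex4ht does not seem to have any problem with the Vietnamese caption. However, I cannot use this as a workaround, because my project will contain a large number of standalone code files.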

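Do you know why this happens? I have attached a minimal reproducible example that demonstrates my current project (the build script is a batch script because I am on a Windows machine).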

Thanks in advance!

Answer 1

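TeX4ht uses some specials to suppress unwanted characters in \lstinputlisting. Without them, some unwanted code would be inserted into the HTML output. Unfortunately, the caption is suppressed as well, so we need to turn off the character suppression for it. This can be done with the following version of listings.4ht: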

% listings.4ht (2022-05-22-13:50), generated from tex4ht-4ht.tex
% Copyright 2001-2009 Eitan M. Gurari
% Copyright 2009-2022 TeX Users Group
%
% This work may be distributed and/or modified under the
% conditions of the LaTeX Project Public License, either
% version 1.3c of this license or (at your option) any
% later version. The latest version of this license is in
%   http://www.latex-project.org/lppl.txt
% and version 1.3c or later is part of all distributions
% of LaTeX version 2005/12/01 or later.
%
% This work has the LPPL maintenance status "maintained".
%
% The Current Maintainer of this work
% is the TeX4ht Project <http://tug.org/tex4ht>.
%
% If you modify this program, changing the
% version identification would be appreciated.
\immediate\write-1{version 2022-05-22-13:50}

\@ifpackageloaded{xcolor}{}{%
\RequirePackage{xcolor}
}
\def\lst@makecaption#1#2{\cptA: #1\if :#1:\else\cptB:\fi \cptC: #2\cptD:}

\newif\iflstnest
\append:defII\lst@EnterMode{%
  \ifx \lsthk:EveryLine\:UnDef
     \let\lsthk:EveryLine\lsthk@EveryLine
  \fi
  \ifx \lsthk:EveryLine\lsthk@EveryLine
      \pend:def\lsthk@EveryLine{\c:listings
             \def\dd:listings{\d:listings\let\dd:listings\empty}}%
  \fi
  \append:def\lsthk@EveryPar{\dd:listings}%
  \iflstnest\else
  \a:listings\fi\bgroup
  %\Configure{$}{}{}{}%
  \aftergroup\lst:EnterMode  }
\def\lst:EnterMode{\iflstnest\else\b:listings\fi\egroup}
\NewConfigure{listings}{4}
\let\dd:listings=\empty
\append:defI\lst@Init{\csname a:listings-init\endcsname\global\lstnesttrue}
\pend:def\lst@DeInit{\csname b:listings-init\endcsname\global\lstnestfalse}
\NewConfigure{listings-init}{2}
\lst@AddToHook{TextStyle}{%
   \Configure{listings}{}{}{}{}%
   \a:lstinline \bgroup \aftergroup\b:lstinline\aftergroup\egroup
  }
\NewConfigure{lstinline}{2}
\pend:defI\lst@MakeCaption{%
  \let\lst:addcontentsline\addcontentsline
  \def\addcontentsline{\gHAdvance\TitleCount by 1
                       \lst:addcontentsline}%
}
\append:defI\lst@MakeCaption{%
  \let\addcontentsline\lst:addcontentsline
}
\ConfigureToc{lol} {}{\empty}{}{\newline}
\lst@UserCommand\lstlistoflistings{\bgroup%
    \ifdefined\chapter\chapter*{\lstlistlistingname}\else\section*{\lstlistlistingname}\fi%
    \TableOfContents[lol]%
  \egroup}
\def\:tempa{%
   \ifx\lst@OutputBox\@gobble\else \the\everypar \fi
   \global\advance\lst@newlines\m@ne
   \lst@newlinetrue
}%
\HLet\lst@NewLine\:tempa
\def\:tempa#1{
    \begingroup%
      \lsthk@PreSet\gdef\lst@intname{#1}%
      \expandafter\lstset\expandafter{\lst@set}%
      \lsthk@DisplayStyle%
      \catcode\active=\active%
      \a:lstinputlisting
      \ht:special{t4ht@[}
      \pend:def\cptA:{\ht:special{t4ht@]}}
      \append:def\cptD:{\ht:special{t4ht@[}}
      \lst@Init\relax 
      \let\lst@gobble\z@%
      \lst@SkipToFirst%
      \lst@ifprint \def\lst@next{\input{#1}}%
             \else \let\lst@next\@empty \fi%
      \ht:special{t4ht@]}
      \lst@next\ht:special{t4ht@[}\lst@DeInit\ht:special{t4ht@]}%
      \b:lstinputlisting%
    \endgroup}

\HLet\lst@InputListing\:tempa
\NewConfigure{lstinputlisting}{2}
\def\:tempa#1{%
   \setbox\z@\hbox{{\lst@currstyle{\kern#1}}}%
   \global\advance\lst@currlwidth \wd\z@
   \tmp:dim=#1 \let\:tempc=\empty
   \loop \ifdim \tmp:dim>\a:lst@Kern
      \advance \tmp:dim by -\a:lst@Kern
      \advance \tmp:dim by -\b:lst@Kern
      \append:def\:tempc{\:nbsp}%
   \repeat
   \setbox\z@\hbox{{\lst@currstyle{\:tempc}}}%
   \lst@OutputBox\z@}
\HLet\lst@Kern\:tempa
\NewConfigure{lst@Kern}{2}
\Configure{lst@Kern}{0.499em}{0.1em}
\def\lst@outputspace{\HCode{ }}

\HLet\lst@frameInit=\empty
\HLet\lst@frameExit=\empty

\Hinput{listings}
\endinput

The important part is this code:

\def\:tempa#1{
    \begingroup%
      \lsthk@PreSet\gdef\lst@intname{#1}%
      \expandafter\lstset\expandafter{\lst@set}%
      \lsthk@DisplayStyle%
      \catcode\active=\active%
      \a:lstinputlisting
      \ht:special{t4ht@[}
      \pend:def\cptA:{\ht:special{t4ht@]}}
      \append:def\cptD:{\ht:special{t4ht@[}}
      \lst@Init\relax 
      \let\lst@gobble\z@%
      \lst@SkipToFirst%
      \lst@ifprint \def\lst@next{\input{#1}}%
             \else \let\lst@next\@empty \fi%
      \ht:special{t4ht@]}
      \lst@next\ht:special{t4ht@[}\lst@DeInit\ht:special{t4ht@]}%
      \b:lstinputlisting%
    \endgroup}

\HLet\lst@InputListing\:tempa

This is the code for \lstinputlisting. We use these specials to enable text in the caption:

      \pend:def\cptA:{\ht:special{t4ht@]}}
      \append:def\cptD:{\ht:special{t4ht@[}}
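
To make the pairing easier to follow, here is an annotated excerpt of the same definition. It is only a reading aid, not a replacement for the file above, and it only works inside a .4ht file where : is a letter. The comments assume that t4ht@[ starts character suppression and t4ht@] ends it, which is what the order of the calls suggests, and that the caption is typeset during \lst@Init through \lst@makecaption.

```latex
% Annotated excerpt of the patched \lst@InputListing (reading aid only).
% Assumption: t4ht@[ switches character suppression on, t4ht@] switches it off.
\ht:special{t4ht@[}                     % suppression on for the setup code
\pend:def\cptA:{\ht:special{t4ht@]}}    % prepended to \cptA: (caption start): suppression off
\append:def\cptD:{\ht:special{t4ht@[}}  % appended to \cptD: (caption end): suppression back on
\lst@Init\relax                         % assumed to typeset the caption via \lst@makecaption
\ht:special{t4ht@]}                     % suppression off again before the file is input
```

Because \cptA: and \cptD: are the first and last hooks in the redefined \lst@makecaption, the caption text ends up outside any suppressed region, while the rest of the setup code stays hidden.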

Here is the MWE:

\documentclass[10pt,a4paper]{book}

% \usepackage{mystyle}
\usepackage{listings}
\usepackage{caption}
\usepackage{listingsutf8}
\renewcommand{\lstlistingname}{Đoạn code}
\usepackage[vietnamese]{babel}
\lstdefinestyle{customstyle}
{
    basicstyle=\small\ttfamily,
    breaklines=true,
    showspaces=false,
    showtabs=false,
    tabsize=2,
    showstringspaces=false,
    breakatwhitespace=true,
    escapeinside={(*@}{@*)},
    numbers=left,
    numberstyle=\scriptsize,
    stepnumber=1,
    numbersep=8pt,
    inputencoding=utf8,
    extendedchars=true,
  }
\lstset{language=C++, style=customstyle}

\begin{document}

%%%% NB: This doesn't show the caption correctly, ASCII characters are omitted
\lstinputlisting[float,language=C++,caption={Một chú thích viết bằng tiếng Việt.}]{data.cpp}

%%%% NB: This, surprisingly, shows the caption correctly
\begin{lstlisting}[language=C++,caption={Một chú thích viết bằng tiếng Việt.}]
int main() { return 0; }
\end{lstlisting}

\end{document}

It includes the file data.cpp:

// file.cpp
int gInitialised = 1234;
float gUninitialised;
const bool gIsFinal = false;

This is the result:

[Image: rendered output]
