I am using tex4ht to build my LaTeX source into an EPUB. I currently have a problem with \lstinputlisting: the caption (in Vietnamese) is not displayed correctly.
\lstinputlisting[float,language=C++,caption={Một chú thích viết bằng tiếng Việt.}]{data.cpp}
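I checked the intermediate XHTML file generated by tex4ht, and it seems that every ASCII character in the caption has been stripped, leaving only the Vietnamese characters. The last visible non-ASCII character is left "dangling" in the next span: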
... [redacted]
<body>
<!-- l. 8 --><p>Đạ ộúíếằế</p><pre class='lstinputlisting' id='listing-1'><span class='label'><a id='x1-4r1'></a><span class='ec-lmr-7'>ệ</span></span><span style='color:#000000'><span class='ec-lmtti-10x-x-90'>//</span></span><span style='color:#000000'> <span class='ec-lmtti-10x-x-90'>file.cpp</span>
... [redacted]
...so the whole caption "Đoạn code 1: Một chú thích viết bằng tiếng Việt." is stripped down to "Đạ ộúíếằế" and "ệ" (the dangling character).
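To my surprise, if I use the lstlisting environment instead, tex4ht has no problem with the Vietnamese caption. However, I cannot use this as a workaround, because my project will contain a large number of standalone code files.
Do you know why this happens? I have attached a minimal reproducible example that demonstrates my current project (the build script is a batch script, since I am on a Windows machine).
Thanks in advance!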
Answer 1
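TeX4ht uses some special instructions to suppress unwanted characters in \lstinputlisting. Without them, some unwanted code would be inserted into the HTML output. Unfortunately, the caption gets suppressed as well, so we need to turn the character suppression off while the caption is typeset. This can be done with the following version of listings.4ht: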
% listings.4ht (2022-05-22-13:50), generated from tex4ht-4ht.tex
% Copyright 2001-2009 Eitan M. Gurari
% Copyright 2009-2022 TeX Users Group
%
% This work may be distributed and/or modified under the
% conditions of the LaTeX Project Public License, either
% version 1.3c of this license or (at your option) any
% later version. The latest version of this license is in
% http://www.latex-project.org/lppl.txt
% and version 1.3c or later is part of all distributions
% of LaTeX version 2005/12/01 or later.
%
% This work has the LPPL maintenance status "maintained".
%
% The Current Maintainer of this work
% is the TeX4ht Project <http://tug.org/tex4ht>.
%
% If you modify this program, changing the
% version identification would be appreciated.
\immediate\write-1{version 2022-05-22-13:50}
\@ifpackageloaded{xcolor}{}{%
\RequirePackage{xcolor}
}
\def\lst@makecaption#1#2{\cptA: #1\if :#1:\else\cptB:\fi \cptC: #2\cptD:}
\newif\iflstnest
\append:defII\lst@EnterMode{%
\ifx \lsthk:EveryLine\:UnDef
\let\lsthk:EveryLine\lsthk@EveryLine
\fi
\ifx \lsthk:EveryLine\lsthk@EveryLine
\pend:def\lsthk@EveryLine{\c:listings
\def\dd:listings{\d:listings\let\dd:listings\empty}}%
\fi
\append:def\lsthk@EveryPar{\dd:listings}%
\iflstnest\else
\a:listings\fi\bgroup
%\Configure{$}{}{}{}%
\aftergroup\lst:EnterMode }
\def\lst:EnterMode{\iflstnest\else\b:listings\fi\egroup}
\NewConfigure{listings}{4}
\let\dd:listings=\empty
\append:defI\lst@Init{\csname a:listings-init\endcsname\global\lstnesttrue}
\pend:def\lst@DeInit{\csname b:listings-init\endcsname\global\lstnestfalse}
\NewConfigure{listings-init}{2}
\lst@AddToHook{TextStyle}{%
\Configure{listings}{}{}{}{}%
\a:lstinline \bgroup \aftergroup\b:lstinline\aftergroup\egroup
}
\NewConfigure{lstinline}{2}
\pend:defI\lst@MakeCaption{%
\let\lst:addcontentsline\addcontentsline
\def\addcontentsline{\gHAdvance\TitleCount by 1
\lst:addcontentsline}%
}
\append:defI\lst@MakeCaption{%
\let\addcontentsline\lst:addcontentsline
}
\ConfigureToc{lol} {}{\empty}{}{\newline}
\lst@UserCommand\lstlistoflistings{\bgroup%
\ifdefined\chapter\chapter*{\lstlistlistingname}\else\section*{\lstlistlistingname}\fi%
\TableOfContents[lol]%
\egroup}
\def\:tempa{%
\ifx\lst@OutputBox\@gobble\else \the\everypar \fi
\global\advance\lst@newlines\m@ne
\lst@newlinetrue
}%
\HLet\lst@NewLine\:tempa
\def\:tempa#1{
\begingroup%
\lsthk@PreSet\gdef\lst@intname{#1}%
\expandafter\lstset\expandafter{\lst@set}%
\lsthk@DisplayStyle%
\catcode\active=\active%
\a:lstinputlisting
\ht:special{t4ht@[}
\pend:def\cptA:{\ht:special{t4ht@]}}
\append:def\cptD:{\ht:special{t4ht@[}}
\lst@Init\relax
\let\lst@gobble\z@%
\lst@SkipToFirst%
\lst@ifprint \def\lst@next{\input{#1}}%
\else \let\lst@next\@empty \fi%
\ht:special{t4ht@]}
\lst@next\ht:special{t4ht@[}\lst@DeInit\ht:special{t4ht@]}%
\b:lstinputlisting%
\endgroup}
\HLet\lst@InputListing\:tempa
\NewConfigure{lstinputlisting}{2}
\def\:tempa#1{%
\setbox\z@\hbox{{\lst@currstyle{\kern#1}}}%
\global\advance\lst@currlwidth \wd\z@
\tmp:dim=#1 \let\:tempc=\empty
\loop \ifdim \tmp:dim>\a:lst@Kern
\advance \tmp:dim by -\a:lst@Kern
\advance \tmp:dim by -\b:lst@Kern
\append:def\:tempc{\:nbsp}%
\repeat
\setbox\z@\hbox{{\lst@currstyle{\:tempc}}}%
\lst@OutputBox\z@}
\HLet\lst@Kern\:tempa
\NewConfigure{lst@Kern}{2}
\Configure{lst@Kern}{0.499em}{0.1em}
\def\lst@outputspace{\HCode{ }}
\HLet\lst@frameInit=\empty
\HLet\lst@frameExit=\empty
\Hinput{listings}
\endinput
The important part is this code:
\def\:tempa#1{
\begingroup%
\lsthk@PreSet\gdef\lst@intname{#1}%
\expandafter\lstset\expandafter{\lst@set}%
\lsthk@DisplayStyle%
\catcode\active=\active%
\a:lstinputlisting
\ht:special{t4ht@[}
\pend:def\cptA:{\ht:special{t4ht@]}}
\append:def\cptD:{\ht:special{t4ht@[}}
\lst@Init\relax
\let\lst@gobble\z@%
\lst@SkipToFirst%
\lst@ifprint \def\lst@next{\input{#1}}%
\else \let\lst@next\@empty \fi%
\ht:special{t4ht@]}
\lst@next\ht:special{t4ht@[}\lst@DeInit\ht:special{t4ht@]}%
\b:lstinputlisting%
\endgroup}
\HLet\lst@InputListing\:tempa
This is the code for \lstinputlisting, and these two specials are what re-enable text inside the caption:
\pend:def\cptA:{\ht:special{t4ht@]}}
\append:def\cptD:{\ht:special{t4ht@[}}
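To make the effect of these two lines explicit, the order in which the specials are emitted around the caption can be outlined as below. This is only an annotated sketch, not code to paste into the file. It assumes, as described above, that t4ht@[ starts character suppression and t4ht@] ends it, and that \cptA: and \cptD: are the first and last hooks executed by \lst@makecaption.

```latex
% Annotated outline only; every line is a comment.
%
% \ht:special{t4ht@[}     % suppression ON, before \lst@Init runs
%   \cptA:                % caption starts: the prepended t4ht@] turns suppression OFF
%     ... label, separator and caption text are written to the output ...
%   \cptD:                % caption ends: the appended t4ht@[ turns suppression ON again
% \ht:special{t4ht@]}     % suppression OFF, just before the file is input by \lst@next
```

Because both changes are made inside the group opened by \begingroup in the redefined \lst@InputListing, they should stay local to \lstinputlisting and leave captions of the lstlisting environment untouched.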
Here is the MWE:
\documentclass[10pt,a4paper]{book}
% \usepackage{mystyle}
\usepackage{listings}
\usepackage{caption}
\usepackage{listingsutf8}
\renewcommand{\lstlistingname}{Đoạn code}
\usepackage[vietnamese]{babel}
\lstdefinestyle{customstyle}
{
basicstyle=\small\ttfamily,
breaklines=true,
showspaces=false,
showtabs=false,
tabsize=2,
showstringspaces=false,
breakatwhitespace=true,
escapeinside={(*@}{@*)},
numbers=left,
numberstyle=\scriptsize,
stepnumber=1,
numbersep=8pt,
inputencoding=utf8,
extendedchars=true,
}
\lstset{language=C++, style=customstyle}
\begin{document}
%%%% NB: This doesn't show the caption correctly, ASCII characters are omitted
\lstinputlisting[float,language=C++,caption={Một chú thích viết bằng tiếng Việt.}]{data.cpp}
%%%% NB: This, surprisingly, shows the caption correctly
\begin{lstlisting}[language=C++,caption={Một chú thích viết bằng tiếng Việt.}]
int main() { return 0; }
\end{lstlisting}
\end{document}
It includes the file data.cpp:
// file.cpp
int gInitialised = 1234;
float gUninitialised;
const bool gIsFinal = false;
This is the result: