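I am using pictureenv, as suggested in the question about tex4ht conflicts between math inside and outside a table when using SVG.
When I add a second column containing a listings environment to the table, for some reason every [ and _ character anywhere in the listing is changed to the letter x.
Here is an MWE: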
\documentclass[11pt]{article}
\usepackage{amsmath,mathtools,amssymb}
\usepackage{listings}
\usepackage{pictureenv}
\begin{document}
\ifdefined\HCode
\begin{pictureenv}
\fi
\begin{tabular}{|p{3in}|p{2.5in}|}\hline
${\frac {d}{{d}x}}y \left( x \right) = \left( -2+x \right) ^{2}$&
\begin{lstlisting}
[_quadrature_]
\end{lstlisting}\\\hline
\end{tabular}
\ifdefined\HCode
\end{pictureenv}
\fi
\end{document}
Compiled with:
make4ht -ulm default -f html5+dvisvgm_hashes T.tex "htm,pic-m,pic-align,svg,p-width"
I also tried:
make4ht -ulm default -f html5+dvisvgm_hashes T.tex "htm,pic-tabular,pic-align,svg,p-width"
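This gives the broken output described above, with [ and _ replaced by x.
Compare this with the output when the pictureenv environment is not used: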
\documentclass[11pt]{article}
\usepackage{amsmath,mathtools,amssymb}
\usepackage{listings}
\usepackage{pictureenv}
\begin{document}
\begin{tabular}{|p{3in}|p{2.5in}|}\hline
${\frac {d}{{d}x}}y \left( x \right) = \left( -2+x \right) ^{2}$&
\begin{lstlisting}
[_quadrature_]
\end{lstlisting}\\\hline
\end{tabular}
\end{document}
With the same compile command, the listing now comes out correctly.
This is with TL 2018 on Linux Ubuntu.
Answer 1
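What you see is a result of the Unicode support in tex4ht. It works by inserting two items into the document. The first is a special, which tells tex4ht to replace the next character with the Unicode value stored in the special. The second is the character that will be replaced. It is usually x, but it can be any character. It is there only to pass font information to tex4ht, so that it can render the Unicode character as bold, italic, and so on.
The problem arises with pictures, because they are produced by an external command, usually dvisvgm or dvipng. These programs don't know how to handle the tex4ht specials, so the specials are ignored and only the x is displayed.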
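The mechanism can be pictured with a toy model in plain Lua. This is only a sketch under simplifying assumptions: the stream layout, the field names and the render function are made up for illustration, and only ASCII replacements are handled. The t4ht@ prefix with a decimal code follows the special format that the callback code later in this answer checks for.

```lua
-- Toy model, not real tex4ht code: the typeset output as a list of items.
-- Each replaced character is a special followed by a placeholder glyph "x".
local stream = {
  { kind = "special", data = "t4ht@91" }, -- 91 is the code of "["
  { kind = "glyph",   char = "x" },
  { kind = "special", data = "t4ht@95" }, -- 95 is the code of "_"
  { kind = "glyph",   char = "x" },
  { kind = "glyph",   char = "q" },       -- ordinary glyph, no special
}

-- understands_specials = true  models tex4ht itself
-- understands_specials = false models dvisvgm or dvipng
local function render(items, understands_specials)
  local out, pending = {}, nil
  for _, item in ipairs(items) do
    if item.kind == "special" then
      if understands_specials then
        local code = item.data:match("^t4ht@(%d+)$")
        pending = code and tonumber(code)
      end
    elseif pending then
      -- the placeholder is swapped for the stored character (ASCII only here)
      out[#out + 1] = string.char(pending)
      pending = nil
    else
      out[#out + 1] = item.char
    end
  end
  return table.concat(out)
end

print(render(stream, true))   -- [_q
print(render(stream, false))  -- xxq
```

The second call shows the symptom from the question: once the specials are dropped, only the placeholders remain.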
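We can try to use LuaTeX to fix this issue. A node callback can process the document nodes, detect pictures and replace the characters manually. This is not as easy as it sounds, because we cannot simply set the Unicode value as the replacement character. The correct glyph number must be set instead, and there is no universal mapping between Unicode and the glyphs in a particular TeX font. Fortunately, tex4ht provides such mappings for most TeX fonts in the form of HTF files. A Lua library can be written to search the HTF files and parse the mappings.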
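This turned out to be a fairly complicated matter, and I must admit that I found a serious issue in the picture generation: sometimes the mapping between a Unicode value and a font glyph doesn't exist. For example, the \textellipsis command doesn't work even with this method. This shouldn't be a problem in practice, because this limitation of pictures has existed for a long time and nobody has complained about it. It is just a limitation I have found, and I cannot find a solution at the moment.
Enough of the introduction, we can get to the code now.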
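To make the idea of the mapping concrete, here is a minimal sketch in plain Lua. It assumes the basic HTF layout (a header line with the font name and the first and last slot, then one quoted string per slot); the font name toyfont and its three entries are invented for illustration and do not come from a real HTF file. The function build_map is likewise hypothetical and much simpler than the library below.

```lua
-- Hypothetical three-slot excerpt in the HTF layout: header "name first last",
-- then one 'string' 'class' entry per font slot, then the header repeated.
local sample = table.concat({
  "toyfont 94 96",
  "'^' '' circumflex",
  "'_' '' underscore",
  "'&#x2018;' '' left quote",
  "toyfont 94 96",
}, "\n")

-- Build a table from Unicode codepoint to font slot.
local function build_map(htf)
  local lines = {}
  for line in htf:gmatch("[^\n]+") do lines[#lines + 1] = line end
  local first, last = lines[1]:match("^%S+%s+(%d+)%s+(%d+)")
  first, last = tonumber(first), tonumber(last)
  local map = {}
  for slot = first, last do
    local str = lines[slot - first + 2]:match("^'(.-)'")
    -- the string is either a hexadecimal entity or a single ASCII character
    local hex = str:match("^&#x(%x+);$")
    local codepoint = hex and tonumber(hex, 16) or (#str == 1 and str:byte())
    if codepoint then map[codepoint] = slot end
  end
  return map
end

local map = build_map(sample)
print(map[0x5F])    -- 95
print(map[0x2018])  -- 96
print(map[0x2026])  -- nil
```

The last lookup returns nil because the toy font has no entry for the ellipsis, which is the kind of missing mapping described above.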
First, we need a library for the HTF files, htffontreader.lua:
kpse.set_program_name "luatex"
local entities = require "luaxml-entities"
local texmfdist = kpse.expand_var("$TEXMFDIST")
local default_paths = {
texmfdist .. "/tex4ht/ht-fonts/mozilla/",
texmfdist .. "/tex4ht/ht-fonts/unicode/",
texmfdist .. "/tex4ht/ht-fonts/ascii/",
texmfdist .. "/tex4ht/ht-fonts/alias/"
}
local function str_to_table(str)
local characters = {}
str:gsub(".", function(a) table.insert(characters, a) end)
return characters
end
-- convert the .4ht string field to a Unicode codepoint
local function get_char(str)
-- it is necessary to decode XML entities first
local newstr = entities.decode(str)
-- get Unicode codepoints of the string
local chars = {}
-- the string.utfvalues is LuaTeX extension
for codepoint in string.utfvalues(newstr) do
chars[#chars+1] = codepoint
end
-- return whole string if there is more than one codepoint
-- it is useless in tex4ht char to node.char mapping
if #chars > 1 then return newstr end
return chars[1]
end
local function read_file(filename)
local f = io.open(filename, "r")
if not f then return nil, "Cannot open file " .. filename end
local content = f:read("*all")
f:close()
return content
end
local function traverse_htf_files(dir, addresses)
-- local addresses = addresses or {}
for file in lfs.dir(dir) do
-- skip current and parent dir links
if file ~= "." and file ~=".." then
local current_path = dir .. file
local attr = lfs.attributes(current_path)
if attr.mode == "directory" then
traverse_htf_files(current_path .. "/", addresses)
elseif attr.mode == "file" then
-- escape the dot so that only a literal ".htf" extension matches
if file:match("%.htf$") then
file = file:gsub("%.htf$", "")
-- print(current_path, attr.mode)
addresses[file] = current_path
end
end
end
end
return addresses
end
-- find all .htf and .4hf files in list of directories
local function find_htf_files(directories)
local addresses = {}
for _, dir in ipairs(directories) do
addresses = traverse_htf_files(dir, addresses)
end
return addresses
end
-- the htf files may contain only part of the font file name
-- we must build graph for efficient lookup for the correct
-- corresponding htf file
local function make_lookup_table(addresses)
local function step(characters, lookup)
if #characters > 0 then
local char = table.remove(characters,1)
local subtab = lookup[char] or {}
lookup[char] = step(characters, subtab)
end
return lookup
end
local lookup = {}
for file, _ in pairs(addresses) do
-- get individual characters as a table
local characters = str_to_table(file)
lookup = step(characters, lookup)
end
return lookup
end
local function lookup_font(font_name, lookup_table)
local function lookup(characters, tbl)
if #characters < 1 then return "" end
local char = table.remove(characters, 1)
local subtab = tbl[char]
if not subtab then return "" end
return char .. lookup(characters, subtab)
end
local characters = str_to_table(font_name)
return lookup(characters, lookup_table)
end
local function get_htf_css(content)
local htfcss = {}
for name, style in content:gmatch("htfcss:%s*([%w]+)%s*([^\n]+)") do
htfcss[name] = style
end
return htfcss
end
local function parse_htf_line(line)
-- details about the htf file: https://tug.org/applications/tex4ht/mn-htf.html
-- from the manual:
-- The ‘string’ field may include any sequence of characters, except for
-- its delimiters. The backslash character ‘\’ acts there as an escaped
-- character. It may act as a delimiter for a character code, or be
-- followed by another backslash (that is, ‘\\’ represents the character
-- ‘\’ ).
-- In the string part, use ‘<’ for the character ‘<’, ‘>’ for ‘>’, and ‘&’ for ‘&’;
local escape = function(str)
local str = str or ""
str = str:gsub("\\\\", "\\"):gsub("\\'","'")
return str
end
local str, class = line:match("^%s*'(.-)'%s+'([0-9]*)'")
-- from the manual:
-- A ‘class’ specified by an odd integer value asks for a
-- pictorial character. An even integer number asks for a non-pictorial
-- character, specified in the ‘string’ field. An empty class field is
-- treated as a zero value.
if not str then return nil, "Cannot parse htf line: " .. line end
class = class or "" -- add default value
class = tonumber(class) or 0 -- convert empty class to zero
return escape(str), class
end
local function parse_htf_glyphs(content, addresses)
local map = {}
local backmap = {}
local readpos = 0
local function readline()
-- keep line local so that it doesn't leak into the global namespace
local start, line
start, readpos, line = content:find("([^\n]-)\n", readpos)
-- print(readpos, line)
readpos = readpos + 1
return line
end
-- first detect if the htf file isn't only link to another one
local link = content:match("^%s*%.([^%s]+)")
if link then
local newfile = addresses[link]
if not newfile then return nil, "Cannot load htf file for ".. link end
local content = read_file(newfile)
return parse_htf_glyphs(content, addresses)
end
-- read htf name, start char and end char
local firstline = readline()
local name, start, finish = firstline:match("^([^%s]+)%s+([%d]+)%s+([%d]+)")
if not name then return nil, "cannot parse htf file" end
-- convert the values to numbers
local start, finish = tonumber(start), tonumber(finish)
-- calculate number of lines to be read
-- the range is inclusive, so there is one line for each slot from start to finish
local count = finish - start + 1
for i = 1, count do
local line = readline()
-- char may be character code or list of character codes
local str, class = parse_htf_line(line)
local char = get_char(str)
-- print(start, line)
-- print(start, str, class, char)
-- map character code to the tfm font position
if char then
map[char] = start
end
-- map tfm position to tex4ht character class and the replacement string
backmap[start] = {class = class, str = str}
start = start + 1
end
print("Parse htf font", name, start, finish)
return map, backmap
end
local function load_font(font_name, addresses)
--- todo: continue here
local content, msg = read_file(font_name)
if not content then return nil, msg end
local htfcss = get_htf_css(content)
-- return two tables, one from unicode to font positions, the other in the other direction
local map, backmap = parse_htf_glyphs(content, addresses)
return {htfcss = htfcss, map = map, backmap = backmap}
end
local function get_font(font_name, lookup_table, addresses)
local htf_name = lookup_font(font_name, lookup_table)
if htf_name and htf_name ~= "" then
local font_file = addresses[htf_name]
-- this shouldn't happen
if not font_file then return nil, "Cannot find font file: " .. htf_name end
return load_font(font_file, addresses)
else
return nil, "Cannot find HTF font: " .. font_name
end
end
local function htfobject(paths)
local paths = paths or default_paths
local htfont = {}
htfont.font_cache = {}
local msg -- avoid creating a global variable
htfont.addresses, msg = find_htf_files(paths)
if not htfont.addresses then return nil, msg end
htfont.lookup_table = make_lookup_table(htfont.addresses)
function htfont:get_font(fontname)
local f = self.font_cache[fontname] or get_font(fontname, self.lookup_table, self.addresses)
self.font_cache[fontname] = f
return f
end
htfont.__index = htfont
return setmetatable({}, htfont)
end
-- some testing
if arg[0] == "htffontreader.lua" then
local htfx = htfobject()
local cmsy = htfx:get_font("rm-lmr10")
-- print(get_font("cmsy10", lookup_table, addresses))
-- print(get_font("cmmi10", lookup_table, addresses))
-- print(get_font("lm-ec1000", lookup_table, addresses))
local cmss = htfx:get_font("cmss")
for name, style in pairs(cmss.htfcss) do
print(name, style)
end
end
local M = {}
M.htfobject = htfobject
return M
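The prefix lookup is probably the least obvious part of the library, so here is a standalone sketch of the same idea in plain Lua. It is a simplified rewrite for illustration, not the module's code: the function names mirror make_lookup_table and lookup_font, but the short list of HTF names is an assumption.

```lua
-- Build a character trie from the known HTF names.
local function make_lookup_table(names)
  local root = {}
  for _, name in ipairs(names) do
    local node = root
    for ch in name:gmatch(".") do
      node[ch] = node[ch] or {}
      node = node[ch]
    end
  end
  return root
end

-- Walk the trie along the font name and return the longest matching prefix.
local function lookup_font(font_name, root)
  local node, matched = root, {}
  for ch in font_name:gmatch(".") do
    node = node[ch]
    if not node then break end
    matched[#matched + 1] = ch
  end
  return table.concat(matched)
end

local trie = make_lookup_table({ "cmss", "cmsy", "cmr" })
print(lookup_font("cmss10", trie))  -- cmss
print(lookup_font("cmsy7", trie))   -- cmsy
print(lookup_font("cmsl10", trie))  -- cms
print(lookup_font("ptmr8t", trie))  -- (empty string)
```

The third call shows why get_font checks the result against the addresses table: the walk returns the longest shared prefix, which is not necessarily the name of an existing HTF file.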
The callback that processes the pictures lives in the fixpictures4ht.lua library:
local htffontreader = require "htffontreader"
local hlist_id = node.id "hlist"
local vlist_id = node.id "vlist"
local whatsit_id = node.id "whatsit"
local glyph_id = node.id "glyph"
-- get the special subtype
local whatsits = node.whatsits()
local special_id
-- font database object
local fontdb = htffontreader.htfobject()
local supported_htf_fonts
-- from Luaotfload documentation
local function unsafe_getfont (id)
local tfmdata = font.getfont (id)
if not tfmdata then
tfmdata = font.fonts[id]
end
return tfmdata
end
local font_infos = {}
local function get_font_info(id)
local info = font_infos[id]
if info then return info end
local tfmdata = unsafe_getfont(id)
local name = tfmdata.name
local format = tfmdata.properties.format
font_infos[id] = name
print("Loading htf file for " .. name)
return name
end
local utfchar = unicode.utf8.char
local in_picture = false
local function execute_tex4ht(head, n)
local was_tex4ht = false
local t4ht, data = n.data:match("(t4ht)(.+)")
if t4ht == "t4ht" then was_tex4ht = true end
if was_tex4ht then
if in_picture then
-- the tex4ht.sty definition of the \Picture(+|*) commands redefines the \ht:special command to prepend t4ht+ in front of
-- the special code. I guess that the tex4ht command then somehow handles that, but I didn't investigate it. Anyway,
-- we need to remove the spurious +t4ht part
data = data:gsub("^%+t4ht","")
end
if in_picture and data:match("^@") then
-- interpolate tex4ht escaped entities
data = data:gsub("{([0-9]+)}", function(x) return string.char(x) end)
-- detect hexadecimal entities
local char = data:match("%&%#x([0-9a-fA-F]+);")
if char then
char = tonumber(char, 16)
else
-- decimal entity
char = data:match("^@([0-9]+)") or data:match("^@%&%#([0-9]+);")
if char then
char = tonumber(char)
end
end
if char then
-- we must replace the next glyph char with contents of this special
local nextnode = n.next
if nextnode.id == glyph_id then
-- it is necessary to do new kerning
local font_name = get_font_info(nextnode.font)
local fontdata = fontdb:get_font(font_name)
local nextchar = fontdata.map[char]
if nextchar then
nextnode.char = nextchar
else
-- the character is not available in the htf file. why?
-- one possibility is the non breaking space
if char == 160 or char==32 then
-- replace it with ordinary space?
local glue = node.new("glue")
glue.width = tex.sp(".6em")
n.next = glue
glue.next = nextnode.next
end
end
end
else
print("data", data)
end
elseif data:match("%+%+") then
local picture_name = data:match("%+%+(.+)")
-- sometimes we match something different than filename
-- so try to detect that it is really a filename (we check that it ends
-- with extension)
if picture_name:match("%.[a-zA-Z]-$") then
print("start picture", picture_name)
in_picture = true
-- pagelist[picture_name] = tex.count[ "c@page" ]
end
elseif data == "+" then
print "end picture"
in_picture = false
end
end
return head, was_tex4ht
end
local function process(head)
for n in node.traverse(head) do
local id = n.id
if id == hlist_id or id == vlist_id then
n.head = process(n.head)
elseif id == whatsit_id and (n.subtype == special_id or whatsits[n.subtype] == "special") then
special_id = n.subtype
-- act on the special node and detect if it was tex4ht special
local was_tex4ht
head, was_tex4ht= execute_tex4ht(head, n)
end
end
return head
end
local M = {}
M.process = process
return M
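The part of execute_tex4ht that extracts the character code can be tried out on its own in plain Lua. The following is a sketch only: special_codepoint is a hypothetical helper, the payload strings are made-up examples of the three forms the callback checks for, and the removal of the t4ht and +t4ht prefixes is assumed to have happened already.

```lua
-- Return the Unicode codepoint stored in a tex4ht character special payload,
-- or nil when the payload is not a character replacement.
local function special_codepoint(data)
  if not data:match("^@") then return nil end
  -- hexadecimal entity, e.g. @&#x5F;
  local hex = data:match("&#x(%x+);")
  if hex then return tonumber(hex, 16) end
  -- plain decimal code or decimal entity, e.g. @95 or @&#95;
  local dec = data:match("^@(%d+)") or data:match("^@&#(%d+);")
  return dec and tonumber(dec)
end

print(special_codepoint("@95"))       -- 95
print(special_codepoint("@&#x5F;"))   -- 95
print(special_codepoint("@&#8230;"))  -- 8230
print(special_codepoint("+picture"))  -- nil
```

In the callback, the returned codepoint is then looked up in the map for the current font to find the glyph number for the next node.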
The callback must be installed, which can be done in a redefined version of the tuenc-luatex.4ht file:
% tuenc-luatex.4ht, generated from tex4ht-4ht.tex
% Copyright 2017 TeX Users Group
%
% This work may be distributed and/or modified under the
% conditions of the LaTeX Project Public License, either
% version 1.3c of this license or (at your option) any
% later version. The latest version of this license is in
% http://www.latex-project.org/lppl.txt
% and version 1.3c or later is part of all distributions
% of LaTeX version 2005/12/01 or later.
%
% This work has the LPPL maintenance status "maintained".
%
% This Current Maintainer of this work
% is the TeX4ht Project <[email protected]>.
%
% If you modify this program, changing the
% version identification would be appreciated.
\immediate\write-1{version 2017-01-24-15:21}
\RequirePackage{luatexbase}
\RequirePackage{luacode}
\begin{luacode*}
local fontspec = require "fontspec-4ht"
local fixfonts = require "fixpictures4ht"
luatexbase.add_to_callback("pre_linebreak_filter", fontspec.char_to_entity, "Char to entity")
luatexbase.add_to_callback("hpack_filter", fontspec.char_to_entity, "hpack-char-to-entity")
luatexbase.add_to_callback("pre_linebreak_filter", fixfonts.process, "Fix unicode in pictures")
\end{luacode*}
\Hinput{tuenc-luatex}
\endinput
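Another issue is that the default configuration for listings is quite complicated and redefines lots of things. You don't want that in picture mode, so we must configure the pictureenv environment to ignore most of it: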
\Preamble{xhtml}
\ConfigureEnv{pictureenv}{%
\Configure{listings-init}{\special{t4ht@(}\ttfamily}{\special{t4ht@)}}
\ConfigureEnv{lstlisting}{}{}{}{}
\Configure{listings}{{\leavevmode}}{}{}{\newline}
\Picture*{}}{\EndPicture}{}{}
\begin{document}
\EndPreamble
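The configuration
\Configure{listings}{{\leavevmode}}{}{}{\newline}
is especially important for multi-line listings, because the default configuration makes them collapse onto one line.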
I have prepared an example with a few more cases:
\documentclass[11pt]{article}
\usepackage{amsmath,mathtools,amssymb}
\usepackage{listings}
\usepackage{pictureenv}
\begin{document}
\begin{pictureenv}
\begin{tabular}{|p{3in}|p{2.5in}|}\hline
${\frac {d}{{d}x}}y \left( x \right) = \left( -2+x \right) ^{2}$&
\begin{lstlisting}
[_quadrature_]
\end{lstlisting}\\\hline
\end{tabular}
\end{pictureenv}
divný příliš~žluťoučký kůň \textunderscore
\begin{pictureenv}
\begin{lstlisting}
\verb|now_|@/$
some spaces
no spaces
\end{lstlisting}
divný příliš~žluťoučký kůň \textunderscore
\begin{verbatim}
\verb|now_|@/$
\end{verbatim}
\end{pictureenv}
\end{document}
This is the default rendering (without the listings configuration!):
And this is the result after processing with fixpictures4ht.lua: