How can I make a custom ligature searchable in LuaTeX?

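In the following MWE I define my own ligature, which works with LuaTeX. I would now like it to be searchable as "Th" rather than as "Ђ". Can this be done with fontspec or with a LuaTeX approach?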

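  • I have seen many similar questions (for example, this one), but none of the answers takes a generic approach to the problem.
  • I think it has something to do with using an equivalent of pdfglyphtounicode in LuaTeX.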
\documentclass[a4paper]{article}
\usepackage{fontspec}

\directlua
{

    fonts.handlers.otf.addfeature
    {
        name = "ligaxits",
        type = "ligature",
        data =
        {
            [0x0402] = { "T", "h" },
        },
    }

}

\begin{document}

    \setmainfont[
        RawFeature={+ligaxits},
    ]{XITS-Regular.otf}

    The% the "ligature" is now used as expected, but I'd like to make it searchable by "Th" 

\end{document}

Answer 1

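Thanks to Ulrike Fischer for pointing me in the right direction. I finally got it working, and it probably works for all glyphs and all fonts. I think the code is even faster than the linked code (O(1) compared to O(N), though I am a Lua beginner), because it indexes the font's character table directly by code point. The linked code also does not work for all glyphs, because not every glyph name is in its table, especially glyph names that lack the uni prefix.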

\documentclass[a4paper]{article}

\usepackage{fontspec}
\usepackage{luacode}

\begin{luacode}
-- the following code is for creating the ligature only; not for making it copyable/searchable
fonts.handlers.otf.addfeature{
    name = "ligacustom",
    type = "ligature",
    data =
    {
        [utf.byte("Ђ")] = {utf.byte("T"), utf.byte("h")},
    },
}

-- the following code is for debugging only
local dump = function(o)
    -- source: https://stackoverflow.com/a/27028488
   if type(o) == 'table' then
      local s = '{ '
      for k,v in pairs(o) do
         if type(k) ~= 'number' then k = '"'..k..'"' end
         s = s .. '['..k..'] = ' .. dump(v) .. ','
      end
      return s .. '} '
   else
      return tostring(o)
   end
end

-- the following code is for making generic glyphs copyable/searchable as wanted; even from private use area
local patch_make_custom_glyphs_searchable_xits = function(fontdata)
    if fontdata.fontname == "XITS-Regular" -- when the patch should only apply to XITS Regular font
    then
        -- for another font you can print the font name to console using print(fontdata.fontname) 
        -- add as many as you want; utf.byte("ß") is same as python3 ord("ß") for testing
        fontdata.characters[utf.byte("Ђ")]["tounicode"] = {utf.byte("T"), utf.byte("h")}
        fontdata.characters[utf.byte("ß")]["tounicode"] = {utf.byte("Ä"), utf.byte("Ä"), utf.byte("Ä")}
    end
   -- print(dump(fontdata.characters))
end

luatexbase.add_to_callback
(
    "luaotfload.patch_font",
    patch_make_custom_glyphs_searchable_xits,
    "patch_make_custom_glyphs_searchable_xits"
)
\end{luacode}




\begin{document}
% setmainfont after the lua code!
\setmainfont[
    RawFeature={+ligacustom},
]{XITS-Regular.otf}
 
Ђ The ß% copies in Sumatra PDF and Adobe Reader as "Th The ÄÄÄ" using "XITS-Regular.otf"

\end{document}
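
The patch above indexes fontdata.characters directly, which raises a Lua error if the font has no glyph at one of the listed code points. A minimal sketch of a table-driven variant with a guard follows. The names tounicode_map and patch_tounicode are hypothetical, chosen here for illustration only. It assumes, as the answer's code does, that utf.byte is available and that a list of code points is accepted in the tounicode field.

```latex
\begin{luacode}
-- Hypothetical mapping table (an assumption for this sketch):
-- glyph code point -> list of code points the glyph should copy/search as.
local tounicode_map = {
    [utf.byte("Ђ")] = { utf.byte("T"), utf.byte("h") },
}

local function patch_tounicode(fontdata)
    -- Restrict the patch to one font, as in the answer above.
    if fontdata.fontname ~= "XITS-Regular" then return end
    local characters = fontdata.characters
    if not characters then return end
    for codepoint, replacement in pairs(tounicode_map) do
        -- Direct table index by code point; no scan over glyph names.
        local chr = characters[codepoint]
        -- Skip code points the font does not contain.
        if chr then
            chr.tounicode = replacement
        end
    end
end

luatexbase.add_to_callback(
    "luaotfload.patch_font",
    patch_tounicode,
    "patch_tounicode"
)
\end{luacode}
```

As with the original patch, this block would go in the preamble before \setmainfont, so that the callback is registered before the font is loaded. Adding further glyphs then only means adding entries to the table.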
