With lualatex, is it possible to type combining diacritics directly as UTF-8 in math mode?

Answer 1

I got it working with a Lua script. Your minimal example becomes:

\documentclass{minimal}
\usepackage{unicode-math}
\setmathfont{XITS Math}
\AtBeginDocument{\directlua{require("combining_preprocessor.lua")}}
\newcommand{\⃗}[1]{\ensuremath{\vec{#1}}}
\begin{document}
$v⃗$
\end{document}

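The idea is that it is hard to get LaTeX to process an accent that comes after its argument, which is how Unicode combining characters work, so we use a preprocessor to move the accent in front of its argument. That is, the script maps v⃗ to \⃗{v}, and you then define \⃗ to do whatever you want. (That is a backslash followed by a combining arrow, which should be rendered above the backslash.)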
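Working through the script by hand, the reordering looks roughly like this. The "seen by TeX" column is derived from reading the code, not captured from a log, so treat it as a sketch:

```latex
% typed in the .tex file      seen by TeX after preprocessing
%   $v⃗$                        $\⃗{v}$
%   $ℂ̂⃑$                       $\⃑{\̂{ℂ}}$   (the first mark typed ends up innermost)
%
% Each \<mark> is then an ordinary one-argument macro that you define yourself:
\newcommand{\⃗}[1]{\ensuremath{\vec{#1}}}
```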

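My Lua script handles most (all?) combining characters, so you only need to define in the .tex file what each of them should do. There can be several accents on the same character. Example: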

\documentclass{minimal}

\usepackage{unicode-math}
\setmathfont{XITS Math}

\AtBeginDocument{\directlua{require("combining_preprocessor.lua")}}

\newcommand{\̂}[1]{\ensuremath{\hat{#1}}}
\newcommand{\⃑}[1]{\ensuremath{\vec{#1}}}
\newcommand{\̱}[1]{\ensuremath{\underline{#1}}}
\newcommand{\́}[1]{\ensuremath{\acute{#1}}}

\usepackage{stackrel}
\newcommand{\᷽}[1]{\ensuremath{\stackrel[\approx]{}{#1}}}

\begin{document}

Hello

$ℂ̂$ is hat on $ℂ$, more on $ℂ̂⃑$ (stress test)

$ℂ̂ x̂$

Many combining accents on $x᷽̱̂́⃑$ is cool.

\end{document}

(My browser does not render many of the combining characters here correctly, but they look fine in the PDF file.)

I am not sure this is the ideal way to do it, but for what it's worth, here is combining_preprocessor.lua:

function minornil(a, b)
   if a == nil and b == nil then
      return nil
   elseif a == nil then
      return b
   elseif b == nil then
      return a
   else
      return math.min(a, b)
   end
end

function findfirstcombining(line, n)
   local a = string.find(line, "\204[\128-\191]", n)     -- From U0300,
   local b = string.find(line, "\205[\128-\175]", n)     -- to U036F.
   a = minornil(a, b)
   b = string.find(line, "\226\131[\144-\176]", n) -- U20D0 to U20F0
   a = minornil(a, b)
   b = string.find(line, "\225\183[\128-\191]", n) -- U1DC0 to U1DFF
   a = minornil(a, b)
   return a
end

function is_utf8_continuation(byte)
   -- UTF-8 continuation bytes are 0x80..0xBF, i.e. 128..191 inclusive.
   return byte >= 128 and byte <= 191
end

function find_next_utf8_char(str, n)
   while str:byte(n) ~= nil and is_utf8_continuation(str:byte(n)) do
      n = n + 1
   end
   return n
end

function combining_iter(str)
   local n = 0
   return function ()
      n = (n ~= nil) and findfirstcombining(str, n + 1)
      return n
   end
end

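function dobuffer(line)
   -- n1 is the first byte not yet copied to the output. Starting at 1
   -- means a mark at byte 1 (which has no base character) is left alone
   -- instead of making line:byte(0) return nil in the loop below.
   local n1 = 1
   local t = {}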
   for n2 in combining_iter(line) do
      if n2 > n1 then
         local n3 = n2
         repeat
            n3 = n3 - 1
         until not is_utf8_continuation(line:byte(n3))
         table.insert(t, string.sub(line, n1, n3 - 1))
         n1 = find_next_utf8_char(line, n2 + 1)
         local comb = {}
         table.insert(comb, "\\" .. string.sub(line, n2, n1 - 1) .. "{")
         table.insert(comb, string.sub(line, n3, n2 - 1) .. "}")
         n2 = findfirstcombining(line, n1)
         while n2 == n1 do
            n1 = find_next_utf8_char(line, n2 + 1)
            table.insert(comb, 1, "\\" .. line:sub(n2, n1 - 1) .. "{")
            table.insert(comb, "}")
            n2 = findfirstcombining(line, n1)
         end
         table.insert(t, table.concat(comb))
      end
   end
   table.insert(t, string.sub(line, n1))
   return table.concat(t)
end

luatexbase.add_to_callback("process_input_buffer",
                           dobuffer, "combining_preprocessor", 1)

Answer 2

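unicode-math does not set a \mathcode for Unicode accents the way it does for other Unicode characters (e.g. math italic), so TeX looks for them in the first math font, which is Computer Modern Math Italic (cmmi10 in the log), and that font does not have the accents (at least not at their Unicode positions).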

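But even if unicode-math did set the \mathcode, math accents would not be positioned properly (as you already noticed), because an accent has to be invoked through the \(U|XeTeX)mathaccent primitive for TeX to do its math accent positioning magic.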
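For illustration, here is a minimal sketch of calling the primitive directly under lualatex. It assumes the argument order `\Umathaccent <class> <family> <slot>`, and that with unicode-math loaded the `operators` family (`\symoperators`) points at the OpenType math font. Both assumptions are worth checking against the LuaTeX manual. `\rawvec` is a made-up name for this example.

```latex
\documentclass{minimal}
\usepackage{unicode-math}
\setmathfont{XITS Math}
% \rawvec is a hypothetical helper; "20D7 is COMBINING RIGHT ARROW ABOVE.
% Assumed syntax: \Umathaccent <class> <family> <slot> <nucleus>
\newcommand{\rawvec}[1]{\Umathaccent 0 \symoperators "20D7 {#1}}
\begin{document}
$\rawvec{v}$ % the accent is placed by TeX's math accent machinery
\end{document}
```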

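It might be possible to make the accents active math characters and map them to the corresponding macros (unicode-math already does this kind of trickery to allow direct input of other Unicode characters), but that is left as an exercise for the reader (read: I don't know how to do it, and the last time I tried to understand that code I almost lost my sanity).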

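The engine itself knows nothing about Unicode characters. It is the responsibility of the user (or the macro package writer) to tell it, using the appropriate primitives and/or math codes, which character is to be treated as an accent, a big operator, an opening symbol and so on (otherwise things would be very inflexible).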

