Automatically merging Unicode double subscripts: aᵢⱼ = a_{i}_{j} into a_{ij}

I would like to use Unicode subscripts. How can I make consecutive subscripts merge automatically into a single subscript?

\documentclass{standalone}
\usepackage{newunicodechar}
\newunicodechar{ᵢ}{_{i}}
\newunicodechar{ⱼ}{_{j}}
\newunicodechar{ₖ}{_{k}}
\newunicodechar{ₗ}{_{l}}
\begin{document}
\begin{tabular}{l}
$a_{ijkl}$ \\$aᵢⱼₖₗ$  % Error: Double subscript.
\end{tabular}
\end{document}

LuaLaTeX seems to handle it:

\documentclass{standalone}
\usepackage{unicode-math}
\begin{document}
\begin{tabular}{l}
$a_{ijkl}$ \\$aᵢⱼₖₗ$
\end{tabular}
\end{document}

Edit: Enlarged the example so that possible spacing differences become visible.


More tests using Jinwen's suggestion:

\documentclass{standalone}
%\usepackage{unicode-math}
\usepackage{newunicodechar}
\newunicodechar{ⁱ}{{}^{i}}
\newunicodechar{ʲ}{{}^{j}}
\newunicodechar{ᵏ}{{}^{k}}
\newunicodechar{ˡ}{{}^{l}}
\newunicodechar{ᵢ}{{}_{i}}
\newunicodechar{ⱼ}{{}_{j}}
\newunicodechar{ₖ}{{}_{k}}
\newunicodechar{ₗ}{{}_{l}}
\begin{document}

% spacing test subscript
\begin{tabular}{l}
   $a_{ijkl}$ \\ $aᵢⱼₖₗ$
\end{tabular}

% spacing test superscript
\begin{tabular}{l}
   $a^{ijkl}$ \\ $aⁱʲᵏˡ$
\end{tabular}

% combined test
\begin{tabular}{l}
   $a^{i}_{j}$ \\ $aⁱⱼ$
\end{tabular}

% reverse combined test
\begin{tabular}{l}
   $a_{j}^{i}$ \\ $aⱼⁱ$
\end{tabular}

% long sub+superscript
\begin{tabular}{l}
   $a^{ijkl}_{ijkl}$ \\ $aⁱʲᵏˡᵢⱼₖₗ$
\end{tabular}

% multiple sub+supscripts
\begin{tabular}{l}
   $a^{ij}_{kl}$ \\ $aⁱₗʲₗ$   % Error: Double subscript. (fair enough!)
\end{tabular} 

\end{document}

Answer 1

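Here is an approach that answers your original question: grouping the scripts together. Taking superscripts as an example, we have

  • \@unisupA, which inserts \sp\bgroup at the beginning;
  • \@unisupB, which checks whether the next macro is \@unisupA. If it is, another superscript follows and nothing needs to be done; if it is not, we have reached the end and \egroup should be inserted;
  • a conditional \if@unisup, which is also needed to make the logic work.

However, this method does not allow interleaving subscripts and superscripts, as in the last example.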
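
The token flow is perhaps easiest to follow without Unicode input. The sketch below reproduces the same open/close logic with two ASCII macros, \supI and \supJ. These names, like \demo@open, \demo@close and \if@demosup, are made up for illustration; they stand in for the active characters and for \@unisupA, \@unisupB and \if@unisup.

```latex
\documentclass{article}
\makeatletter
% Hypothetical stand-ins for \if@unisup, \@unisupA and \@unisupB.
\newif\if@demosup
\newcommand{\demo@open}{\if@demosup\else\sp\bgroup\fi}
\newcommand{\demo@close}{%
  \@ifnextchar\demo@open{\@demosuptrue}{\egroup\@demosupfalse}}
% ASCII replacements for the Unicode superscript characters.
\newcommand{\supI}{\demo@open i\expandafter\demo@close}
\newcommand{\supJ}{\demo@open j\expandafter\demo@close}
\makeatother
\begin{document}
% Token flow for $a\supI\supJ$:
%   \supI        -> \demo@open inserts \sp\bgroup, then "i" is typeset
%   \expandafter -> expands \supJ once, so \demo@close sees \demo@open next
%   \demo@close  -> sets the flag, so the following \demo@open does nothing
%   "j" is typeset; the next token is "$", so \demo@close inserts \egroup
$a\supI\supJ$ and $a^{ij}$
\end{document}
```

Both formulas should come out the same, since the macros end up producing a single \sp\bgroup ij\egroup.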

\documentclass{standalone}
\usepackage{newunicodechar}
\makeatletter
\newif\if@unisup\@unisupfalse
\newcommand{\@unisupA}{\if@unisup\else\sp\bgroup\fi}
\newcommand{\@unisupB}{\@ifnextchar\@unisupA{\@unisuptrue}{\egroup\@unisupfalse}}
\newunicodechar{ⁱ}{\@unisupA i \expandafter\@unisupB}
\newunicodechar{ʲ}{\@unisupA j \expandafter\@unisupB}
\newunicodechar{ᵏ}{\@unisupA k \expandafter\@unisupB}
\newunicodechar{ˡ}{\@unisupA l \expandafter\@unisupB}
\newif\if@unisub\@unisubfalse
\newcommand{\@unisubA}{\if@unisub\else\sb\bgroup\fi}
\newcommand{\@unisubB}{\@ifnextchar\@unisubA{\@unisubtrue}{\egroup\@unisubfalse}}
\newunicodechar{ᵢ}{\@unisubA i \expandafter\@unisubB}
\newunicodechar{ⱼ}{\@unisubA j \expandafter\@unisubB}
\newunicodechar{ₖ}{\@unisubA k \expandafter\@unisubB}
\newunicodechar{ₗ}{\@unisubA l \expandafter\@unisubB}
\makeatother
\begin{document}

% spacing test subscript
\begin{tabular}{l}
   $a_{ijkl}$ \\ $aᵢⱼₖₗ$
\end{tabular}

% spacing test superscript
\begin{tabular}{l}
   $a^{ijkl}$ \\ $aⁱʲᵏˡ$
\end{tabular}

% combined test
\begin{tabular}{l}
   $a^{i}_{j}$ \\ $aⁱⱼ$
\end{tabular}

% reverse combined test
\begin{tabular}{l}
   $a_{j}^{i}$ \\ $aⱼⁱ$
\end{tabular}

% long sub+superscript
\begin{tabular}{l}
   $a^{ijkl}_{ijkl}$ \\ $aⁱʲᵏˡᵢⱼₖₗ$
\end{tabular}

% multiple sub+supscripts
% \begin{tabular}{l}
%    $a^{ij}_{kl}$ \\ $aⁱₗʲₗ$   % Error: Double subscript. (fair enough!)
% \end{tabular}

\end{document}

Here is the result for your own example:

(image: rendered output of the example above)


Old answer:

You can add an empty group before the subscript.

\documentclass{standalone}
\usepackage{newunicodechar}
\newunicodechar{ᵢ}{{}_{i}}
\newunicodechar{ⱼ}{{}_{j}}
\begin{document}$aᵢⱼ$\end{document}
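
Why this avoids the error: with the empty group, every subscript after the first attaches to an empty nucleus of its own rather than to a, so TeX never sees two subscripts on one atom. The scripts are not really merged, though, and the spacing can differ from a_{ij} (TeX adds \scriptspace after each script). A small comparison, meant as a sketch to compile and inspect rather than as a measurement:

```latex
\documentclass{article}
\begin{document}
% One atom with a single, merged subscript.
$a_{ij}$

% What the empty-group definitions expand to: the second subscript
% sits on an empty nucleus, so there is no "Double subscript" error.
$a_{i}{}_{j}$

% Without the empty group TeX stops with "Double subscript":
% $a_{i}_{j}$
\end{document}
```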

Answer 2

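After many attempts and some research, including digging into how Unicode works in 8-bit engines, I found a solution that works in both LuaTeX and pdfTeX. The key is to use

\expandafter\futurelet\expandafter\successor\expandafter\check@successor%

to store the once-expanded successor token. With pdfTeX, if the successor is a Unicode character, this makes it one of \UTFviii@four@octets, \UTFviii@three@octets, or \UTFviii@two@octets. We can then dispatch to a function that grabs the next 1+n tokens and combines them into the Unicode character. After that, we expand this character once and compare the result with \subscript.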
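
To see what the \expandafter chain changes, here is a small sketch that peeks at the next token with and without it. All names (\peekraw, \peekexpanded, \wrapper, \demo@tok, \demo@show) are invented for this illustration, and the file is assumed to be saved as UTF-8.

```latex
\documentclass{article}
\makeatletter
% Report what \futurelet stored in \demo@tok.
\newcommand{\demo@show}{%
  \ifx\demo@tok\UTFviii@two@octets
    \typeout{successor is \string\UTFviii@two@octets}%
  \else
    \typeout{successor: \meaning\demo@tok}%
  \fi}
% Plain \futurelet: looks at the next token as it stands.
\newcommand{\peekraw}{\futurelet\demo@tok\demo@show}
% With the \expandafter chain: the next token is expanded once first.
\newcommand{\peekexpanded}{%
  \expandafter\futurelet\expandafter\demo@tok\expandafter\demo@show}
\newcommand{\wrapper}{\textbf{x}}
\makeatother
\begin{document}
\peekraw\wrapper       % log shows the meaning of \wrapper itself
\peekexpanded\wrapper  % log shows the meaning of \textbf, its first token
% Under pdfLaTeX this should take the \UTFviii@two@octets branch, as the
% answer describes; under LuaLaTeX the character is one ordinary token.
\peekexpanded é
\end{document}
```

In each case the peeked tokens stay in the input stream, so the text is still typeset normally after the log message.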

File: unicode-subscript.sty

% region preamble --------------------------------------------------------------
% IMPLEMENTATION BASED ON \expandafter + \futurelet
%
% Provides public command: `\subscript{arg}`
% Internally uses the namespace `\usubscript@`
%
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{unicode-subscript}[2024/02/21 Combining Subscripts]
\RequirePackage{iftex}
%
% Usage: \newunicodechar{ᵢ}{\subscript{i}}
% This allows multiple Unicode subscripts to be used in succession:
% - `xᵢⱼₖ` ⇝ `x\textsubscript{ijk}`
% - `$xᵢⱼₖ$` ⇝ `$x_{ijk}$`
%
% The package is designed to work with both pdftex and luatex.
% Note: Usage of the form `x\subscript{i}\subscript{j}` is not supported.
% endregion preamble -----------------------------------------------------------


% region Package Options -------------------------------------------------------
\newif\ifusubscript@debug\usubscript@debugfalse%  Debug flag
\newif\ifusubscript@testing\usubscript@testingfalse%  Testing flag
\DeclareOption{debug}{\usubscript@debugtrue}
\ProcessOptions\relax%
% endregion Package Options ----------------------------------------------------


% region globals and helper functions ------------------------------------------
% global subscript list variable
\newcommand{\usubscript@start}{\relax}%  marker for the start of a subscript
\newcommand{\usubscript@list@reset}{\let\usubscript@list=\usubscript@start}
\newcommand{\usubscript@list@append}[1]{\edef\usubscript@list{\unexpanded\expandafter{\usubscript@list#1}}}
\usubscript@list@reset% initialize the list

\newcommand{\usubscript@log}[1]{%
%
% Prints the given message if the debug flag is set.
%
\ifusubscript@debug\PackageInfo{subscript}{#1}\fi%
}

\newcommand{\usubscript@getfirsttok}[2]{%
%
% stores the first token of #2 in #1
%
\def\@extract##1##2\@terminator{\let#1=##1}%
\expandafter\@extract#2\@terminator%
}

% select the correct dispatch function
\ifpdftex%
    \def\usubscript@check@successor{\usubscript@check@successor@pdftex}%
\else%
    \def\usubscript@check@successor{\usubscript@check{\usubscript@successor}}%
\fi%
% endregion globals and helper functions ---------------------------------------


% region public interface ------------------------------------------------------
\newcommand{\subscript}[1]{%
%
% 1. If we are already in a subscript, \subscript appends the given tokens to \usubscript@list.
%    Otherwise, it starts a new \usubscript@list with them.
% 2. Executes \usubscript@check@successor which determines if the next character is also a subscript.
%    In this case, we go back to 1, else we stop the process.
%
\ifx\usubscript@list\usubscript@start%
    % Initialize the list with the first token.
    \usubscript@log{Initializing list with '\meaning#1'}%
    \def\usubscript@list{#1}%
\else%
    % Append token to existing list.
    \usubscript@log{Appending '\meaning#1' to '\usubscript@list'}%
    \usubscript@list@append{#1}%
\fi%
%
% Check the next token to determine whether to continue the subscript or to terminate it
% Expands successor first before \futurelet, this is important to handle unicode in pdftex
\expandafter\futurelet\expandafter\usubscript@successor\expandafter\usubscript@check@successor%
}
% endregion public interface ---------------------------------------------------


% region private implementation ------------------------------------------------
\newcommand{\usubscript@check}[1]{%
%
% Test whether to terminate the subscript
%
\usubscript@log{Testing against '\meaning#1'}%
%
\ifx#1\subscript%
    \usubscript@log{ >>> Successor is another subscript!}%
\else%
    \usubscript@log{ >>> Successor is not a subscript!}%
    \usubscript@finalize%
\fi%
}


\newcommand{\usubscript@finalize}{%
%
% Terminate the subscript and insert the result
%
\usubscript@log{Terminating with current list '\meaning\usubscript@list'}%
%
\ifmmode%
    \usubscript@log{ >>> Inserting '_{\meaning\usubscript@list}{}'}%
    \sb\bgroup\usubscript@list\egroup%
\else%
    \usubscript@log{ >>> Inserting '\textsubscript{\meaning\usubscript@list}'}%
    \textsubscript{\usubscript@list}%
\fi%
%
\usubscript@list@reset%
}


\newcommand{\usubscript@check@successor@pdftex}{%
%
% There are 2 cases we consider:
% 1. The next token is a subscript, in which case we continue the process.
% 2. The next token is some unicode character, in which case:
%   2.1. We grab the necessary number of tokens if using an 8-bit engine
%   2.2. We expand the unicode character once to get the replacement tokens.
%   2.3. We compare the first token of the replacement tokens to the subscript token.
%
\usubscript@log{>>> Dispatching on '\meaning\usubscript@successor'}%
%
\ifx\usubscript@successor\UTFviii@four@octets%
    \usubscript@log{ >>> Detected Unicode 4 octets}%
    \def\usubscript@execute{\usubscript@check@unicode@four}%
\else\ifx\usubscript@successor\UTFviii@three@octets%
    \usubscript@log{ >>> Detected Unicode 3 octets}%
    \def\usubscript@execute{\usubscript@check@unicode@three}%
\else\ifx\usubscript@successor\UTFviii@two@octets%
    \usubscript@log{ >>> Detected Unicode 2 octets}%
    \def\usubscript@execute{\usubscript@check@unicode@two}%
\else%
    \usubscript@log{ >>> Detected non-Unicode}%
    \def\usubscript@execute{\usubscript@check{\usubscript@successor}}%
\fi\fi\fi%
%
% dispatch the selected command.
%
\usubscript@execute%
}%


\newcommand{\usubscript@check@unicode@four}[5]{% grabs 1+4 tokens
%
\usubscript@log{>>> Expand Unicode Quadruplet}%
%
\unless\ifcsname u8:#1#2#3#4#5\endcsname%
    \PackageError{subscript}{Detected undefined unicode.}{}% third argument (help text) is required
\fi%
%
\expandafter\let\expandafter\usubscript@token\csname u8:#1#2#3#4#5\endcsname%
\usubscript@log{Detected unicode '\meaning\usubscript@token'}%
%
\usubscript@getfirsttok{\usubscript@firsttoken}{\usubscript@token}%
\usubscript@check{\usubscript@firsttoken}%
%
\usubscript@log{Reinserting '\meaning#1#2#3#4#5'}%
#1#2#3#4#5%
}


\newcommand{\usubscript@check@unicode@three}[4]{% grabs 1+3 tokens
%
\usubscript@log{>>> Expand Unicode Triplet}%
%
\unless\ifcsname u8:#1#2#3#4\endcsname%
    \PackageError{subscript}{Detected undefined unicode.}{}% third argument (help text) is required
\fi%
%
\expandafter\let\expandafter\usubscript@token\csname u8:#1#2#3#4\endcsname%
\usubscript@log{Detected unicode '\meaning\usubscript@token'}%
%
\usubscript@getfirsttok{\usubscript@firsttoken}{\usubscript@token}%
\usubscript@check{\usubscript@firsttoken}%
%
\usubscript@log{Reinserting '\meaning#1#2#3#4'}%
#1#2#3#4%
}


\newcommand{\usubscript@check@unicode@two}[3]{% grabs 1+2 tokens
%
\usubscript@log{>>> Expand Unicode Duplet}%
%
\unless\ifcsname u8:#1#2#3\endcsname%
    \PackageError{subscript}{Detected undefined unicode.}{}% third argument (help text) is required
\fi%
%
\expandafter\let\expandafter\usubscript@token\csname u8:#1#2#3\endcsname%
\usubscript@log{Detected unicode '\meaning\usubscript@token'}%
%
\usubscript@getfirsttok{\usubscript@firsttoken}{\usubscript@token}%
\usubscript@check{\usubscript@firsttoken}%
%
\usubscript@log{Reinserting '\meaning#1#2#3'}%
#1#2#3%
}
% endregion private implementation ---------------------------------------------

\endinput
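
The accumulator idiom behind \usubscript@list@append can be tried on its own. The sketch below uses made-up names (\demo@list, \demoappend, \demoshow, \demouse) and only shows how tokens are collected unexpanded and then used once as a single subscript; it is not part of the package.

```latex
\documentclass{article}
\makeatletter
% Hypothetical names, for illustration only.
\newcommand{\demo@list}{}
\newcommand{\demoappend}[1]{%
  % \expandafter expands \demo@list once before \unexpanded freezes the
  % braced text, so the old contents and #1 are stored without expansion.
  \edef\demo@list{\unexpanded\expandafter{\demo@list#1}}}
\newcommand{\demoshow}{\typeout{list: \meaning\demo@list}}
\newcommand{\demouse}{\demo@list}
\makeatother
\begin{document}
\demoappend{i}\demoappend{\alpha}\demoappend{k}
\demoshow           % the log lists the stored tokens; \alpha stays a token
$x_{\demouse}$ and $x_{i\alpha k}$
\end{document}
```

Collecting everything first and emitting one \sb\bgroup ... \egroup at the end is what lets the package produce a single merged subscript instead of several separate ones.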

File: tests.tex

\documentclass{standalone}

\usepackage{newunicodechar}
\usepackage[debug]{unicode-subscript}
\AtBeginDocument{
\newunicodechar{ᵢ}{\subscript{i}}
\newunicodechar{ⱼ}{\subscript{j}}
\newunicodechar{ₖ}{\subscript{k}}
\newunicodechar{ₗ}{\subscript{l}}
\newunicodechar{ₘ}{\subscript{m}}
\newunicodechar{ₙ}{\subscript{n}}
}


\newcommand{\needsfour}[4]{#4#3#2#1}

\begin{document}

$aᵢⱼₖ$

a\subscript{ij$kl$mn}

$a\subscript{ijk}$

% test mathmode
\begin{tabular}{l}
$a_{ijklmn}$
\\ $aᵢⱼₖₗₘₙ$
\\ $a\subscript{ijklmn}$
\end{tabular}

% test textmode
\begin{tabular}{ll}
a\textsubscript{ijklmn}
\\ aᵢⱼₖₗₘₙ
\\ a\subscript{ijklmn}
\end{tabular}

$aᵢ\needsfour4321$

dₘ

\end{document}

Result

(image: rendered output of tests.tex)
