Accessing the first item in a token list in a completely robust way (expandably)


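Given a token list (e.g. \a\b\c or { ab} c), I define the first item as what \@gobble would take as its argument (recall the definition \long\def\@gobble#1{}). It is not hard to design a macro that extracts the first item from a token list and wraps it in, for example, eTeX's \unexpanded: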

\begingroup
  \catcode`@=11
  \long\gdef\firstofmany#1{\@firstofmany#1\@marker}
  \long\gdef\@firstofmany#1#2\@marker{\unexpanded{#1}}
\endgroup
\message{"\firstofmany{\a\b\c}"} % => "\a "
\message{"\firstofmany{ { ab} c}"} % => " ab"

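However, this macro fails if the token list contains the marker (here I chose \@marker). Is it possible to write a variant of this macro that works for arbitrary token lists? (I don't care what happens for token lists that consist only of spaces and therefore have no first item.)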
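The failure can be reproduced with a minimal sketch such as the following. It assumes an eTeX-enabled engine (for \unexpanded); \result and the visible expansion given to \@marker are introduced here purely for illustration and are not part of the question's code.

```tex
% Run with an eTeX-enabled engine (for \unexpanded), e.g. pdftex.
\catcode`@=11
\long\def\firstofmany#1{\@firstofmany#1\@marker}
\long\def\@firstofmany#1#2\@marker{\unexpanded{#1}}
% Assumption for this demo only: give \@marker a visible expansion.
\def\@marker{<marker>}
% The \@marker inside the input ends #2 early, so "yz" and the
% macro's own \@marker are left behind instead of being discarded.
\edef\result{\firstofmany{x\@marker yz}}
\message{[\meaning\result]} % => [macro:->xyz<marker>]
\bye
```

Only "x" should have survived; everything after it leaks out because the first \@marker in the input is taken as the delimiter.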

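Edit: I should have made it clearer that the solution is not to produce a less common delimiter. I want a \firstofmany that does not choke on any part of its own definition (or, of course, of its auxiliary definitions).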

Answer 1

With the \expanded primitive, a faster alternative is available that is just as robust as the version provided by @BrunoLeFloch.

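It also uses an unbalanced closing brace as the right delimiter, but because it relies on the \expanded primitive, it does not have to make sure that a macro always sits at the left end of the input in order to be safe under f-type expansion.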

\begingroup
  \catcode`@=11
  \long\gdef\firstofmany#1%
    {%
      \expanded{\iffalse{\fi\firstofmany@aux#1{}}}%
    }
  \long\gdef\firstofmany@aux#1%
    {%
      \unexpanded{#1}%
      \expandafter\firstofmany@gobble\expandafter{\iffalse}\fi
    }
  \long\gdef\firstofmany@gobble#1{}
\endgroup

\long\def\test#1%
  {\message{|\unexpanded\expandafter{\romannumeral-`q#1}|}}

\test{\firstofmany{ a bc}}
\test{\firstofmany{ {a\a} bc}}
\test{\firstofmany{ {a\a} abc abc abc}}
\test{\firstofmany{ }}

\csname stop\endcsname
\csname bye\endcsname

To get the same behaviour as the code in Bruno's answer, one can put another \unexpanded around it, like this:

\begingroup
  \catcode`@=11
  \long\gdef\firstofmany#1%
    {%
      \unexpanded\expanded{{\iffalse{\fi\firstofmany@aux#1{}}}}%
    }
  \long\gdef\firstofmany@aux#1%
    {%
      \unexpanded{#1}%
      \expandafter\firstofmany@gobble\expandafter{\iffalse}\fi
    }
  \long\gdef\firstofmany@gobble#1{}
\endgroup

\long\def\test#1%
  {\message{|\unexpanded\expandafter{\romannumeral-`q#1}|}}

\test{\firstofmany{ a bc}}
\test{\firstofmany{ {a\a} bc}}
\test{\firstofmany{ {a\a} abc abc abc}}
\test{\firstofmany{ }}

\csname stop\endcsname
\csname bye\endcsname
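As a quick check of the robustness claim, the following sketch repeats the \unexpanded\expanded variant (with plain \def instead of \gdef inside a group) and feeds it an input that contains the helper macros themselves. It assumes an engine that provides \expanded; \result is a name used only for this illustration.

```tex
% Needs \expanded: pdfTeX or XeTeX from TeX Live 2019 onwards, or LuaTeX.
\catcode`@=11
\long\def\firstofmany#1{%
  \unexpanded\expanded{{\iffalse{\fi\firstofmany@aux#1{}}}}}
\long\def\firstofmany@aux#1{%
  \unexpanded{#1}%
  \expandafter\firstofmany@gobble\expandafter{\iffalse}\fi}
\long\def\firstofmany@gobble#1{}
% The input deliberately contains the helper macros themselves.
\edef\result{\firstofmany{\firstofmany@gobble\firstofmany@aux xyz}}
\message{[\meaning\result]} % => [macro:->\firstofmany@gobble ]
\bye
```

Because the end of the input is found through the brace that \iffalse{\fi hides at definition time rather than through a delimiter token, no token in the input can be mistaken for the end marker.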

Answer 2

If you allow some pdfTeX primitives, I think you can do it like this, using the entire input list as the marker.

\begingroup
  \catcode`@=11
  \long\gdef\firstofmany#1{%
    \@fom{\unexpanded{[#1]}}#1{[#1]}}

  \long\gdef\@fom#1#2{%
   \unexpanded{#2}%
   \@gobbleto{#1}}

\gdef\@gobbleto#1#2{%
  \ifnum\pdfstrcmp{\unexpanded{#2}}{#1}=\z@
  \expandafter\@gobbletwo
  \else
  \fi
  \@gobbleto{#1}}

\gdef\@gobbletwo#1#2{}

\endgroup



\message{"\firstofmany{\a\b\c}"} % => "\a "
\message{"\firstofmany{ { ab} c}"} % => " ab"


\bye
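The loop in \@gobbleto keeps discarding items until \pdfstrcmp reports that the next item is identical to the marker built from the whole input. As a minimal sketch of the primitive's behaviour (pdfTeX only; XeTeX names it \strcmp), it expands both arguments and yields 0 for equal strings and -1 or 1 otherwise:

```tex
% pdfTeX only: \pdfstrcmp compares the fully expanded arguments as strings.
\message{[\pdfstrcmp{abc}{abc}] [\pdfstrcmp{abc}{abd}] [\pdfstrcmp{abd}{abc}]}
% => [0] [-1] [1]
\bye
```

Since both arguments are fully expanded before the comparison, the answer wraps them in \unexpanded so that the items are compared as written.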

Answer 3

Edit: Much, much shorter. What was I thinking before? Or am I just sleepy now? :-)

Strip the opening brace using \string and \@gobble, take out the first item, then put the brace back.

\catcode`@=11
\def\@gobble#1{}

\def\firstofmany{\expandafter\expandafter\expandafter
                 \fom@getfirst\expandafter\@gobble\string}
\def\fom@getfirst#1{\unexpanded{#1}\fom@gobble}
\def\fom@gobble{\expandafter\expandafter\expandafter
                     \expandafter\expandafter\expandafter
                     \expandafter\@gobble\iftrue\expandafter{\else}\fi}

\message{"\firstofmany{\a\b\c}"}

\bye

However, this version lacks proper handling of an initial group (the group is found, but it is written out without its braces).
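The brace-stripping step used by the short version can be tried in isolation. This is only a sketch for plain TeX; \stripbrace is a hypothetical name that does not appear in the answer.

```tex
% Plain TeX; \stripbrace is a hypothetical name used only in this sketch.
\catcode`@=11
\def\@gobble#1{}
% \string turns the following { into a character of category 12,
% which \@gobble then removes. The matching } stays behind, unbalanced.
\def\stripbrace{\expandafter\@gobble\string}
% \iftrue{\else}\fi supplies a fresh { to pair with the leftover },
% the same device \fom@gobble uses.
\message{[\iftrue{\else}\fi\stripbrace{abc}]} % => [{abc}]
\bye
```

In the answer, the freshly supplied brace opens the argument of \@gobble instead, so the remainder of the list is discarded together with the leftover closing brace.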


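I think I may have found a pure eTeX solution. I tested it in every way I could think of and it seems to work ... except for blank lists (error) and lists that start with a space (the space is ignored). As stated above, though, neither of these matters.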

I don't know how it fares in terms of speed; I am no expert, but things are rather complicated ...

Before giving the code, here is a conceptual overview.

  1. The list is detokenized. (Hence eTeX only.)

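  2. In the detokenized version, the brace groups are counted. (This part could be optimized by using TeX's macro argument parsing mechanism, but at this stage I implemented it for clarity rather than speed.) It is assumed that { and } (and only these) have catcodes 1 and 2, but I believe this could easily be generalized.

    The number of outer groups is passed on as a list of *s of the appropriate length.

    OK, that was the easy part :-)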

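  3. The idea is to use \string to take the group apart: stringify the opening brace, then gobble it. The problem, however, is how to expand \string and \gobble, since our *-based "counter" is in the way ... (By the way, it seems to me that passing the counter behind the ungrouped list, as part of the argument list, is out of the question, because we don't want to use a fixed delimiter.)

    Part of the solution is \let*\expandafter. We need to expand two macros after the *-counter, so we pass through the stars twice and only 1/4 of them remain. But if we "multiply" the counter by four beforehand, all is well. :-)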

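  4. Once the group has been taken apart, we can easily access the first item. True, a first item that is itself a group (and similar cases) needs some extra care, but overall this part is more tedious than inventive.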

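  5. The only bit of magic left is the gobbling. We alternate between gobbling an outer group and gobbling the tokens between groups. Because we know how many outer groups there are, we know when to stop, so we never run into the now lonely closing brace (which, of course, we provide with a partner in the end).

    We gobble the tokens between outer groups using the \def\gobble...#{ trick (TeXbook p. 204).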
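The #{ trick mentioned in step 5 can be seen in a small standalone sketch. \gobbletobrace is a hypothetical name chosen for this illustration; the answer's own macro for this step is \fom@gobbletogroup@.

```tex
% \gobbletobrace is a hypothetical name for this sketch.
% A parameter text ending in # makes the next { the delimiter of #1;
% TeX also puts that { back at the end of the replacement text.
\long\def\gobbletobrace#1#{}
\message{[\gobbletobrace abc{def}ghi]} % => [{def}ghi]
\bye
```

The tokens in front of the group are discarded while the group itself stays in place, which is what allows the alternation between "gobble up to a group" and "gobble a group".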

\catcode`@=11

\def\afterfi#1#2\fi{\fi#1}
% use \onefi etc after these
\def\afterfifi#1#2#3\fi#4\fi{#1#2}
\def\afterfififi#1#2#3\fi#4\fi#5\fi{#1#2}
\def\afterfifififi#1#2#3\fi#4\fi#5\fi#6\fi{#1#2}
\def\onefi{\fi}
\def\twofi{\fi\fi}
\def\threefi{\fi\fi\fi}
\def\fourfi{\fi\fi\fi\fi}
\def\gobble#1{}

\def\openingbrace{\iftrue{\else}\fi}
\def\closingbrace{\iffalse{\else}\fi}

% Detokenize (while preserving the original)
\long\def\firstofmany#1{%
  \expandafter\fom@countfirstlevelgroups\detokenize{#1}de{}{}{#1}%
}

\catcode`*=13  % we'll be counting stars
\def\if@zero\if#1#2/{%   % zero test
  \ifx#1/%
    \afterfi{\if@zero@yes}%
  \else
    \afterfi{\if@zero@no}%
  \fi
}
\def\if@zero@yes{\iftrue}
\def\if@zero@no/{\iffalse}

{\catcode`(=1 \catcode`)=2 (\catcode`{=12 \catcode`}=12

\xdef\detok@openingbrace({)%

% Count the number of outer brace pairs
%
% Note 1: This macro is very non-optimized... it should use TeX's macro
% argument parsing mechanism to search for { and }, and shouldn't use
% all these \afterfi-s, I used this approach just for clarity.
%
% Note 2: This macro expects precisely { and } to be of catcode 1 and 2.
% This could be fixed, but it's not worth the effort at this point.
%
% Args: #1#2 = detokenized, #3 = n, #4 = depth
% --> letters are safe delimiters, because \detokenize produces `other's
% We save the very first token for later (#5 below).
\gdef\fom@countfirstlevelgroups#1#2e#3#4(%  
  \fom@countfirstlevelgroups@#1#2e(#3)(#4)#1% 
)
\gdef\fom@countfirstlevelgroups@#1#2e#3#4#5(% 
  \ifx#1d% end of detokenized string
    \afterfifififi(\onefi)(\fom@removeopeningbrace#5(#3))%
  \else
    \ifx#1{% { found ==> increase depth
      \if@zero\if#4//% { found at zero depth ==> increase n 
        \afterfifififi(\threefi)(\fom@countfirstlevelgroups@#2e(#3*)(#4*)#5)%
      \else
        \afterfifififi(\threefi)(\fom@countfirstlevelgroups@#2e(#3)(#4*)#5)%
      \fi
    \else
      \ifx#1}% } found => decrease depth
        \afterfififi(\threefi)(\fom@cflg@decreasedepth#2e(#3)[#4]#5)%
      \else % neither { nor } found ==> go to next char
        \afterfififi(\threefi)(\fom@countfirstlevelgroups@#2e(#3)(#4)#5)%
      \fi
    \fi
  \fi
)
\gdef\fom@cflg@decreasedepth#1e#2[#3#4]#5(%
  \fom@countfirstlevelgroups@#1e(#2)(#4)#5)

)}  % back to normal braces

% Remove the initial brace.
% *s are quadrupled to expand first \string (followed by }, we know)
% and \gobble, thus destroying the group; we will be left with the
% original number of *s
\let*\expandafter
\def\fom@removeopeningbrace#1#2{% #2=***** (n), #1=the first *token*
  \expandafter\expandafter\expandafter\fom@adddummy
  \expandafter\expandafter\expandafter#1%
  #2#2#2#2\expandafter\expandafter\expandafter e%
  \expandafter\gobble\string
}

% Insert a dummy group (and a *) after the first item. We will
% start gobbling by gobbling to a group and this would fail if there
% were none.  This needs to be done before checking for group below,
% so that we have enough *s.
\long\def\fom@adddummy#1#2e#3{%
  \fom@checkforgroup#1#2*e{#3}{}%
}

% Group as the first item requires special attention.  (Note: a space
% would need it as well, but spaces never get here anyway: they
% disappear when \fom@countfirstlevelgroups is expanded.)
% #1 = the first token of the detokenized first item (will be now
% finally discarded)
% #2#3 = *s (if we will find an opening brace, one * will be removed)
\def\fom@checkforgroup#1#2#3e{%
  \if\detok@openingbrace#1%
    \afterfi{\fom@havegroup#3e}%
  \else
    \afterfi{\fom@getfirstitem#2#3e}%
  \fi
}
% Put extra braces around the first item which is a group.
\long\def\fom@havegroup#1e#2{\fom@getfirstitem#1e{{#2}}}

% Get the first item, then call the gobblers: insert two markers
% instead of one, the gobblers need them.
\long\def\fom@getfirstitem#1e#2{%
  \unexpanded{#2}%
  \fom@gobbletogroup#1*ef{}%
}

% Gobble: we know how many groups we have (as many as *s), so we
% can gobble by alternating \fom@gobbletogroup...#3#{...}
% and \fom@gobblegroup...#3{...}
% #1#2=*s, #3=toks before group; but first check if there are any
% *s left!
\def\fom@gobbletogroup#1#2f{%
  \ifx#1e%
    \afterfi\fom@finish
  \else
    \afterfi{\fom@gobbletogroup@#1#2f}%
  \fi
}
\long\def\fom@gobbletogroup@#1#2f#3#{%
  \fom@gobblegroup#1#2f%
}

% #1#2=*s, #3=the group
\def\fom@gobblegroup#1#2f{%
  \ifx#1e%
    \afterfi\fom@finish
  \else
    \afterfi{\fom@gobblegroup@#1#2f}%
  \fi
}
\long\def\fom@gobblegroup@#1#2f#3{%
  \fom@gobbletogroup#2f%
}

\def\fom@finish{%
  \iftrue\expandafter\fom@finish@\expandafter{\else}\fi
}
\long\def\fom@finish@#1{}

% TEST:
\message{"\firstofmany{#1\fom@gobblegroup{\par #1  # @@@ef

aa}a**aa{first} l{ine{%
\fom@gobblegroup\fi\fi
}s}econd line} efef "}

\bye

Answer 4

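A delimiter seems necessary, since you don't know how many items have to be thrown away. A workaround might be to insert a delimiter that is very unlikely to appear in a real-world document: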

\begingroup
  \catcode`@=11
  \edef\funny{\detokenize{&${}$&}}
  \long\xdef\firstofmany#1{\noexpand\@firstofmany#1\funny}
  \edef\x{\long\gdef\noexpand\@firstofmany##1##2\funny}\x{\unexpanded{#1}}
\endgroup
\message{"\firstofmany{\a\b\c}"} % => "\a "
\message{"\firstofmany{ { ab} c}"} % => " ab"

Make \funny as complicated as you wish. I'm afraid that empty token lists, or lists consisting only of spaces, will break.
