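Is there a (reasonably) efficient macro that does something similar to
\long\def\comparets#1#2{\def\aa{#1}\def\bb{#2}\ifx\aa\bb true\else false\fi}
except that it is expandable (i.e. \newcomparets{<tokens1>}{<tokens2>} would expand to 'true' or 'false', including inside \edef)? I am looking for a 'pure' TeX solution (i.e. no extensions such as e-TeX). I have looked at the l3tl macros, but they seem to use e-TeX. The solution should work for arbitrary token sequences (including ones that contain all kinds of 'funny spaces', braces, and arbitrary control sequences). I cannot seem to find a way to do this without making several passes.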
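To make the obstacle concrete, here is a small plain TeX sketch of what happens to \comparets in an expansion-only context. The name \result and the two \let initialisations are illustrative assumptions, not part of the question. \def is not expandable, so inside \edef the two assignments are never carried out, and \ifx ends up comparing whatever \aa and \bb meant beforehand.

```tex
% Plain TeX. The non-expandable macro from the question:
\long\def\comparets#1#2{\def\aa{#1}\def\bb{#2}\ifx\aa\bb true\else false\fi}

% Give \aa and \bb a harmless, unexpandable meaning so that \edef does not
% stop with an "Undefined control sequence" error (illustrative setup only).
\let\aa\relax \let\bb\relax

% In an expansion-only context the \def tokens are copied, not executed,
% and \ifx sees the OLD meanings of \aa and \bb (both \relax here).
\edef\result{\comparets{x}{y}}
\message{[\meaning\result]}
% Expected on the terminal, even though x and y differ:
% [macro:->\def \aa {x}\def \bb {y}true]
\bye
```

An expandable replacement therefore has to get by without any assignment at all, which is what makes the problem hard.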
Answer 1
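I am not sure whether posting this as an answer to my own question is appropriate, since it does not really answer the question, but I will not mark it as accepted (even assuming I could), so if someone has a flash of inspiration and solves the original problem, I will be happy to mark that as the real answer.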
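Now for the macros. I apologise in advance for the shape they are in. They were pulled out of various pieces of code I have written over the years (and renamed, more than once), so the style is somewhat ... eclectic, shall we say. What follows can be optimised a lot, but the problem of multiple passes remains, as I explain later, so if anyone has a neat trick to get around it, please let me know.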
A few things to note:
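1) The actual comparison macro is missing; only the analysis part is present. It produces, in a 'prefix expandable' manner (say, using the \romannumeral-1 trick), a string of category 11 and 12 tokens that carries enough information to identify every token in the sequence: its category, its character code (if any), whether it is a brace, and so on. Such strings can be compared directly if desired.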
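2) Well, 1) is a white lie in two respects:
a) Any token that can be grabbed as an argument (i.e. non-space, non-brace) is (grabbed and) replaced by its \meaning string enclosed in t ... e (both t and e are category 11); note that category 10 tokens whose character code is not 32 fall into this category (pun intended). \yygrabtokenraw can be adjusted to provide a better analysis (which it must be if the goal is to compare arbitrary balanced token lists, but that just boils down to a few carefully written conditionals). Note that simply applying \string is not enough, because \escapechar can be -1.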
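b) The 'top level' recursion step is missing. The main problem here is braces with character code 32: they are dealt with at the last stage, when the length of the sequence is known and each of them can be examined with \string or \meaning. Well, not so fast: if their character code is 32, both \meaning and \string turn them into ordinary spaces (that \meaning then ends with two spaces does not help either), a problem \detokenize was invented to fix. So we need to decide how to grab them. One guarantee the code makes is that every opening brace is correctly identified as having character code 32 (o1e or c1e) or a character code other than 32 (o2e, c2e). The code that does this tampers with some of the closing braces that follow (with their character codes) so that the braces can be used safely; as a result, a c2e 'token' after the first brace is unreliable (an o1e, o2e or c1e, on the other hand, can be taken at face value; a c1e, for instance, really is a brace with character code 32). The next iteration can grab the 'deciphered' braces without tampering with the next one. After a number of iterations (unfortunately, up to as many as there are closing braces), everything can be sorted out. If anyone is interested, I can finish the macros to do this. If only Knuth had made \meaning end with a dot ...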
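3) The code spends a lot of time 'propagating expansion'. A typical situation is
\somemacro{<long list of benign tokens>}{\string}
where the \string has to be expanded before anything else can happen, so \somemacro spends a lot of time inserting \expandafter's into <long list ...>. Note that \romannumeral will fail if <long list ...> is long, so encoding everything as digits will not help. Using \csname <long ...>\endcsname is possible (together with a single \expandafter), but I would be concerned about polluting TeX's hash table in that case.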
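The cost described in 3) can be seen in a toy case. The sketch below (plain TeX; \tail and \result are hypothetical names used only here) expands a macro that sits behind three ordinary tokens. It takes one \expandafter per token that has to be skipped, so with a long list the number of inserted \expandafter tokens grows with the list.

```tex
\def\tail{Z}
% One \expandafter in front of \def, \result and the brace, then one in
% front of each of a, b, c, so that \tail is expanded before \def acts.
\expandafter\def\expandafter\result\expandafter{%
  \expandafter a\expandafter b\expandafter c\tail}
\message{[\meaning\result]}% expected: [macro:->abcZ]
\bye
```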
The macros try to identify 'funny spaces' in the first pass; that is the only use of \meaning and \yymatchblankspace below. One could get by with \string alone.
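As an illustration of what a 'funny space' is and why \meaning is handy for spotting one, here is a minimal plain TeX sketch. It builds a category 10 token with character code `\- in the same way the test code at the end does; \funnyspace is a hypothetical name used only in this example.

```tex
% Make \uppercase turn the ordinary space into a token with character
% code `\- while keeping category code 10.
\uccode`\ =`\-
\uppercase{\def\funnyspace{ }}
\uccode`\ =0 % restore the plain TeX default

% \meaning reports it as a blank space with its character appended ...
\message{[\expandafter\meaning\funnyspace]}% expected: [blank space -]
% ... whereas \string yields an ordinary category 12 character.
\message{[\expandafter\string\funnyspace]}% expected: [-]
\bye
```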
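A test case for the macros is attached at the end. I apologise if I have overlooked something silly (when Joseph Wright and others have doubts, I tend to have doubts too).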
Edit: apart from whatever else may be wrong with it, I omitted \long in front of every definition for clarity, so \par will break it.
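Expanding on 'provide a better analysis' above: to handle pathological cases (such as \escapechar=-1 \let\#=#), one can prepare a set of macros, one (or even two) per character, for example
\expandafter\def\csname match#\endcsname #1\##{...}% last '#' is \catcode 13
or a few macros, one of which is \def'ed as
\def\maintest #1<a list of all active characters and single letter cs's>{...}
and does all the heavy lifting (by recursively inserting the 'grabbed' token among the potential 'delimiters'). Intermediate options (trading time for space) are possible too. As for 'that is a lot of macros': sure, it is a concern. My (imperfect) take on it is: 'if one can afford that many \catcode registers, one can afford those special "conditionals" too.'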
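The pathological case mentioned above can be checked directly. In this plain TeX sketch (an illustration only, not part of the macros) neither \string nor \meaning can tell the control symbol \# from the character # once \escapechar is -1 and \# has been \let to #, which is why tests based on delimited arguments are needed.

```tex
\begingroup
\escapechar=-1
\let\#=#
% With no escape character, \string\# is just the character #,
% and after the \let both tokens have the same \meaning.
\message{[\string\#] [\meaning\#]}
\message{[\string#] [\meaning#]}
% Both lines are expected to print: [#] [macro parameter character #]
\endgroup
\bye
```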
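I am afraid the expansion propagation issue mentioned above is simply the price of doing recursion in TeX. It can be mitigated somewhat by encoding the tokens in the first pass as \yysx ? where
\def\yysx#1#2{\expandafter\space\expandafter\yysx\expandafter#1\romannumeral-1#2}
This way, a \romannumeral-1 in front of a list of \yysx ? entries 'passes' the expansion to the end of the list while keeping the list intact.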
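A minimal plain TeX sketch of this \yysx idea follows; \tail and \result are hypothetical names, and \tail merely stands in for whatever has to be expanded after the list. Each \yysx re-emits itself and its entry behind a \space that terminates the enclosing \romannumeral, and starts a new \romannumeral-1 on the next item.

```tex
\def\yysx#1#2{\expandafter\space\expandafter\yysx\expandafter#1\romannumeral-1#2}
\def\tail{Z}% stands in for whatever must be expanded after the list
% A single \romannumeral-1 in front of the encoded list reaches \tail.
\expandafter\def\expandafter\result\expandafter{%
  \romannumeral-1\yysx a\yysx b\yysx c\tail}
\message{[\meaning\result]}% expected: [macro:->\yysx a\yysx b\yysx cZ]
\bye
```

Note that the sketch relies on the token after the list not expanding into something that starts with a digit, since the innermost \romannumeral-1 would absorb it.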
The 'brace post-processing' feels like something that should be avoidable.
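Finally, I have been asked 'why no e-TeX?' many times. I am not sure this is the right place to discuss it, but I have (possibly subjective) reasons to avoid it. If someone can suggest a better place to discuss such preferences, I would appreciate it.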
% helper macros (to build test cases, etc); @ is a letter
\catcode`\@=11 % make @ a letter, as the macro names below (and \@ne, \z@) require
\def\yyreplacestring#1\in#2\with#3{%
\expandafter\def\expandafter\r@placestring\expandafter##\expandafter1\the#1##2\end{%
\def\r@placestring{##2}% is this the string at the very end?
\ifx\r@placestring\empty % then it is the one we inserted, report
\errmessage{string <\the#1> not present in \the#2}% do not change the register if the string is not there
\else % remove the extra copy of #1\end at the end
\expandafter#2\expandafter\expandafter\expandafter
{\expandafter\r@plac@string\expandafter{\the#3}{##1}##2\end}%
\fi}% end of \r@placestring definition
\expandafter\def\expandafter\r@plac@string
\expandafter##\expandafter1%
\expandafter##\expandafter2%
\expandafter##\expandafter3%
\the#1\end{##2##1##3}%
\expandafter\expandafter\expandafter\r@placestring\expandafter\the\expandafter#2\the#1\end
}
\newtoks\toksa
\newtoks\toksb
\newtoks\toksc
\newtoks\toksd
\def\yybreak#1#2\yycontinue{\fi#1}
\def\eatone#1{}
\def\eatonespace#1 {}
\def\identity#1{#1}
\def\yyfirstoftwo#1#2{#1}
\def\yysecondoftwo#1#2{#2}
\def\yysecondofthree#1#2#3{#2}
\def\yythirdofthree#1#2#3{#3}
% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence
\def\yypreparsetokensequenc@#1#2#3{%
\yystringempty{#2}{#1{#3}}{\yypreparsetokensequen@@{#1}{#2}{#3}}%
}
\def\yypreparsetokensequen@@#1#2#3{% remaining sequence is nonempty
\yystartsinbrace{#2}{\yydealwithbracedgroup{#1}{#2}{#3}}{\yypreparsetokensequ@n@@{#1}{#2}{#3}}%
}
\def\yydealwithbracedgroup#1#2#3{% the first token of the remaining sequence is a brace
\iffalse{\fi\yydealwithbracedgro@p#2}{#1}{#3}%
}
\def\yydealwithbracedgro@p#1{%
\yypreparsetokensequenc@{\yyrepackagesequence}{#1}{}%
}
% #1 -- parsed sequence
% this is a sequence to `propagate expansion' into the next parameter.
% the same can be achieved by packaging the whole sequence with a
% \csname ... \endcsname pair and using a simple \expandafter
% maybe that would be a better idea ...
\def\yyrepackagesequence#1{%
\yyrepackagesequenc@{}#1\end
}
% #1 -- `packaged' sequence (\expandafter\expandafter\expandafter ? ...)
% #2 -- the next category 12 character or \end
\def\yyrepackagesequenc@#1#2{%
\ifx#2\end
\yybreak{\yyrepackagesequ@nc@{#1\expandafter\expandafter\expandafter}}%
\else
\yybreak{\yyrepackagesequenc@{#1\expandafter\expandafter\expandafter#2}}%
\yycontinue
}
% #1 -- `packaged' sequence (\expandafter\expandafter\expandafter ? ...)
% this macro is followed by the remainder of the original sequence with a so far
% unmatched right brace, the `call stack' and the parsed sequence.
\def\yyrepackagesequ@nc@#1{%
\expandafter\expandafter\expandafter\yyrepackagesequ@nc@swap#1{\expandafter\eatone\string}%
}
% #1 -- parsed sequence without packaging
\def\yyrepackagesequ@nc@swap#1#{%
\yyrepackagesequ@nc@sw@p{#1}%
}
% #1 -- parsed `inner' sequence
% #2 -- remainder of the original sequence
% #3 -- `call stack'
% #4 -- parsed sequence so far
\def\yyrepackagesequ@nc@sw@p#1#2#3#4{%
\yypreparsetokensequenc@{#3}{#2}{#4[#1]}%
}
% `braced group' thread ends here
% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence
\def\yypreparsetokensequ@n@@#1#2#3{% the remaining group in #2 is nonempty and does not start with a brace
\yystartsinspace{#2}{\yyconsumetruespace{#1}{#2}{#3}}{\yypreparsetokenseq@@n@@{#1}{#2}{#3}}%
}
\def\yyconsumetruespace#1#2#3{%
\expandafter\yyconsumetruespac@swap\expandafter{\eatonespace#2}{#1}{#3.}%
}
\def\yyconsumetruespac@swap#1#2#3{%
\yypreparsetokensequenc@{#2}{#1}{#3}%
}
% `group starting with a true (character code 32, category code 10) space' thread ends here
% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence
\def\yypreparsetokenseq@@n@@#1#2#3{% a nonempty group, that does not start with a brace or a true space
\yymatchblankspace{#2}{\yyrescanblankspace{#2}{#1}{#3}}{\yypreparsetokens@q@@n@@{#1}{#2}{#3}}%
}
% #1 -- remaining sequence
% #2 -- `call stack'
% #3 -- `parsed' sequence
\def\yyrescanblankspace#1#2#3{%
\expandafter\expandafter\expandafter
\yyrescanblankspac@swap
\expandafter\expandafter\expandafter{\expandafter\yynormalizeblankspac@\meaning#1}{#2}{#3*}%
}
\def\yyrescanblankspac@swap#1#2#3{%
\yystartsinspace{#1}{%
\expandafter\yyrescanblankspac@sw@p\expandafter{\eatonespace#1}{#2}{#3}%
}{%
\expandafter\yyrescanblankspac@sw@p\expandafter{\eatone#1}{#2}{#3}%
}%
}
\def\yyrescanblankspac@sw@p#1#2#3{%
\yypreparsetokensequenc@{#2}{#1}{#3}%
}
% `group starting with a blank space' ends here
% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence
\def\yypreparsetokens@q@@n@@#1#2#3{% nonempty group starting with a non blank, non brace token
\expandafter\yypreparsetokens@q@@n@@swap\expandafter{\eatone#2}{#1}{#30}%
}
\def\yypreparsetokens@q@@n@@swap#1#2#3{%
\yypreparsetokensequenc@{#2}{#1}{#3}%
}
% #1 -- string of category code 12 or 10 characters
% #2 -- string of category code 12 or 10 characters
\def\yycomparesimplestrings#1#2{%
\yystringempty{#1}{%
\yystringempty{#2}{\yyfirstoftwo}{\yysecondoftwo}%
}{\yycomparesimplestrings@{#1}{#2}}%
}
\def\yycomparesimplestrings@#1#2{% the first string is nonempty
\yystringempty{#2}{\yysecondoftwo}{\yycomparesimplestrings@@{#1}{#2}}%
}
\def\yycomparesimplestrings@@#1#2{% both strings are nonempty
\yystartsinspace{#1}{%
\yystartsinspace{#2}{\yyabsorbfirstspace{#1}{#2}}{\yysecondoftwo}%
}{%
\yystartsinspace{#2}{\yysecondoftwo}{\yyabsorbfirstnonspace{#1}{#2}}%
}% the comment character keeps a stray space token out of the replacement text
}
\def\yyabsorbfirstspace#1#2{%
\expandafter\yyabsorbfirstspac@swap\expandafter{\eatonespace#1}{#2}%
}
\def\yyabsorbfirstspac@swap#1#2{%
\expandafter\yyabsorbfirst@swap\expandafter{\eatonespace#2}{#1}%
}
\def\yyabsorbfirstnonspace#1#2{%
\expandafter\yyabsorbfirstnonspac@swap\expandafter{\eatone#1}{#2}%
}
\def\yyabsorbfirstnonspac@swap#1#2{%
\expandafter\yyabsorbfirst@swap\expandafter{\eatone#2}{#1}%
}
\def\yyabsorbfirst@swap#1#2{%
\yycomparesimplestrings{#2}{#1}%
}
% `compare strings of category code 12' thread ends here
% #1 -- remaining parsed sequence
% #2 -- analysed sequence
\def\yyanalysetokens@#1#2{%
\yystringempty{#1}{{#2}}%
{\yyanalysetok@ns@#1\end{#2}}%
}
\def\yyanalysetok@ns@#1#2\end{%
\ifx#1.%
\expandafter\yyfirstoftwo
\else
\expandafter\yysecondoftwo
\fi
{\yygrabablank{#2}}%
{%
\ifx#1[% not a space, an opening brace
\expandafter\yyfirstoftwo
\else
\expandafter\yysecondoftwo
\fi
{%
\yydisableobrace{#2}%
}{%
\ifx#1]% not a space, a closing brace
\expandafter\yyfirstoftwo
\else
\expandafter\yysecondoftwo
\fi
{%
\yydisablecbrace{#2}%
}{% neither space nor brace
\yygrabtokenraw{#2}%
}%
}%
}%
}
% #1 -- remaining parsed sequence
% #2 -- analysed sequence
% #3 -- next token
\def\yygrabtokenraw#1#2#3{%
\expandafter\yyanalysetokens@swap\expandafter{\meaning#3}{#1}{#2}%
}
\def\yyanalysetokens@swap#1#2#3{%
\yyanalysetokens@{#2}{#3t#1e}%
}
\def\yygrabablank#1#2 {%
\yyanalysetokens@{#1}{#2s0e}%
}
% #1 -- remaining parsed sequence
% #2 -- analysed sequence
\def\yydisablecbrace#1#2{%
\yydisablecbrac@{}#1\relax#2\end
}
\def\yydisablecbrac@#1#2{%
\ifx#2\end
\yybreak{\yydisablecbrac@@{#1\expandafter\expandafter\expandafter}}%
\else
\yybreak{\yydisablecbrac@{#1\expandafter\expandafter\expandafter#2}}%
\yycontinue
}
\def\yydisablecbrac@@#1{%
\expandafter\expandafter\expandafter
\yydisablecbrace@@@#1\end
\expandafter\expandafter\expandafter
{\iffalse}\fi\string
}
\def\yydisablecbrace@@@#1\relax#2\end#3{%
\yystartsinspace{#3}%
{\expandafter\yyanalysetok@nsswap\expandafter{\eatonespace#3}{#1}{#2c1e}}%
{\expandafter\yyanalysetok@nsswap\expandafter{\eatone#3}{#1}{#2c2e}}%
}
\def\yyanalysetok@nsswap#1#2#3{%
\iffalse{\fi\yyanalysetokens@{#2}{#3}#1}%
}
% #1 -- remaining parsed sequence
% #2 -- analysed sequence
\def\yydisableobrace#1#2{%
\yydisableobrac@{}#1\relax#2\end
}
\def\yydisableobrac@#1#2{%
\ifx#2\end
\yybreak{\yydisableobrac@@{#1\expandafter\expandafter\expandafter}}%
\else
\yybreak{\yydisableobrac@{#1\expandafter\expandafter\expandafter#2}}%
\yycontinue
}
\def\yydisableobrac@@#1{%
\expandafter\expandafter\expandafter
\yydisableobrace@@@#1\end
\expandafter\expandafter\expandafter
{\iffalse}\fi\string
}
\def\yydisableobrace@@@#1\relax#2\end#3{%
\yystartsinspace{#3}%
{\expandafter\yyanalysetok@nsswap\expandafter{\eatonespace#3}{#1}{#2o1e}}%
{\expandafter\yyanalysetok@nsswap\expandafter{\eatone#3}{#1}{#2o2e}}%
}
\uccode`\ =`\-
% \dotspace expands into a character code `\-, category code 10 token (funny space)
\uppercase{\def\dotspace{ }}
\toksa\expandafter\expandafter\expandafter{\expandafter\meaning\dotspace}
\toksb{-}
\toksc{#2}
\toksd\toksa
\yyreplacestring\toksb\in\toksa\with\toksc
\toksc{}
\yyreplacestring\toksb\in\toksd\with\toksc
\expandafter\def\expandafter\yymatchblankspac@\expandafter#\expandafter1\the\toksd{%
\yystringempty{#1}{\expandafter\yysecondofthree\expandafter{\string}}%
{\expandafter\yythirdofthree\expandafter{\string}}%
}
\edef\yymatchblankspace#1{% is it \catcode 10 token?
\noexpand\iffalse{\noexpand\fi
\noexpand\expandafter
\noexpand\yymatchblankspac@
\noexpand\meaning#1\the\toksd}%
}
% the idea behind the sequence below is that a leading character of category code 10
% is replaced either by a character of category code 10 and character code 32 or a character
% of category code 12 and character code other than 32
% note that while it is tempting to replace the definition below by something that ends in
% ... blank space #2{ ... with the hope of absorbing the result of \meaning in one step,
% this will not give the desired result in case of an active character,
% say, `~' that had been \let to the normal blank space
\expandafter\def\expandafter\yynormalizeblankspac@\expandafter#\expandafter1\the\toksd{}
\def\yystartsinspace#1{% is it \charcode 32, \catcode 10 token?
\iffalse{\fi\yystartsinspac@#1 }%
}
\def\yystartsinspac@#1 {%
\yystringempty{#1}{\expandafter\yysecondofthree\expandafter{\string}}{\expandafter\yythirdofthree\expandafter{\string}}%
}
\def\yystartsinbrace#1{%
\iffalse{{\fi
\if!\yytoks@mpty#1}}!%
\expandafter\yysecondoftwo
\else
\expandafter\yyfirstoftwo
\fi
}
\def\yystringempty#1{%
\iffalse{{{\fi
\ifcase\yytoks@mpty#1}}\@ne}\z@
\expandafter\yyfirstoftwo
\else
\expandafter\yysecondoftwo
\fi
}
\def\yytoks@mpty{%
\expandafter\eatone\expandafter{\expandafter{%
\ifcase\expandafter1\expandafter}\expandafter}\expandafter\fi\string
}
%% test code begins here
%\tracingmacros=3
%\tracingonline=3
\catcode`\ =13\relax%
\def\actspace{ }%
\catcode`\ =10\relax%
\catcode`\.=13\relax%
\def\actdotspace{.}%
\catcode`\.=12\relax%
\edef\makefunkydotspace{\let\expandafter\noexpand\actdotspace= \dotspace}
\edef\makefunkyspace{\let\expandafter\noexpand\actspace= \space}
\makefunkyspace
\makefunkydotspace
\catcode`\<=1
\catcode`\>=2
\uccode`\<=32
\uccode`\>=32
% inside the following sequence, < and > will become braces with character code 32 (space),
% \actspace will expand into an active character with character code 32, that has been \let to a
% character code 32, category code 10 token (space)
\uppercase{\edef\temptest{{ } \space\space\dotspace\expandafter\noexpand\actspace\expandafter\noexpand\actdotspace{<> {{}{{ u o l k kk
\end\noexpand\fi\noexpand\else\noexpand\iffalse{}} }}}}}
%\uppercase{\edef\temptest{\dotspace E <>}}
\show\temptest
\def\displaypreparse#1{%
\expandafter\errmessage\expandafter{\romannumeral-1\yypreparsetokensequenc@{\yyanalysetokens@}{#1}{}{}#1}%
}
\expandafter\displaypreparse\expandafter{\temptest}
\end