How to grab everything up to the next '[' or ']'


Given the following LaTeX source:

\mymacro blablabla[ lorem ipsum
\mymacro blablabla[ lorem ipsum ] etc
\mymacro blablabla] lorem ipsum
\mymacro blablabla] lorem ipsum [ etc

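How would I define \mymacro with 2 parameters such that #1 is blablabla and #2 is '[' in the first two cases and ']' in the last two?

Note that splitting this into two different macros (one for '[' and another for ']') is not an option. That would be too easy.

Note: the example has been updated to reflect more closely the question asked in the title.

The question may look a bit odd because this is an unusual use of LaTeX. The original problem was importing text material created by a third-party tool into a LaTeX document.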

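Answer 1

The OP asks for a rather odd syntax with ungrouped arguments. Here I use a token cycle to produce the desired output. The macro \mymacroaux is where you specify what to do with the arguments; here I merely echo them, so that one can see they were digested properly.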

\documentclass{article}
\usepackage{tokcycle,txfonts}
\makeatletter\let\gobble\@gobble\makeatother
\Characterdirective{%
  \aftertokcycle{\expandafter\mymacroaux\expandafter{\the\cytoks}{#1}}%
  \tctestifx{]#1}{\expandafter\endtokcycraw\gobble}{%
    \tctestifx{[#1}{\expandafter\endtokcycraw\gobble}{\addcytoks{#1}}}%
}
\def\mymacroaux#1#2{(\#1 is ``#1'' and \#2 is ``#2'')}
\let\mymacro\tokencyclexpress
\begin{document}
\mymacro blablabla[ lorem ipsum

\mymacro blablabla[ lorem ipsum ] etc

\mymacro blablabla] lorem ipsum

\mymacro blablabla] lorem ipsum [ etc
\end{document}
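To illustrate the remark that \mymacroaux is the place to decide what happens with the two arguments, here is a minimal sketch in which only the handler differs from the listing above. The handler body (bold text plus an italic note) is purely an assumption for illustration and is not part of the answer; the rest is copied from it, with txfonts dropped since it only changes the fonts.

```latex
\documentclass{article}
\usepackage{tokcycle}
\makeatletter\let\gobble\@gobble\makeatother
\Characterdirective{%
  \aftertokcycle{\expandafter\mymacroaux\expandafter{\the\cytoks}{#1}}%
  \tctestifx{]#1}{\expandafter\endtokcycraw\gobble}{%
    \tctestifx{[#1}{\expandafter\endtokcycraw\gobble}{\addcytoks{#1}}}%
}
% Illustrative handler (an assumption, not from the answer):
% #1 = tokens collected before the bracket
% #2 = the bracket that ended the token cycle
\def\mymacroaux#1#2{\textbf{#1}\textit{ (stopped at #2)}}
\let\mymacro\tokencyclexpress
\begin{document}
\mymacro blablabla[ lorem ipsum

\mymacro blablabla] lorem ipsum
\end{document}
```

The text after the bracket stays in the input stream and is typeset normally once the handler has finished.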


Supplement

Here I try to make it more user-friendly to write your own token-trapping routine with tokcycle.

I introduce a macro

\abortiftokenis{<test token>}{<command if not test token>}

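which can be nested in order to screen for several tokens, such as [ and ]. One must be familiar with the tokcycle approach, in which the tokens of the input stream are shunted to one of four directives, as a Character, Group, Macro, or Space. Thus, trapping a macro (command) must happen in the \Macrodirective, a space in the \Spacedirective, and ordinary characters in the \Characterdirective. The \Groupdirective is set to pass its content through, since one cannot properly jump out of the token cycle while immersed in a group.

The macro \aborttokcycle is defined separately, in case one wishes to exit the token cycle for a reason other than a matched token. As a result, the directives in this new version are much more streamlined.

In this MWE, I branch to the handler \mymacroaux when a [, a ], \today, or a space is found in the top-level content (but not inside a group). Recall that this handler takes two arguments: the tokens leading up to the trapped token, and the trapped token that caused the token cycle to exit.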

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{tokcycle,txfonts}
\makeatletter
\def\aborttokcycle{\expandafter\endtokcycraw\@gobble}
\def\abortiftokenis#1{%
  \aftertokcycle{\expandafter\mymacroaux\expandafter{\the\cytoks}{#1}}%
  \tctestifx{\tc@next#1}{\aborttokcycle}%
}
\makeatother
\Characterdirective{\abortiftokenis{]}{\abortiftokenis{[}{\addcytoks{#1}}}}
\Groupdirective{\addcytoks{#1}}
\Macrodirective{\abortiftokenis{\today}{\addcytoks{#1}}}
\Spacedirective{\abortiftokenis{ }{\addcytoks{#1}}}
\def\mymacroaux#1#2{(\#1 is ``\detokenize{#1}'' and \#2 is ``\detokenize{#2}'')}
\let\mymacro\tokencyclexpress
\begin{document}
\mymacro b{la b}labla[ lorem ipsum

\mymacro blab\relax labla] lorem ipsum ] etc

\mymacro blab\today labla] lorem ipsum [ etc

\mymacro blabla bla] lorem ipsum
\end{document}


Answer 2

With an up-to-date LaTeX system you can use \peek_regex_replace_once:nn

\documentclass{article}

\ExplSyntaxOn
\NewDocumentCommand{\mymacro}{}
 {
  \peek_regex_replace_once:nn { ([^\[\]]*) (\[|\]) } { \c{innermymacro}\cB\{\1\cE\}\2 }
 }
\ExplSyntaxOff

\newcommand{\innermymacro}[2]{%
  First argument: ``#1''; second argument: ``#2''%
}


\begin{document}

\mymacro blablabla[ lorem ipsum

\mymacro bla bla bla[ lorem ipsum ] etc

\mymacro blablabla] lorem ipsum

\mymacro bla bla bla] lorem ipsum [ etc

\end{document}

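Explanation:

  1. The regular expression ([^\[\]]*) (\[|\]) looks for all tokens up to the first [ or ] and saves what it finds as \1 (the tokens before the bracket) and \2 (the bracket itself);

  2. The replacement text is \c{innermymacro}\cB\{\1\cE\}\2, which means "insert \innermymacro, an opening brace {, the tokens denoted by \1, a closing brace }, and the token denoted by \2" (in this case a single token, either [ or ]);

  3. Processing then restarts from \innermymacro, which can use its two arguments.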
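
To see what the two capture groups contain without peeking into the input stream, the same regular expression can be applied to a braced argument. This is only a sketch of a different technique (\regex_extract_once:nnNTF on an explicit argument rather than \peek_regex_replace_once:nn); the names \splitatbracket and \l_demo_match_seq are made up for the example.

```latex
\documentclass{article}

\ExplSyntaxOn
% Hypothetical names, introduced only for this sketch
\seq_new:N \l_demo_match_seq
\NewDocumentCommand{\splitatbracket}{m}
 {
  % Item 1 of the sequence is the whole match,
  % items 2 and 3 are the capture groups \1 and \2
  \regex_extract_once:nnNTF { ([^\[\]]*) (\[|\]) } { #1 } \l_demo_match_seq
   {
    First~argument:~``\seq_item:Nn \l_demo_match_seq { 2 }'';~
    second~argument:~``\seq_item:Nn \l_demo_match_seq { 3 }''
   }
   { No~square~bracket~found }
 }
\ExplSyntaxOff

\begin{document}

\splitatbracket{blablabla[ lorem ipsum ] etc}

\splitatbracket{bla bla bla] lorem ipsum [ etc}

\splitatbracket{no bracket here}

\end{document}
```

Unlike the \peek_regex variant, this form needs the text in braces, which the question's syntax does not provide; it is meant only to make the roles of \1 and \2 visible.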


Answer 3

Maybe I have interpreted the question correctly, maybe not, but at least this may help to clarify it.

\documentclass{article}
\makeatletter
\def\mymacro#1]{\edef\mymacro@tmp{\noexpand\in@{[}{#1}}%
\mymacro@tmp
\ifin@
\expandafter\mymacro@i#1%
\else
\#1=#1,\#2=]%
\fi}
\def\mymacro@i#1[#2]{\#1=#1,\#2=[}
\makeatother
\begin{document}
\mymacro blablabla[ lorem ipsum ] etc

\mymacro blablabla] lorem ipsum [ etc
\end{document}
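
The listing above relies on the LaTeX kernel's \in@ test, which sets \ifin@ to true when its first argument occurs in its second. As a minimal sketch of that test in isolation (the macro name \testforbracket is hypothetical):

```latex
\documentclass{article}
\makeatletter
% Hypothetical helper: reports whether the argument contains a [
\newcommand\testforbracket[1]{%
  \in@{[}{#1}% sets \ifin@ according to whether [ occurs in #1
  \ifin@
    ``#1'' contains a left bracket.%
  \else
    ``#1'' contains no left bracket.%
  \fi}
\makeatother
\begin{document}
\testforbracket{blablabla[ lorem ipsum }

\testforbracket{blablabla}
\end{document}
```

Note that \mymacro in the answer is delimited by ], so as written it expects a ] somewhere after \mymacro; input containing only [ would not be handled.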


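Answer 4

I can offer a macro

\SplitAtSquareBracketAndPassToMacro{⟨macro A⟩}{⟨macro B⟩}{⟨tokens⟩}
which works as follows:

If ⟨tokens⟩ does not contain any square bracket of category code 12 (other) that is not nested in curly braces, then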

⟨macro B⟩{⟨tokens⟩}

is delivered.

If ⟨tokens⟩ does contain at least one square bracket of category code 12 (other) that is not nested in curly braces, then

⟨macro A⟩{⟨tokens before first square bracket⟩}{⟨square bracket⟩}{⟨tokens behind first square bracket⟩}

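is delivered.

First, in a \romannumeral-expansion-driven tail-recursive loop, the ⟨remaining tokens⟩ argument is tested for

  • being empty,
  • or having a leading [ or ] of category code 12,
  • or having a leading explicit space token.

If it is empty, ⟨macro B⟩{⟨tokens⟩} is delivered.

If it has a leading [ or ] of category code 12, then a macro that processes a [-delimited or, respectively, a ]-delimited argument is applied to ⟨tokens⟩ to perform the split.

If it has a leading explicit space token, that token is removed and another loop iteration is performed.

If none of these is the case, one undelimited argument is removed and another loop iteration is performed.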
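
The splitting step itself rests on TeX's delimited arguments. As a stripped-down sketch of that one idea, without the loop and without the expandability of the full code, the following assumes the input is known to contain a top-level [. The names \demosplit, \demo@splitleft and \demo@stop are made up for the example.

```latex
\documentclass{article}
\makeatletter
% #1 = everything before the first top-level [
% #2 = everything after it, up to the end marker \demo@stop
\long\def\demo@splitleft#1[#2\demo@stop{%
  (before: #1) (bracket: [) (after: #2)}
% Hypothetical user-level wrapper: appends the end marker
\newcommand\demosplit[1]{\demo@splitleft#1\demo@stop}
\makeatother
\begin{document}
\demosplit{AB[CDE}

% A [ inside braces is not seen as a delimiter:
\demosplit{{A[B}C[DE}
\end{document}
```

Two caveats of this simplification: a missing [ leads to a runaway-argument error, and a first part consisting of a single brace group, as in {AB}[CDE, loses its braces. The full code guards against the latter by prepending a dot (.#2) and removing it again with \UD@firstoftwo{}.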

You can pass \mymacro as the ⟨macro A⟩ argument to \SplitAtSquareBracketAndPassToMacro.

\documentclass{article}
\makeatletter
%%=============================================================================
%% Paraphernalia:
%%    \UD@firstoftwo, \UD@secondoftwo,
%%    \UD@PassFirstToSecond, \UD@Exchange, \UD@removespace
%%    \UD@CheckWhetherNull, \UD@CheckWhetherLeadingTokens, 
%%    \UD@ExtractFirstArgLoop
%%=============================================================================
\newcommand\UD@firstoftwo[2]{#1}%
\newcommand\UD@secondoftwo[2]{#2}%
\newcommand\UD@PassFirstToSecond[2]{#2{#1}}%
\newcommand\UD@Exchange[2]{#2#1}%
\newcommand\UD@removespace{}\UD@firstoftwo{\def\UD@removespace}{} {}%
%%-----------------------------------------------------------------------------
%% Check whether argument is empty:
%%.............................................................................
%% \UD@CheckWhetherNull{<Argument which is to be checked>}%
%%                     {<Tokens to be delivered in case that argument
%%                       which is to be checked is empty>}%
%%                     {<Tokens to be delivered in case that argument
%%                       which is to be checked is not empty>}%
%%
%% The gist of this macro comes from Robert R. Schneck's \ifempty-macro:
%% <https://groups.google.com/forum/#!original/comp.text.tex/kuOEIQIrElc/lUg37FmhA74J>
\newcommand\UD@CheckWhetherNull[1]{%
  \romannumeral\expandafter\UD@secondoftwo\string{\expandafter
  \UD@secondoftwo\expandafter{\expandafter{\string#1}\expandafter
  \UD@secondoftwo\string}\expandafter\UD@firstoftwo\expandafter{\expandafter
  \UD@secondoftwo\string}\expandafter\z@\UD@secondoftwo}%
  {\expandafter\z@\UD@firstoftwo}%
}%
%%-----------------------------------------------------------------------------
%% Check whether argument's leading tokens form a specific 
%% token-sequence that does neither contain explicit character tokens of 
%% category code 1 or 2 nor contain tokens of category code 6:
%%.............................................................................
%% \UD@CheckWhetherLeadingTokens{<argument which is to be checked>}%
%%                              {<a <token sequence> without explicit 
%%                                character tokens of category code
%%                                1 or 2 and without tokens of
%%                                category code 6>}%
%%                              {<internal token-check-macro>}%
%%                              {<tokens to be delivered in case
%%                                <argument which is to be checked> has
%%                                <token sequence> as leading tokens>}%
%%                              {<tokens to be delivered in case 
%%                                <argument which is to be checked>
%%                                does not have <token sequence> as
%%                                leading tokens>}%
\newcommand\UD@CheckWhetherLeadingTokens[3]{%
  \romannumeral\UD@CheckWhetherNull{#1}{\expandafter\z@\UD@secondoftwo}{%
    \expandafter\UD@secondoftwo\string{\expandafter
    \UD@@CheckWhetherLeadingTokens#3{\relax}#1#2}{}}%
}%
\newcommand\UD@@CheckWhetherLeadingTokens[1]{%
  \expandafter\UD@CheckWhetherNull\expandafter{\UD@firstoftwo{}#1}%
  {\UD@Exchange{\UD@firstoftwo}}{\UD@Exchange{\UD@secondoftwo}}%
  {\expandafter\expandafter\expandafter\expandafter
   \expandafter\expandafter\expandafter\z@\expandafter\expandafter
   \expandafter}\expandafter\UD@secondoftwo\expandafter{\string}%
}%
%%-----------------------------------------------------------------------------
%% Extract first inner undelimited argument:
%%
%%   \romannumeral\UD@ExtractFirstArgLoop{ABCDE\UD@SelDOm} yields  {A}
%%
%%   \romannumeral\UD@ExtractFirstArgLoop{{AB}CDE\UD@SelDOm} yields  {AB}
%%.............................................................................
\@ifdefinable\UD@RemoveTillUD@SelDOm{%
  \long\def\UD@RemoveTillUD@SelDOm#1#2\UD@SelDOm{{#1}}%
}%
\newcommand\UD@ExtractFirstArgLoop[1]{%
  \expandafter\UD@CheckWhetherNull\expandafter{\UD@firstoftwo{}#1}%
  {\z@#1}%
  {\expandafter\UD@ExtractFirstArgLoop\expandafter{\UD@RemoveTillUD@SelDOm#1}}%
}%
%%-----------------------------------------------------------------------------
%% \UD@internaltokencheckdefiner{<internal token-check-macro>}%
%%                              {<token sequence>}%
%% Defines <internal token-check-macro> to snap everything 
%% until reaching <token sequence>-sequence and spit that out
%% nested in braces.
%%-----------------------------------------------------------------------------
\newcommand\UD@internaltokencheckdefiner[2]{%
  \@ifdefinable#1{\long\def#1##1#2{{##1}}}%
}%
%%=============================================================================
%% Supplementary macros for \SplitAtSquareBracketAndPassToMacro
%%=============================================================================
\UD@internaltokencheckdefiner{\UD@InternalExplicitSpaceCheckMacro}{ }%
\UD@internaltokencheckdefiner{\UD@InternalLeftSquaeBracketCheckMacro}{[}%
\UD@internaltokencheckdefiner{\UD@InternalRightSquaeBracketCheckMacro}{]}%
\@ifdefinable\UD@SplitAtLeftSquareBracket{%
  \long\def\UD@SplitAtLeftSquareBracket#1[{\expandafter\z@\expandafter{\UD@firstoftwo{}#1}{[}}%
}%
\@ifdefinable\UD@SplitAtRightSquareBracket{%
  \long\def\UD@SplitAtRightSquareBracket#1]{\expandafter\z@\expandafter{\UD@firstoftwo{}#1}{]}}%
}%
\newcommand\UD@SplitAtSquareBracket[3]{%
  \expandafter\UD@PassFirstToSecond\expandafter{%
     \romannumeral
     \expandafter\expandafter\expandafter\z@\expandafter\UD@firstoftwo\expandafter{\expandafter}%
     \romannumeral
     \expandafter\expandafter\expandafter\z@\expandafter\UD@firstoftwo\expandafter{\expandafter}%
     \romannumeral#3#1%
  }{%
    \expandafter\UD@PassFirstToSecond
    \romannumeral\expandafter\expandafter\expandafter\UD@ExtractFirstArgLoop
                 \expandafter\expandafter\expandafter{%
                 \expandafter\UD@firstoftwo\expandafter{\expandafter}%
                 \romannumeral#3#1\UD@SelDOm}{%
      \expandafter\UD@PassFirstToSecond
      \romannumeral\expandafter\UD@ExtractFirstArgLoop\expandafter{%
                   \romannumeral#3#1\UD@SelDOm}{%
        \z@#2% 
      }%
    }%
  }%
}%
\newcommand\SplitAtSquareBracketAndPassToMacro[3]{%
  \romannumeral\UD@SplitAtSquareBracketAndPassToMacroLoop{#3}{#3}{#1}{#2}%
}%
\newcommand\UD@SplitAtSquareBracketAndPassToMacroLoop[4]{%
  % #1 = <remaining tokens>
  % #2 = <tokens>
  % #3 = <macro A>
  % #4 = <macro B>
  \UD@CheckWhetherNull{#1}{\z@#4{#2}}{%
    \UD@CheckWhetherLeadingTokens{#1}{ }{\UD@InternalExplicitSpaceCheckMacro}{%
      \expandafter\UD@SplitAtSquareBracketAndPassToMacroLoop\expandafter{\UD@removespace#1}{#2}{#3}{#4}%
    }{%
      \UD@CheckWhetherLeadingTokens{#1}{[}{\UD@InternalLeftSquaeBracketCheckMacro}{%
         \UD@SplitAtSquareBracket{.#2}{#3}{\UD@SplitAtLeftSquareBracket}%
      }{%
        \UD@CheckWhetherLeadingTokens{#1}{]}{\UD@InternalRightSquaeBracketCheckMacro}{%
          \UD@SplitAtSquareBracket{.#2}{#3}{\UD@SplitAtRightSquareBracket}%
        }{%
          \expandafter\UD@SplitAtSquareBracketAndPassToMacroLoop\expandafter{\UD@firstoftwo{}#1}{#2}{#3}{#4}%
        }%
      }%
    }%
  }%
}%
\makeatother

%%=============================================================================
%% \mymacro{<tokens 1>}{<tokens 2>}{<tokens 3>}
%% When arguments are passed to \mymacro from
%% \SplitAtSquareBracketAndPassToMacro, then
%% - <tokens 1> are the tokens before the first [ or ], respectively.
%% - <tokens 2> is either [ or ] .
%% - <tokens 3> are the tokens behind the first [ or ], respectively.
%%=============================================================================
\newcommand\mymacro[3]{%
  \noindent
  \scantokens\expandafter\expandafter\expandafter{%
             \expandafter\string\expandafter\verb\expandafter|\string\mymacro|:%
  }%
  Argument 1 is: \scantokens\expandafter{\string\verb|#1|.}%
  Argument 2 is: \scantokens\expandafter{\string\verb|#2|.}%
  Argument 3 is: \scantokens\expandafter{\string\verb|#3|.}%
 }%
%%=============================================================================
%% macro in case there was no square bracket
%%=============================================================================
\newcommand\nosquarebracketsmacro[1]{%
  \noindent
  \scantokens\expandafter\expandafter\expandafter{%
             \expandafter\string\expandafter\verb\expandafter|%
             \string\nosquarebracketsmacro|:%
  }%
  Argument is: \scantokens\expandafter{\string\verb|#1|.}%
}%

\parindent=0ex

\begin{document}

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB]CDE}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB]CDE}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB[CDE}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB[CDE}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{ABCDE}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{ABCDE}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}]CDE}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}]CDE}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}[CDE}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}[CDE}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}CDE}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}CDE}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB]{CDE}}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB]{CDE}}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB[{CDE}}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB[{CDE}}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB{CDE}}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{AB{CDE}}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}]{CDE}}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}]{CDE}}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}[{CDE}}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}[{CDE}}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}{CDE}}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB}{CDE}}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB]CDE}}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB]CDE}}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB[CDE}}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{AB[CDE}}

\vfill

\verb|\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{ABCDE}}|:\\
\SplitAtSquareBracketAndPassToMacro{\mymacro}{\nosquarebracketsmacro}{{ABCDE}}

\end{document}


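With respect to the task, or rather the macro-writing challenge, in the question, I admit that I cheated a bit:

By defining \SplitAtSquareBracketAndPassToMacro so that the tokens to split are passed to it as a macro argument, I avoided the need to "grab" these tokens directly from the token stream.

I did this because, in principle, tokens can be "taken out" of the token stream only as macro arguments.
For example, an undelimited macro argument consisting of several tokens is nested in a pair of tokens made up of an explicit character token of category code 1 (begin group) and an explicit character token of category code 2 (end group). This pair of tokens forms the so-called argument group.
Usually argument groups are formed by the character tokens { of category code 1 and } of category code 2, but it is not ruled out that, for some obscure reason, the category code regime is different and other characters with these category codes are in use.
I do not know a reliable way of taking undelimited macro arguments out of the token stream such that the braces, that is, the category code 1 and category code 2 character tokens forming the argument groups of nested undelimited arguments, are fully preserved instead of being replaced by some "hard-coded" pair of explicit character tokens of category code 1 and 2.
You can have TeX "look" at the category code of the next token in the token stream via \futurelet and \ifcat. But firstly, which character token of category code 1 denotes the start of an argument group cannot be found out just by "looking", and secondly, the category code 2 character token denoting the end of the argument group is not the next token in the token stream…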
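
As a hedged sketch of what "looking" via \futurelet and \ifcat amounts to (the names \peekdemo, \demo@next and \demo@check are made up for the example):

```latex
\documentclass{article}
\makeatletter
% \futurelet makes \demo@next equal to the token that follows
% \peekdemo in the input, without removing it, then runs \demo@check
\newcommand\peekdemo{\futurelet\demo@next\demo@check}
\newcommand\demo@check{%
  % Compare the category code of the peeked token with that of \bgroup
  % (an implicit character token of category code 1)
  \ifcat\noexpand\demo@next\bgroup
    (next token has category code 1)%
  \else
    (next token does not have category code 1)%
  \fi}
\makeatother
\begin{document}
\peekdemo{abc}

\peekdemo xyz
\end{document}
```

The test only reveals the category code: it yields the same answer for {, for \bgroup, and for any other character that happens to have category code 1, which is the limitation described above.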
