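I want to write a macro that splits the characters of a given string according to certain conditions. If I have a string such as abcdef<ghi>jkl<mn>opqrs, I want to access each character as a string for further use in another command, except that ghi should be a single string, and likewise mn should be a single string. Can you help me achieve this?
Sorry, I added the following to the question. I thought a general answer would also work for Devanagari characters under XeLaTeX, but apparently it does not, hence this edit.
My input string will consist of Devanagari characters, and I will then run XeLaTeX. For example, my string has the form सा{परेग}नी{धम}पनी
Also, I obtain this string by operating on another string with the xstring package. I perform the following extraction:
\StrBetween{(सा{परे}न{धम}पनी)[नी{धम}पनी]}{(}{)}[\firststring]
Now I want to use \firststring as the input.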
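Answer 1
Here I use tokcycle to process the tokens of the raw input, adding a , after each character token unless it lies inside a <...> grouping. I can then use listofitems to read this comma-separated list, storing each item in an array \mystring. The array can be accessed by item index.
If you need the actual tokens of an array element, the \mystring[...] call has to be expanded twice.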
\documentclass{article}
\usepackage{listofitems,tokcycle}
\newcounter{nestlevel}
\Characterdirective{\ifx<#1\stepcounter{nestlevel}\else
\ifx>#1\addtocounter{nestlevel}{-1}\addcytoks{,}\else
\addcytoks{#1}\ifnum\thenestlevel=0\relax\addcytoks{,}\fi\fi\fi}
\ignoreemptyitems
\gdef\getmystring#1{%
\setcounter{nestlevel}{0}%
\tokcyclexpress{#1}%
\expandafter\readlist\expandafter\mystring\expandafter{\the\cytoks}}
\begin{document}
\getmystring{abcdef<ghi>jkl<mn>opqrs}
The number of strings is \mystringlen.
\mystring[3], \mystring[6], \mystring[7], \mystring[11], \mystring[16]
\end{document}
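The double expansion mentioned above can be sketched as follows. This is only an illustration: the list is read directly with \readlist, standing in for what the tokcycle pass produces, and the macro name \seventh is a hypothetical choice that does not appear in the answer. The \meaning line is there so you can check what actually got stored.

```latex
\documentclass{article}
\usepackage{listofitems}
\begin{document}
% Comma-separated list of the kind the tokcycle pass builds
\readlist\mystring{a,b,c,d,e,f,ghi,j,k,l,mn,o,p,q,r,s}
% Hit \mystring[7] with \expandafter twice before \def sees it,
% so that \seventh (hypothetical name) receives the item's tokens
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\seventh
\expandafter\expandafter\expandafter{\mystring[7]}
\texttt{\meaning\seventh}
\end{document}
```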
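If brace grouping {...} is allowed in the input stream instead of angle-bracket grouping <...>, the construction becomes somewhat simpler while giving the same result: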
\documentclass{article}
\usepackage{listofitems,tokcycle}
\stripgroupingtrue
\ignoreemptyitems
\newcommand\getmystring[1]{%
\tokcycle{\addcytoks{##1,}}{\addcytoks{##1,}}{}{}{#1}%
\expandafter\readlist\expandafter\mystring\expandafter{\the\cytoks}}
\begin{document}
\getmystring{abcdef{ghi}jkl{mn}opqrs}
The number of strings is \mystringlen.
\mystring[3], \mystring[6], \mystring[7], \mystring[11], \mystring[16]
\end{document}
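Once the list is in \mystring, each item can also be handed to another command in a loop instead of being picked by index. The sketch below assumes the \foreachitem loop from listofitems, with \x holding the current item and \xcnt its position. \showitem is a hypothetical consumer macro made up for this illustration, and the list is again read directly rather than produced by tokcycle.

```latex
\documentclass{article}
\usepackage{listofitems}
% Hypothetical consumer: #1 = position, #2 = item
\newcommand\showitem[2]{Item #1 is [#2]\par}
\begin{document}
\readlist\mystring{a,b,c,d,e,f,ghi,j,k,l,mn,o,p,q,r,s}
% \x is the current item, \xcnt its index in the list
\foreachitem\x\in\mystring{\showitem{\xcnt}{\x}}
\end{document}
```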
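Supplement
The OP commented that he would like to use this approach with the input supplied not directly but via a \def'ed string. Here I provide \getmydefstring to show how that can be done: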
\documentclass{article}
\usepackage{listofitems,tokcycle}
\stripgroupingtrue
\ignoreemptyitems
\newcommand\getmystring[1]{%
\tokcycle{\addcytoks{##1,}}{\addcytoks{##1,}}{}{}{#1}%
\expandafter\readlist\expandafter\mystring\expandafter{\the\cytoks}}
\newcommand\getmydefstring[1]{%
\def\tmp{\tokcycle{\addcytoks{####1,}}{\addcytoks{####1,}}{}{}}
\expandafter\tmp\expandafter{#1}%
\expandafter\readlist\expandafter\mystring\expandafter{\the\cytoks}}
\begin{document}
\def\mystring{abcdef{ghi}jkl{mn}opqrs}
\getmydefstring{\mystring}
The number of strings is \mystringlen.
\mystring[3], \mystring[6], \mystring[7], \mystring[11], \mystring[16]
\end{document}
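One thing to watch in the listing above: \mystring serves both as the \def'ed input and as the name of the resulting list, so the input macro is overwritten by \readlist. If the input is needed again later, a separate name avoids that. The following is a sketch under that assumption; \myinput is a hypothetical name, and the only other change is a % after the \def\tmp line to suppress a stray space.

```latex
\documentclass{article}
\usepackage{listofitems,tokcycle}
\stripgroupingtrue
\ignoreemptyitems
\newcommand\getmydefstring[1]{%
  \def\tmp{\tokcycle{\addcytoks{####1,}}{\addcytoks{####1,}}{}{}}%
  \expandafter\tmp\expandafter{#1}%
  \expandafter\readlist\expandafter\mystring\expandafter{\the\cytoks}}
\begin{document}
% \myinput (hypothetical name) keeps its definition, because the
% parsed list is stored separately in \mystring
\def\myinput{abcdef{ghi}jkl{mn}opqrs}
\getmydefstring{\myinput}
\mystring[7] comes from \texttt{\meaning\myinput}
\end{document}
```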
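Answer 2
If you don't mind processing the string as a list of undelimited arguments or explicit space tokens, I can offer a macro \ExtractKthArgOrSpace.
Note that an undelimited argument
- either consists of a single token that is neither an explicit space token nor an explicit character token of category code 1 (begin group), e.g., {, or 2 (end group), e.g., },
- or consists of a set of brace-balanced tokens nested in curly braces.
Therefore, use { and } instead of < and >.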
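To make the notion of an undelimited argument concrete, here is a small sketch. \showthree is a hypothetical three-argument macro invented for this illustration; it only shows how TeX picks up single tokens and brace groups, and that spaces in front of an undelimited argument are skipped (which is why \ExtractKthArgOrSpace treats spaces separately).

```latex
\documentclass{article}
% Hypothetical helper: prints its three undelimited arguments in brackets
\newcommand\showthree[3]{[#1] [#2] [#3]}
\begin{document}
% three single-token arguments
\showthree abc

% the brace group {bcd} is taken as one argument, braces removed
\showthree a{bcd}e

% spaces preceding an undelimited argument are skipped by TeX
\showthree a b{c d}
\end{document}
```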
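When utf8-encoded .tex input is processed via the inputenc package on a traditional TeX engine (where the internal character encoding scheme is 8-bit ASCII), Unicode characters, or substrings containing Unicode characters, need to be nested in curly braces. This is because on such systems a single Unicode character in the .tex input is encoded as a sequence of bytes (8-bit characters) and thus yields a sequence of several character tokens, the first of which is active and "looks at the subsequent character tokens to decide which glyph to deliver to the output file".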
\makeatletter
%% Code for \ExtractKthArgOrSpace
%%=============================================================================
%% Paraphernalia:
%% \UD@firstoftwo, \UD@secondoftwo, \UD@PassFirstToSecond, \UD@Exchange,
%% \UD@CheckWhetherNull
%%=============================================================================
\newcommand\UD@firstoftwo[2]{#1}%
\newcommand\UD@secondoftwo[2]{#2}%
\newcommand\UD@PassFirstToSecond[2]{#2{#1}}%
\newcommand\UD@Exchange[2]{#2#1}%
%%-----------------------------------------------------------------------------
%% Check whether argument is empty:
%%.............................................................................
%% \UD@CheckWhetherNull{<Argument which is to be checked>}%
%% {<Tokens to be delivered in case that argument
%% which is to be checked is empty>}%
%% {<Tokens to be delivered in case that argument
%% which is to be checked is not empty>}%
%%
%% The gist of this macro comes from Robert R. Schneck's \ifempty-macro:
%% <https://groups.google.com/forum/#!original/comp.text.tex/kuOEIQIrElc/lUg37FmhA74J>
\newcommand\UD@CheckWhetherNull[1]{%
\romannumeral0\expandafter\UD@secondoftwo\string{\expandafter
\UD@secondoftwo\expandafter{\expandafter{\string#1}\expandafter
\UD@secondoftwo\string}\expandafter\UD@firstoftwo\expandafter{\expandafter
\UD@secondoftwo\string}\expandafter\expandafter\UD@firstoftwo{ }{}%
\UD@secondoftwo}{\expandafter\expandafter\UD@firstoftwo{ }{}\UD@firstoftwo}%
}%
%%=============================================================================
%% Check whether brace-balanced argument's first token is a space-token
%%=============================================================================
%% \UD@CheckWhetherLeadingSpace{<Argument which is to be checked>}%
%% {<Tokens to be delivered in case <argument
%% which is to be checked>'s 1st token is a
%% space-token>}%
%% {<Tokens to be delivered in case <argument
%% which is to be checked>'s 1st token is not
%% a space-token>}%
\newcommand\UD@CheckWhetherLeadingSpace[1]{%
\romannumeral0\UD@CheckWhetherNull{#1}%
{\UD@firstoftwo\expandafter{} \UD@secondoftwo}%
{\expandafter\UD@secondoftwo\string{\UD@@CheckWhetherLeadingSpace.#1 }{}}%
}%
\@ifdefinable\UD@@CheckWhetherLeadingSpace{%
\long\def\UD@@CheckWhetherLeadingSpace#1 {%
\expandafter\UD@CheckWhetherNull\expandafter{\UD@secondoftwo#1{}}%
{\UD@Exchange{\UD@firstoftwo}}{\UD@Exchange{\UD@secondoftwo}}%
{\UD@Exchange{ }{\expandafter\expandafter\expandafter\expandafter
\expandafter\expandafter\expandafter}\expandafter\expandafter
\expandafter}\expandafter\UD@secondoftwo\expandafter{\string}%
}%
}%
%%=============================================================================
%% Remove a space-token
%%=============================================================================
\@ifdefinable\UD@gobblespace{\UD@firstoftwo{\def\UD@gobblespace}{} {}}%
%%=============================================================================
%% Extract K-th element of list of undelimited arguments or spaces:
%%
%% \ExtractKthArgOrSpace{<integer K>}{<list of undelimited args or spaces>}
%%
%% In case there is no K-th element in <list of undelimited args or spaces> :
%% Does not deliver any token.
%% In case there is a K-th element in <list of undelimited args or spaces> :
%% Does deliver that K-th element with one level of surrounding curly
%% braces removed if present.
%%
%% Examples:
%%
%% \ExtractKthArgOrSpace{0}{ABCDE} yields: <nothing>
%%
%% \ExtractKthArgOrSpace{3}{ABCDE} yields: C
%%
%% \ExtractKthArgOrSpace{3}{AB{CD}E} yields: CD
%%
%% \ExtractKthArgOrSpace{3}{AB DE} yields: <space token>
%%
%% \ExtractKthArgOrSpace{2}{{AB} {DE}F} yields: <space token>
%%
%% \ExtractKthArgOrSpace{4}{{001}{002}{003}{004}{005}} yields: 004
%%
%% \ExtractKthArgOrSpace{6}{{001}{002}{003}} yields: <nothing>
%%
%% Due to \romannumeral0-expansion the result is delivered after two expansion-
%% steps/after two "hits" by \expandafter.
%%
%%=============================================================================
\newcommand\ExtractKthArgOrSpace[1]{%
\romannumeral0%
% #1: <integer number K>
\expandafter\UD@ExtractKthArgOrSpaceCheck
\expandafter{\romannumeral\number\number#1 000}%
}%
\newcommand\UD@ExtractKthArgOrSpaceCheck[2]{%
\UD@CheckWhetherNull{#1}{ }{%
\expandafter\UD@ExtractKthArgOrSpaceLoop\expandafter{\UD@firstoftwo{}#1}{#2}%
}%
}%
\newcommand\UD@ExtractKthArgOrSpaceLoop[2]{%
\UD@CheckWhetherNull{#2}{ }{%
\UD@CheckWhetherNull{#1}{%
\UD@CheckWhetherLeadingSpace{#2}{%
\UD@ExtractFirstArgLoop{{ }#2\UD@SelDOm}%
}{%
\UD@ExtractFirstArgLoop{#2\UD@SelDOm}%
}%
}{%
\UD@CheckWhetherLeadingSpace{#2}{%
\expandafter\UD@PassFirstToSecond\expandafter{\UD@gobblespace#2}%
}{%
\expandafter\UD@PassFirstToSecond\expandafter{\UD@firstoftwo{}#2}%
}%
{\expandafter\UD@ExtractKthArgOrSpaceLoop\expandafter{\UD@firstoftwo{}#1}}%
}%
}%
}%
\newcommand\UD@RemoveTillUD@SelDOm{}%
\long\def\UD@RemoveTillUD@SelDOm#1#2\UD@SelDOm{{#1}}%
\newcommand\UD@ExtractFirstArgLoop[1]{%
\expandafter\UD@CheckWhetherNull\expandafter{\UD@firstoftwo{}#1}%
{\UD@firstoftwo{\expandafter}{} \UD@secondoftwo{}#1}%
{\expandafter\UD@ExtractFirstArgLoop\expandafter{\UD@RemoveTillUD@SelDOm#1}}%
}%
%% End of code for \ExtractKthArgOrSpace.
\makeatother
\documentclass{article}
\begin{document}
\noindent The list of undelimited arguments or space-tokens is: \verb|abcdef{ghi}jkl{mn}opqr s|
\bigskip
\noindent
\verb|\ExtractKthArgOrSpace{1}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{1}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{2}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{2}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{3}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{3}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{4}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{4}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{5}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{5}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{6}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{6}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{7}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{7}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{8}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{8}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{9}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{9}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{10}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{10}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{11}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{11}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{12}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{12}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{13}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{13}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{14}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{14}{abcdef{ghi}jkl{mn}opqr s}\\
\verb|\ExtractKthArgOrSpace{15}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{15}{abcdef{ghi}jkl{mn}opqr s}\\
Let's put the following into parentheses to make the space visible:\\
\verb|(\ExtractKthArgOrSpace{16}{abcdef{ghi}jkl{mn}opqr s})|: (\ExtractKthArgOrSpace{16}{abcdef{ghi}jkl{mn}opqr s})\\
\verb|\ExtractKthArgOrSpace{17}{abcdef{ghi}jkl{mn}opqr s}|: \ExtractKthArgOrSpace{17}{abcdef{ghi}jkl{mn}opqr s}
\bigskip
\noindent Space-tokens are taken into account, but be aware that with \LaTeX's tokenizer
consecutive spaces appearing in the .tex-input-file usually collapse into a single explicit space-token
if not skipped completely due to appearing behind something that got tokenized as control-word-token:
\bigskip
\noindent
\verb|(\ExtractKthArgOrSpace{1}{ })|: (\ExtractKthArgOrSpace{1}{ })\\
\verb|(\ExtractKthArgOrSpace{1}{ abc})|: (\ExtractKthArgOrSpace{1}{ abc})\\
\verb|(\ExtractKthArgOrSpace{1}{{ }abc})|: (\ExtractKthArgOrSpace{1}{{ }abc})\\
\verb|(\ExtractKthArgOrSpace{3}{ab c})|: (\ExtractKthArgOrSpace{3}{ab c})\\
\verb|(\ExtractKthArgOrSpace{3}{ab{ }c})|: (\ExtractKthArgOrSpace{3}{ab{ }c})
\bigskip
\noindent
Now let's define a macro after hitting \verb|\ExtractKthArgOrSpace| with \verb|\expandafter| twice:
\begin{verbatim}
\expandafter\expandafter
\expandafter \newcommand
\expandafter\expandafter
\expandafter \test
\expandafter\expandafter
\expandafter {%
\ExtractKthArgOrSpace{7}{abcdef{ghi}jkl{mn}opqr s}%
}%
\texttt{\meaning\test}
\end{verbatim}
\expandafter\expandafter
\expandafter \newcommand
\expandafter\expandafter
\expandafter \test
\expandafter\expandafter
\expandafter {%
\ExtractKthArgOrSpace{7}{abcdef{ghi}jkl{mn}opqr s}%
}%
\noindent
\texttt{\meaning\test}
\end{document}
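Answer 3
In the comments you indicated that using {} instead of <> is fine.
This makes it easy, because LaTeX already has such a loop, so you don't need a package, just a command that defines what to do in each iteration. Here I surround each item with [] and end the paragraph each time.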
\documentclass{article}
\makeatletter
\newcommand\myloop[1]{\@tfor\zzz:=#1\do{[\zzz]\par}}
\makeatother
\begin{document}
\myloop{abcdef{ghi}jkl{mn}opqrs}
\end{document}
Or, if you want to save the items, just modify the command in the loop:
\documentclass{article}
\makeatletter
\newcount\mycount
\newcommand\myloop[1]{%
\mycount=0 %
\@tfor\zzz:= #1\do{%
\advance\mycount 1 %
\expandafter\edef\csname zzz\the\mycount\endcsname{\zzz}%
}}
\newcommand\myuse[1]{\csname zzz#1\endcsname}
\makeatother
\begin{document}
\myloop{abcdef{ghi}jkl{mn}opqrs}
7th item is \myuse{7}
11th item is \myuse{11}
13th item is \myuse{13}
\end{document}
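The loop above stores each item with \edef, so the item's contents are fully expanded when they are saved. If an item may contain macros that should not be expanded at that point, one possible variation is to copy \zzz with \let instead. This is only a sketch of that idea, not part of the answer; the test input with \textbf is made up for illustration.

```latex
\documentclass{article}
\makeatletter
\newcount\mycount
\newcommand\myloop[1]{%
  \mycount=0
  \@tfor\zzz:=#1\do{%
    \advance\mycount 1
    % copy the current item as is, without expanding its contents
    \expandafter\let\csname zzz\the\mycount\endcsname\zzz
  }}
\newcommand\myuse[1]{\csname zzz#1\endcsname}
\makeatother
\begin{document}
\myloop{ab{\textbf{cd}}e}
Third item: \myuse{3}
\end{document}
```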
Answer 4
It's a two-liner with expl3.
\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn
\NewExpandableDocumentCommand{\getstringitem}{mm}
{
\tl_item:en { #1 } { #2 }
}
\cs_generate_variant:Nn \tl_item:nn { e }
\ExplSyntaxOff
\begin{document}
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{1},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{2},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{3},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{4},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{5},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{6},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{7},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{8},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{9},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{10},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{11},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{12},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{13},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{14},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{15},
\getstringitem{abcdef{ghi}jkl{mn}opqrs}{16}
\getstringitem{\getstringitem{abcdef{ghi}jkl{mn}opqrs}{7}}{2} should be h
\newcommand{\mystring}{a{bcde}{klm}yz}
\getstringitem{\mystring}{1},
\getstringitem{\mystring}{2},
\getstringitem{\mystring}{3},
\getstringitem{\mystring}{4},
\getstringitem{\mystring}{5}
\end{document}
Note that you can even nest calls and use strings stored in macros.
You can get the last element with \getstringitem{\mystring}{-1} (of course, this also works with an explicit string).
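Along the same lines, the items can be counted or passed one by one to another command. The sketch below assumes the expl3 functions \tl_count:n and \tl_map_function:nN with e-type variants generated locally. The command names \countstringitems and \mapstringitems and the consumer \showitem are hypothetical and not part of the answer.

```latex
\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn
% e-type variants so that a macro holding the string is expanded first
\cs_generate_variant:Nn \tl_count:n { e }
\cs_generate_variant:Nn \tl_map_function:nN { e }
% Hypothetical command: number of items in the string
\NewExpandableDocumentCommand{\countstringitems}{m}
 {
  \tl_count:e { #1 }
 }
% Hypothetical command: apply the one-argument macro #2 to each item
\NewDocumentCommand{\mapstringitems}{mm}
 {
  \tl_map_function:eN { #1 } #2
 }
\ExplSyntaxOff
% Hypothetical consumer of a single item
\newcommand{\showitem}[1]{[#1]}
\begin{document}
\newcommand{\mystring}{a{bcde}{klm}yz}
\countstringitems{\mystring} items: \mapstringitems{\mystring}{\showitem}
\end{document}
```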