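For linguistic writing, I often work with input data in which sequences of uppercase letters should not be typeset as capitals but should get special treatment, for example being typeset as small capitals. Ideally, I would like a package that provides a macro \morphemize, which scans a string and applies a user-defined macro \morpheme to every run of more than one uppercase letter in it. The idea is the following: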
\morphemize{ pro.3PERS.FEM be.PAST-HABIT here}
should evaluate to:
pro.3\morpheme{PERS}.\morpheme{FEM} be.\morpheme{PAST}-\morpheme{HABIT} here
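As far as I know, nothing like this exists yet, so I would have to write the macro myself. I would be grateful if someone could provide something along these lines.
Answer 1
Disclaimer: I am not a linguist. As I understand it, only strings of multiple uppercase letters (A-Z) should be treated differently. Everything else (single uppercase letters, a-z, punctuation, etc.) should be left unchanged.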
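As requested, the user interface below consists of the \morphemize and \morpheme commands. In addition, I added an \xformupper command, which is applied to each uppercase letter within a string of uppercase letters (this allows \lowercase to be used while maintaining expandability).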
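As a small illustration of how these two hooks might be customized, the following sketch assumes the definitions from the solution further down are already loaded. The choice of \textbf and of an identity \xformupper is only an example, not part of the original answer.

```latex
% Assumes the preamble definitions from the solution are already loaded.
% Typeset each run of capitals in bold instead of small capitals:
\renewcommand{\morpheme}[1]{\textbf{#1}}
% Keep the capitals as typed instead of lowercasing them:
\renewcommand{\xformupper}[1]{#1}
% With these settings the call below is typeset like
% be.\textbf{PAST}-\textbf{HABIT}
\morphemize{be.PAST-HABIT}
```

Because \morphemize only decides where \morpheme is applied, any change of appearance can stay in these two commands.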
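The basic flow of \morphemize is to first split the argument at spaces using \spacesplit@morphemize and then pass each chunk to \@morphemize, which identifies the sequences of uppercase letters. The splitting is necessary to preserve the spaces.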
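The splitting step can be followed by hand. The trace below is a sketch for the hypothetical input \morphemize{ a b}, written as comments and omitting the conditional bookkeeping.

```latex
% \morphemize{ a b}
%   -> \spacesplit@morphemize<space>a b<space>\nil\unskip
% 1st call: #1 is empty (the input starts with a space), #2 = "a b "
%   -> \space \spacesplit@morphemize a b \nil
% 2nd call: #1 = "a", #2 = "b "
%   -> \@morphemize a\nil \space \spacesplit@morphemize b \nil
% 3rd call: #1 = "b", #2 is empty
%   -> \@morphemize b\nil
% Result: <space>a<space>b, so the leading space is kept.
% The final \unskip only removes the glue from the space that
% \morphemize itself appended when the input ends in a space.
```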
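Within \@morphemize, the \is@upper command is used to test for uppercase letters. If two uppercase letters are found in a row, \@stringofuppers is used to obtain the string of uppercase letters that is passed to \morpheme for processing. \@stringoflowers then skips over that string of uppercase letters and processes the remaining content with \@morphemize.
Solution (the `' quotes are added to show that leading and trailing spaces are preserved):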
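To make this flow concrete, here is a hand trace for the chunk PAST-HABIT from the question's example. It is written as comments, and the leftover \fi tokens are omitted for readability.

```latex
% \@morphemize PAST-HABIT\nil
%   P and the following A are both uppercase, so this becomes
%   \morpheme{\@stringofuppers PAST-HABIT\nil}\@stringoflowers PAST-HABIT\nil
%
% \@stringofuppers PAST-HABIT\nil
%   -> \xformupper{P}\xformupper{A}\xformupper{S}\xformupper{T}
%      (stops at "-", the first character that is not uppercase)
%
% \@stringoflowers PAST-HABIT\nil
%   -> skips P, A, S, T and hands the rest back:
%      \@morphemize -HABIT\nil
%      which passes "-" through and treats HABIT in the same way.
%
% Net result:
%   \morpheme{\xformupper{P}\xformupper{A}\xformupper{S}\xformupper{T}}-%
%   \morpheme{\xformupper{H}\xformupper{A}\xformupper{B}\xformupper{I}\xformupper{T}}
```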
\documentclass{article}%Answer for this question: http://tex.stackexchange.com/q/298719/89497
\makeatletter
%document-level command to apply \morpheme to strings of uppercase characters (wrapper for \@morphemize)
\newcommand{\morphemize}[1]{\spacesplit@morphemize#1 \nil\unskip}
%document-level command applied to strings of uppercase characters
\newcommand{\morpheme}[1]{\textsc{#1}}
%document-level command applied to individual uppercase characters in strings of uppercase characters
\newcommand{\xformupper}[1]{\lowercase{#1}}
%Recursive command to split at spaces (otherwise they are lost)
\def\spacesplit@morphemize#1 #2\nil{%
\if\relax\detokenize{#1}\relax%preceding space => evaluate #2
\space%
\if\relax\detokenize{#2}\relax%#2 is empty=> do nothing
\else
\spacesplit@morphemize#2\nil%
\fi
\else%
\@morphemize#1\nil%
\if\relax\detokenize{#2}\relax%#2 is empty...do nothing
\else
\space\spacesplit@morphemize#2\nil%
\fi
\fi}
%Recursive command to parse a string (without spaces) based on sequences of uppercase characters
\def\@morphemize#1#2\nil{%
\ifnum\is@upper#1\nil=1\relax%uppercase => treat differently
\if\relax\detokenize{#2}\relax%#2 is empty => treat as a single capital (pass through)
#1%
\else%#2 is not empty => eval if next is capital
\ifnum\is@upper#2\nil=1\relax%next char is uppercase => string of uppercases
\morpheme{\@stringofuppers#1#2\nil}%
\@stringoflowers#1#2\nil%
\else%next char is not uppercase => pass through unchanged
#1\@morphemize#2\nil%
\fi
\fi
\else%lowercase => pass through
#1%
\if\relax\detokenize{#2}\relax%#2 is empty => nothing left to parse
\else%#2 is not empty => more to parse
\@morphemize#2\nil%
\fi
\fi}
%Command to return 1 if the first character is uppercase, 0 if other
\def\is@upper#1#2\nil{%
\ifx#1A 1\else
\ifx#1B 1\else
\ifx#1C 1\else
\ifx#1D 1\else
\ifx#1E 1\else
\ifx#1F 1\else
\ifx#1G 1\else
\ifx#1H 1\else
\ifx#1I 1\else
\ifx#1J 1\else
\ifx#1K 1\else
\ifx#1L 1\else
\ifx#1M 1\else
\ifx#1N 1\else
\ifx#1O 1\else
\ifx#1P 1\else
\ifx#1Q 1\else
\ifx#1R 1\else
\ifx#1S 1\else
\ifx#1T 1\else
\ifx#1U 1\else
\ifx#1V 1\else
\ifx#1W 1\else
\ifx#1X 1\else
\ifx#1Y 1\else
\ifx#1Z 1\else
0%
\fi\fi\fi\fi\fi\fi\fi\fi\fi\fi
\fi\fi\fi\fi\fi\fi\fi\fi\fi\fi
\fi\fi\fi\fi\fi\fi}
%Recursive command expanding to the string of uppercase characters at the beginning of #1#2
\def\@stringofuppers#1#2\nil{%
\ifnum\is@upper#1\nil=1\relax%uppercase => carry on
\if\relax\detokenize{#2}\relax%#2 is empty => this is the last character of the string
\xformupper{#1}%
\else%#2 not empty => evaluate next char
\xformupper{#1}\@stringofuppers#2\nil%
\fi
\else\fi}%lowercase => do nothing
%Recursive command expanding to the string of lowercase characters after a string of upper case characters at the beginning of #1#2
%process the remaining contents with \@morphemize
\def\@stringoflowers#1#2\nil{%
\ifnum\is@upper#1\nil=1\relax%uppercase=>carry on
\if\relax\detokenize{#2}\relax%#2 is empty => the whole content was all caps...return nothing
\else%#2 not empty => evaluate next char
\@stringoflowers#2\nil%
\fi
\else%lowercase =>process with \@morphemize
\@morphemize#1#2\nil%
\fi}
\makeatother
\begin{document}
\noindent\verb|\morphemize{ pro.3PERS.FEM be.PAST-HABIT here}|
`\morphemize{ pro.3PERS.FEM be.PAST-HABIT here}'
\noindent\verb|\morphemize{ singleCapiTaL.LETERS here}|
`\morphemize{ singleCapiTaL.LETERS here}'
\noindent\verb|\morphemize{with.A PO.ST-space }|
`\morphemize{with.A PO.ST-space }'
\end{document}
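I am sure this could be tidied up a bit (in hindsight, expl3 would have been a better choice). Also, it should be expandable, depending on the user-defined \morpheme and \xformupper.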
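For comparison, here is a minimal sketch of what an expl3 version might look like, using the regular-expression module (l3regex) instead of a hand-written parser. It is not the method used above: the scratch variable \l_morph_input_tl is a hypothetical name, \MakeLowercase stands in for \xformupper, and a LaTeX kernel from 2020-10 or later is assumed so that \ExplSyntaxOn and \NewDocumentCommand are available without extra packages. Unlike the solution above, this variant is not expandable, because regex replacement works by assignment.

```latex
\documentclass{article}
% User-level formatting hook, as in the answer above;
% \MakeLowercase takes over the role of \xformupper here.
\newcommand{\morpheme}[1]{\textsc{\MakeLowercase{#1}}}
\ExplSyntaxOn
\tl_new:N \l_morph_input_tl % hypothetical scratch variable
\NewDocumentCommand{\morphemize}{m}
  {
    \tl_set:Nn \l_morph_input_tl {#1}
    % wrap every run of two or more letters A-Z in \morpheme{...}
    \regex_replace_all:nnN { [A-Z]{2,} } { \c{morpheme}\cB\{\0\cE\} } \l_morph_input_tl
    \tl_use:N \l_morph_input_tl
  }
\ExplSyntaxOff
\begin{document}
`\morphemize{ pro.3PERS.FEM be.PAST-HABIT here}'
\end{document}
```

The pattern [A-Z]{2,} encodes the same rule as the parser above: single capitals, lowercase letters, digits and punctuation are left alone, and only runs of at least two capitals are passed to \morpheme.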