Package for special treatment of sequences of uppercase letters


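For linguistics writing, I frequently deal with input data in which sequences of uppercase letters should not be typeset as uppercase letters but should receive special treatment, for example being typeset in small caps. Ideally, I would like a package that provides a macro \morphemize, which scans a string and applies a user-defined macro \morpheme to every substring in it that consists of more than one uppercase letter. The idea is that this: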

\morphemize{ pro.3PERS.FEM be.PAST-HABIT here}

should evaluate to:

pro.3\morpheme{PERS}.\morpheme{FEM} be.\morpheme{PAST}-\morpheme{HABIT} here

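As far as I know, nothing like this exists yet, so I would have to write the macro myself. If anyone can provide something along these lines, I would be very grateful.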

Answer 1

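Disclaimer: I am not a linguist. My understanding is simply that runs of multiple uppercase letters (A-Z) should be treated differently, and that everything else (single uppercase letters, a-z, punctuation, and so on) should be left unchanged.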

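As requested, the user interface below consists of the commands \morphemize and \morpheme. In addition, I have added the command \xformupper, which is applied to each individual uppercase letter inside a run of uppercase letters (this allows \lowercase to be used while keeping the parsing expandable).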
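To illustrate how the two hooks might be customized, here is a small preamble fragment. It is only a sketch and assumes the definitions from the solution in this answer are already loaded; the choice of bold is arbitrary.

```latex
% Preamble fragment: place it after the \makeatletter ... \makeatother block of the solution.
% Typeset each run of uppercase letters in bold instead of small caps.
\renewcommand{\morpheme}[1]{\textbf{#1}}
% Leave every uppercase letter inside such a run exactly as typed.
\renewcommand{\xformupper}[1]{#1}
```

With these redefinitions, \morphemize{be.PAST-HABIT} should come out as be.\textbf{PAST}-\textbf{HABIT}, with the letters still in uppercase.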

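The basic flow of \morphemize is to first split its argument at spaces using \spacesplit@morphemize, and then to pass each chunk to \@morphemize, which recognizes the runs of uppercase letters. The split is necessary because the spaces would otherwise be lost when the characters are read one at a time as undelimited arguments.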
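The space-splitting step can be shown in isolation. The following is a minimal sketch, not part of the solution: \demo@split and \bracketwords are hypothetical names, and the sketch only puts each space-separated chunk in brackets instead of parsing it.

```latex
\documentclass{article}
\makeatletter
% Hypothetical helper: #1 is everything up to the first space,
% #2 is the rest up to the \nil sentinel.
\def\demo@split#1 #2\nil{%
    [#1]% "process" the current chunk
    \if\relax\detokenize{#2}\relax% #2 is empty => nothing left
    \else% #2 is not empty => reinsert the space and recurse
        \space\demo@split#2\nil
    \fi}
% The trailing space before \nil guarantees that the delimiter is always found.
\newcommand{\bracketwords}[1]{\demo@split#1 \nil}
\makeatother
\begin{document}
\bracketwords{be.PAST-HABIT here}
\end{document}
```

The solution uses the same pattern, with the extra case of an empty #1 to deal with a leading space.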

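Inside \@morphemize, the command \is@upper is used to test for uppercase letters. If two uppercase letters are found in a row, \@stringofuppers is used to extract the run of uppercase letters, which is passed to \morpheme for processing. \@stringoflowers then skips over that run and processes whatever remains with \@morphemize.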

Solution (the `' quotes are added to show that leading and trailing spaces are preserved):

\documentclass{article}%Answer for this question: http://tex.stackexchange.com/q/298719/89497
\makeatletter
    %document-level command to apply \morpheme to strings of uppercase characters (wrapper for \@morphemize)
    \newcommand{\morphemize}[1]{\spacesplit@morphemize#1 \nil\unskip}
    %document-level command applied to strings of uppercase characters
    \newcommand{\morpheme}[1]{\textsc{#1}}
    %document-level command applied to individual uppercase characters in strings of uppercase characters
    \newcommand{\xformupper}[1]{\lowercase{#1}}

    %Recursive command to split at spaces (otherwise they are lost)
    \def\spacesplit@morphemize#1 #2\nil{%
        \if\relax\detokenize{#1}\relax%preceding space => eval the #2
            \space%
            \if\relax\detokenize{#2}\relax%#2 is empty=> do nothing
            \else
                \spacesplit@morphemize#2\nil%
            \fi
        \else%
            \@morphemize#1\nil%
            \if\relax\detokenize{#2}\relax%#2 is empty...do nothing
            \else
                \space\spacesplit@morphemize#2\nil%
            \fi
        \fi}

    %Recursive command to parse a string (without spaces) based on sequences of uppercase characters
    \def\@morphemize#1#2\nil{%
        \ifnum\is@upper#1\nil=1\relax%uppercase => treat differently
            \if\relax\detokenize{#2}\relax%#2 is empty => treat as a single capital (pass through)
                #1%
            \else%#2 is not empty => eval if next is capital
                \ifnum\is@upper#2\nil=1\relax%next char is uppercase => string of uppercases
                    \morpheme{\@stringofuppers#1#2\nil}%
                    \@stringoflowers#1#2\nil%
                \else%next char is not uppercase => pass through unchanged
                    #1\@morphemize#2\nil%
                \fi
            \fi
        \else%lowercase => pass through
            #1%
            \if\relax\detokenize{#2}\relax%#2 is empty => nothing left to parse
            \else%#2 is not empty => more to parse
                \@morphemize#2\nil%
            \fi
        \fi}

    %Command to return 1 if the first character is uppercase, 0 if other
    \def\is@upper#1#2\nil{%
        \ifx#1A 1\else
        \ifx#1B 1\else
        \ifx#1C 1\else
        \ifx#1D 1\else
        \ifx#1E 1\else
        \ifx#1F 1\else
        \ifx#1G 1\else
        \ifx#1H 1\else
        \ifx#1I 1\else
        \ifx#1J 1\else
        \ifx#1K 1\else
        \ifx#1L 1\else
        \ifx#1M 1\else
        \ifx#1N 1\else
        \ifx#1O 1\else
        \ifx#1P 1\else
        \ifx#1Q 1\else
        \ifx#1R 1\else
        \ifx#1S 1\else
        \ifx#1T 1\else
        \ifx#1U 1\else
        \ifx#1V 1\else
        \ifx#1W 1\else
        \ifx#1X 1\else
        \ifx#1Y 1\else
        \ifx#1Z 1\else
        0%
        \fi\fi\fi\fi\fi\fi\fi\fi\fi\fi
        \fi\fi\fi\fi\fi\fi\fi\fi\fi\fi
        \fi\fi\fi\fi\fi\fi}

    %Recursive command expanding to the string of uppercase characters at the beginning of #1#2
    \def\@stringofuppers#1#2\nil{%
        \ifnum\is@upper#1\nil=1\relax%uppercase => carry on
            \if\relax\detokenize{#2}\relax%#2 is empty => last character, nothing more to read
                \xformupper{#1}%
            \else%#2 not empty => evaluate next char
                \xformupper{#1}\@stringofuppers#2\nil%
            \fi
        \else\fi}%lowercase => do nothing

    %Recursive command expanding to the string of lowercase characters after a string of upper case characters at the beginning of #1#2
    %process the remaining contents with \@morphemize
    \def\@stringoflowers#1#2\nil{%
        \ifnum\is@upper#1\nil=1\relax%uppercase=>carry on
            \if\relax\detokenize{#2}\relax%#2 is empty => the whole content was all caps...return nothing
            \else%#2 not empty => evaluate next char
                \@stringoflowers#2\nil%
            \fi
        \else%lowercase =>process with \@morphemize
            \@morphemize#1#2\nil%
        \fi}
\makeatother
\begin{document}
\noindent\verb|\morphemize{ pro.3PERS.FEM be.PAST-HABIT here}|

`\morphemize{ pro.3PERS.FEM be.PAST-HABIT here}'

\noindent\verb|\morphemize{ singleCapiTaL.LETERS here}|

`\morphemize{ singleCapiTaL.LETERS here}'

\noindent\verb|\morphemize{with.A PO.ST-space }|

`\morphemize{with.A PO.ST-space }'
\end{document}
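
The 26-branch \ifx chain in \is@upper could probably be replaced by a test against TeX's case tables. The following is only a sketch of that idea: \is@upper@alt and \showupper are hypothetical names, the test assumes that #1 is a single character token (a control sequence would raise an error, whereas the \ifx version tolerates it), and it has only been reasoned through for ASCII input.

```latex
\documentclass{article}
\makeatletter
% Hypothetical alternative to \is@upper: expands to 1 if #1 is an uppercase
% letter according to \uccode/\lccode, and to 0 otherwise.
\def\is@upper@alt#1#2\nil{%
    \ifnum\uccode`#1=`#1 % #1 is its own uppercase form ...
        \ifnum\lccode`#1=`#1 0\else 1\fi% ... and has a different lowercase form
    \else
        0%
    \fi}
% Hypothetical helper that prints the character and the test result.
\newcommand{\showupper}[1]{\texttt{#1}: \is@upper@alt#1\nil}
\makeatother
\begin{document}
\showupper{A}, \showupper{a}, \showupper{3}, \showupper{.}
\end{document}
```

Digits and punctuation have \uccode 0 in the default tables, so they fail the first test, while lowercase letters fail it because their \uccode points to the uppercase form.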


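I am sure it could be tidied up a bit (in hindsight, expl3 would have been a better choice). Also, it should be expandable, depending on how the user defines \morpheme and \xformupper.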
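
For comparison, here is a rough sketch of what an expl3 version might look like, using the regular-expression module. It is not the code from this answer: \morphemizeregex and \l__morph_input_tl are hypothetical names, a LaTeX release of 2020-10 or later is assumed (so that expl3 and \NewDocumentCommand are available without loading packages), and this variant is not expandable.

```latex
\documentclass{article}
% Same role as in the answer: applied to every run of uppercase letters.
\newcommand{\morpheme}[1]{\textsc{\MakeLowercase{#1}}}
\ExplSyntaxOn
\tl_new:N \l__morph_input_tl
\NewDocumentCommand \morphemizeregex { m }
  {
    \tl_set:Nn \l__morph_input_tl { #1 }
    % Wrap every run of two or more uppercase ASCII letters in \morpheme{...}.
    \regex_replace_all:nnN
      { [A-Z]{2,} }
      { \c{morpheme} \cB\{ \0 \cE\} }
      \l__morph_input_tl
    \tl_use:N \l__morph_input_tl
  }
\ExplSyntaxOff
\begin{document}
`\morphemizeregex{ pro.3PERS.FEM be.PAST-HABIT here}'
\end{document}
```

Because the argument is stored in a token list before the replacement, spaces are kept without a separate splitting step.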
