Extract the first letter of each word, including letters that follow special characters such as a hyphen


This question builds on this answer.

I found that a letter is dropped when it comes right after a hyphen, as the following MWE shows:

\documentclass{article}
\usepackage{readarray}
\usepackage{ifthen}
\newcounter{index}\setcounter{index}{0}
\def\firstletters#1{%
  \getargsC{#1}%
  \whiledo{\theindex<\narg}{%
    \stepcounter{index}%
    \edef\nextword{\csname arg\romannumeral\theindex\endcsname}%
    \expandafter\getfirst\nextword\relax%
  }%
}
\def\getfirst#1#2\relax{#1}
\begin{document}
\firstletters{This is a test of the Emergency Broadcast System. This-Test. for sample. This T.}
\end{document}


Answer 1

The datatool package provides \DTLinitials. For example:

\documentclass{article}

\usepackage{datatool-base}

\begin{document}

\DTLinitials{This is a test of the Emergency Broadcast System.
This-Test. for sample. This T.}

\end{document}

T.i.a.t.o.t.E.B.S.T.-T.f.s.T.T.

This automatically inserts a period after each initial, but you can prevent that by redefining \DTLafterinitials, \DTLbetweeninitials and \DTLafterinitialbeforehyphen:

\documentclass{article}

\usepackage{datatool-base}

\renewcommand*{\DTLbetweeninitials}{}
\renewcommand*{\DTLafterinitials}{}
\renewcommand*{\DTLafterinitialbeforehyphen}{}

\begin{document}

\DTLinitials{This is a test of the Emergency Broadcast System.
This-Test. for sample. This T.}

\end{document}

TiatotEBST-TfsTT

If you need the initials in an expandable context, first use \DTLstoreinitials, which saves the initials in the command given as its second argument:

\DTLstoreinitials{This is a test of the Emergency Broadcast System.
This-Test. for sample. This T.}{\initials}

\initials

Edit: if you also want to remove the hyphen from the initials, just redefine \DTLinitialhyphen to do nothing:

\renewcommand*{\DTLinitialhyphen}{}
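
Putting the pieces together, here is a minimal sketch that combines the four redefinitions shown in this answer. It uses no commands beyond those above; the shortened input string is only for illustration.

\documentclass{article}

\usepackage{datatool-base}

% No punctuation between or after the initials
\renewcommand*{\DTLbetweeninitials}{}
\renewcommand*{\DTLafterinitials}{}
\renewcommand*{\DTLafterinitialbeforehyphen}{}
% Drop the hyphen that would otherwise be kept between hyphenated initials
\renewcommand*{\DTLinitialhyphen}{}

\begin{document}

\DTLinitials{This-Test. for sample.}

\end{document}

With these settings neither the periods nor the hyphen between the two initials of This-Test should appear in the output.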

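Edit 2: note that \DTLinitials is mainly intended for names (its original purpose was for use with the abbreviated bibliography styles provided by databib), so it assumes that its argument is a series of letters separated by spaces or hyphens. Also, from the manual:

Be careful if the initial letter has an accent. If you want the initial to keep its accent, you need to place the accented letter in a group, otherwise the accent command will be ignored.

So, following your comment below: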

\DTLinitials{{\"{O}}zg\"{u}r}

Alternatively, use XeLaTeX or LuaLaTeX with UTF-8 characters. This is similar to the limitation of \makefirstuc (from mfirstuc).
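
As a minimal sketch of the second option, assuming the file is saved as UTF-8 and compiled with XeLaTeX or LuaLaTeX (loading fontspec here is my assumption, made only so that a Unicode font is in use; it is not part of the advice above):

% Compile with xelatex or lualatex; the file must be UTF-8 encoded
\documentclass{article}

\usepackage{fontspec}      % assumption: Unicode font setup
\usepackage{datatool-base}

\begin{document}

% The accented letter is a single character here, so no extra group is used
\DTLinitials{Özgür}

\end{document}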

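See also the datatool manual:

In fact, any command that appears at the start of the name and is not enclosed in a group will be ignored.

This means that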

\DTLinitials{\MakeUppercase{m}ary ann}

produces ma rather than Ma.
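
By analogy with the accent advice quoted above, grouping the command together with its argument should let it survive. The following is only a sketch under that assumption; I have not checked it against every datatool version.

\documentclass{article}

\usepackage{datatool-base}

\begin{document}

% Ungrouped: \MakeUppercase is ignored when the initial is picked up
\DTLinitials{\MakeUppercase{m}ary ann}

% Grouped (assumption): the whole group is taken as the first "letter"
\DTLinitials{{\MakeUppercase{m}}ary ann}

\end{document}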

Answer 2

Here is a solution that uses only classic (plain) TeX:

\def\firstletters{\bgroup \catcode`-=10 \catcode`(=10 \filA}
\def\filA#1{\filB#1 {\end} }
\def\filB#1#2 {\ifx\end#1\egroup \else#1\expandafter\filB\fi} 

\firstletters{This is a test of the Emergency Broadcast System. 
   This-Test. for sample (per se). This T.}

\bye
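
For readers who want to adapt this, here is a commented sketch of the same technique. Giving a character category code 10 makes TeX tokenize it as a space, so \filB, whose second argument is delimited by a space, keeps the first token of each word and discards the rest. Treating / as an extra separator is my own addition for illustration and is not part of the answer above.

\def\firstletters{%
  \bgroup                 % keep the catcode changes local
  \catcode`\-=10          % a hyphen is now read as a space
  \catcode`\(=10          % so is an opening parenthesis
  \catcode`\/=10          % assumed extra separator, for illustration only
  \filA}
\def\filA#1{\filB#1 {\end} } % append the sentinel "word" \end
\def\filB#1#2 {%
  % #1 = first token of the word, #2 = the rest up to the next space
  \ifx\end#1\egroup \else #1\expandafter\filB\fi}

\firstletters{read/write access (per se)}

\bye

This should typeset rwaps. The catcode changes only take effect because the argument is read after \firstletters has run, so the macro will not split on these characters if its argument was already tokenized, for example inside the argument of another macro.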

Answer 3

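With a regular expression we can keep the leading letter of each word and delete everything after it, up to and including the next space or hyphen.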

\documentclass{article}
\usepackage{xparse,l3regex}

\ExplSyntaxOn
\NewDocumentCommand{\firstletters}{m}
 {
  \kumaresh_firstletters:n { #1 }
 }

\tl_new:N \l_kumaresh_fl_input_tl

\cs_new_protected:Nn \kumaresh_firstletters:n
 {
  \tl_set:Nn \l_kumaresh_fl_input_tl { #1 ~ }
  \regex_replace_all:nnN { ([A-Za-z]).*?[-\s]} { \1 } \l_kumaresh_fl_input_tl
  \tl_use:N \l_kumaresh_fl_input_tl
 }
\ExplSyntaxOff

\begin{document}
\firstletters{This is a test of the Emergency Broadcast System. This-Test. for sample. This T.}
\end{document}
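
If the hyphens should be kept, one possible variation is to do the replacement in two passes. This is a sketch, not the answer author's code: the command name \firstletterskeephyphen and the variable \l_flsketch_input_tl are hypothetical, and it assumes a LaTeX release from 2020-10-01 or later, where \NewDocumentCommand and the regex functions are available without loading xparse or l3regex.

\documentclass{article}

\ExplSyntaxOn
\tl_new:N \l_flsketch_input_tl % hypothetical scratch variable

\NewDocumentCommand{\firstletterskeephyphen}{m} % hypothetical name
 {
  \tl_set:Nn \l_flsketch_input_tl { #1 }
  % Pass 1: keep a leading letter, drop the rest of the run
  % up to (but not including) the next hyphen or space
  \regex_replace_all:nnN { ([A-Za-z]) [^\-\s]* } { \1 } \l_flsketch_input_tl
  % Pass 2: remove the spaces, leaving the hyphens in place
  \regex_replace_all:nnN { \s } { } \l_flsketch_input_tl
  \tl_use:N \l_flsketch_input_tl
 }
\ExplSyntaxOff

\begin{document}
\firstletterskeephyphen{This is a test of the Emergency Broadcast System. This-Test. for sample. This T.}
\end{document}

For this input the result should be TiatotEBST-TfsTT. As with the original pattern, only ASCII letters are recognized, and a non-letter at the start of a word is left in the output.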


Answer 4

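Here is another solution, based on LuaLaTeX. It tests whether the string contains any alphabetic characters and does nothing if none is found. It does not assume that the first character of the string is a letter. The proposed solution also handles letters outside the ASCII range, such as ä, Ä and Å.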


\documentclass{article}
\usepackage{fontspec}
\usepackage{luacode} % for 'luacode' env. and '\luaexec' macro
\begin{luacode}
local i, w , wstring
function fl ( s )
   i = unicode.utf8.find ( s , "%w")
   -- Do nothing if i=="nil", i.e., if 's' doesn't 
   -- contain at least one alphabetical character:
   if i ~= nil then
      -- Pick up the first letter of first word:
      wstring = unicode.utf8.sub ( s , i , i ) 
      s = unicode.utf8.sub ( s , i+1 )
      -- Pick up the first letters of all remaining words:
      for w in unicode.utf8.gmatch ( s , "%W%w" ) do
         wstring = wstring .. unicode.utf8.sub ( w , 2 )
      end
      tex.sprint ( wstring )
   end
end
\end{luacode}
\newcommand{\firstletter}[1]{\luaexec{fl(\luastring{#1})}}

\begin{document}
\firstletter{This is a test of the Emergency Broadcast System. This-Test. for sample. This T. per se}

% Same string, but with additional non-letter characters
\firstletter{@--?#&$() []<>^_ This is a test of the 
   Emergency    Broadcast System. This--Test. 
   for sample. This T. 
   (per se)}

% Words that start with non-ASCII-encoded characters
\firstletter{$$$ähnlich "öffentlich *übrigens !?<>Äpfel 
   Özgür  ((((^Übung    .ßcheusslich+++ ,===Ångstrom}

\firstletter{!@#$^&*()!@#$^&*()_+-={}[]|\\;<>?Ö} 

% Two strings without any "words"
a\firstletter{"("§$&/)@@=}b\firstletter{}c 

\end{document}
