如果能够根据任意指标来设置单词、句子或小节的样式,那就太好了。例如,根据单词的长度突出显示单词可能会很有用 - 较长的单词会以更粗的突出显示,这样您就可以缩小文档,并快速了解其中变得复杂的地方。另一个类似的例子可能是根据段落的可读性分数。
对于最后一个例子,这可能涉及将段落转换为纯文本、计算分数(外部?)、将分数映射到颜色,然后将颜色应用于段落。
是否可以在整个文档范围内执行类似的操作?即不必在每个段落周围添加命令?
是否可以在 tex 内轻松计算单词/句子/段落的长度,还是应该在外部完成?
答案1
我对 egreg 发布的的代码做了一些小的修改这里并得出了以下结论:
看起来还行,但它不知道如何连字。我不知道如何让它连字,甚至不知道这是否可行。因此,有两种选择:(1) 在任何地方拆分单词以使其整洁,或 (2) 不破坏任何东西,让内容悬在页边空白处。这两种选择都包含在下面,并在评论中指出。
可以选择将每个字母打印在其阴影框中。但是,这似乎会使所有内容变暗,并使突出显示变得不那么明显。
您可以通过输入不同的(整数)值来将突出显示更改为更深或更亮
\int_set:Nn \g_shade_weight_int {6}
您不必将每个段落放入其自己的块中,但是您必须将文本块放入宏中。
当遇到数学时它就会爆炸。
如果出现任何问题,我很可能无法修复它。如前所述,这主要是复制/粘贴解决方案。
方法是:将文本块分成段落,将得到的段落分成单词,测量每个单词的长度并根据长度设置阴影,使用 egregs 解决方案将单词的每个字母放在相应的阴影中。
代码:
\documentclass{article}
\usepackage{xparse}
\usepackage{xcolor}
\ExplSyntaxOn
\makeatletter
\int_new:N \g_shade_weight_int
% sets the shade weight, bigger number for darker boxes, with this set to 6, a word
% with 16-17 letters will be black.
\int_set:Nn \g_shade_weight_int {6}
\int_new:N \l_length_int
\int_new:N \l_shade_int
\tl_new:N \l_current_color_tl
% code stolen from egreg with some small additions https://tex.stackexchange.com/a/57860/14100
\cs_new:Npn \show_boxes:n #1
{
\int_set:Nn \l_length_int {\tl_count_tokens:n {#1}}
\int_set:Nn \l_shade_int {\int_min:nn {\g_shade_weight_int*\l_length_int}{100}}
\tl_set:No \l_current_color_tl {black!\int_use:N \l_shade_int}
\begingroup\fboxrule=0pt\fboxsep=0pt
\show_boxes_aux:nn#1\show_boxes_aux:nn\@empty
\endgroup
}
\cs_new:Npn \show_boxes_aux:nn #1#2
{
% The first part of the conditional deals with the last letter of each word,
% and the \else deals with the remaining letters. Switching to the commented
% fbox commands, will print each letter in the corresponding box.
% Currently, this breaks words at any point, deleting "\hskip 0pt" will leave
% all words unbroken and produce words extending into the margins.
\ifx#2\show_boxes_aux:nn
%\fbox{\colorbox{\tl_use:N \l_current_color_tl}{#1}}\expandafter\@gobble
\fbox{\colorbox{\tl_use:N \l_current_color_tl}{\textcolor{\tl_use:N \l_current_color_tl}{#1}}}\expandafter\@gobble
\else
\setbox0=\hbox{\vbox{{#1}{#2}}}\setbox2=\hbox{\vbox{#1#2}}%
\dimen0=\wd0 \advance\dimen0 -\wd2
%\fbox{\colorbox{\tl_use:N \l_current_color_tl}{#1}}\hskip 0pt\kern-\dimen0
\fbox{\colorbox{\tl_use:N \l_current_color_tl}{\textcolor{\tl_use:N \l_current_color_tl}{#1}}}\hskip 0pt\kern-\dimen0
\expandafter\show_boxes_aux:nn
\fi#2
}
\NewDocumentCommand {\colorwordsaux}
{ > { \SplitList { ~ } } m }
{ \tl_map_inline:nn {#1} { \show_boxes:n {##1}\ } }
\NewDocumentCommand {\colorwords}
{>{ \SplitList {\par}} +m}
{\tl_map_inline:nn {#1} {\colorwordsaux {##1}\par}}
\makeatother
\ExplSyntaxOff
\begin{document}
\colorwords{
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas a nulla augue. Nam a nisi eu elit consectetur cursus. Mauris mi ante, vulputate vel porttitor vitae, ultricies ac purus. Aliquam lacinia lacus a nisl elementum sit amet vehicula augue condimentum. Proin at turpis turpis, at facilisis leo. Integer convallis ligula eleifend quam sollicitudin consequat. Cras eget aliquet libero. Nam nulla lorem, tincidunt sed consequat eu, convallis sed quam. Aenean tincidunt lorem a dui vestibulum nec pharetra odio faucibus. Integer nec odio a sem faucibus rutrum. Maecenas vel lorem nisi, ut viverra est.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas a nulla augue. Nam a nisi eu elit consectetur cursus. Mauris mi ante, vulputate vel porttitor vitae, ultricies ac purus. Aliquam lacinia lacus a nisl elementum sit amet vehicula augue condimentum. Proin at turpis turpis, at facilisis leo. Integer convallis ligula eleifend quam sollicitudin consequat. Cras eget aliquet libero. Nam nulla lorem, tincidunt sed consequat eu, convallis sed quam. Aenean tincidunt lorem a dui vestibulum nec pharetra odio faucibus. Integer nec odio a sem faucibus rutrum. Maecenas vel lorem nisi, ut viverra est.
}
\end{document}
下面的输出与上面的lipsum不对应,下面的第一块是苏斯博士的一首诗(帽子里的猫)的一部分,第二块是ArXiv上一篇难以理解的理论物理论文的摘要。
编辑:这是一种根据以下内容对段落进行着色的方法甘宁雾指数。由于计算音节似乎不太容易,我做了一些手脚。通过句号计算句子也很幼稚,所以这远非完美,然而,结果与给出的结果相当吻合一些 在线的 计算器。
\documentclass{article}
\usepackage{xparse}
\usepackage{mdframed}
\usepackage{xcolor}
\ExplSyntaxOn
\int_new:N \l_letter_count_int
\int_new:N \l_word_count_int
\int_new:N \l_sentence_count_int
\int_new:N \l_complex_count_eight_int
\int_new:N \l_complex_count_nine_int
\seq_new:N \l_sentences_seq
\seq_new:N \l_words_seq
\fp_new:N \l_avg_sent_length_fp
\fp_new:N \l_complex_count_fp
\fp_new:N \l_percent_comp_fp
\fp_new:N \l_block_score_fp
\fp_new:N \l_shade_level_fp
\NewDocumentCommand { \colorparagraph } { m }
{
\int_zero:N \l_letter_count_int
\int_zero:N \l_word_count_int
\int_zero:N \l_sentence_count_int
\int_zero:N \l_complex_count_nine_int
\int_zero:N \l_complex_count_eight_int
%split paragraph at periods and count sentences, this doesn't account for
%in sentence periods, e.g., e.g.
%Changing the first arg from {.} to {.~} requires a space after the period but
%then misses things like (...blah.) Then...
\seq_set_split:Nnn \l_sentences_seq {.} {#1}
\seq_remove_all:Nn \l_sentences_seq {}
\int_set:Nn \l_sentence_count_int {\seq_count:N \l_sentences_seq}
%split paragraph at spaces and count words.
\seq_set_split:Nnn \l_words_seq {~} {#1}
\seq_remove_all:Nn \l_words_seq {}
\int_set:Nn \l_word_count_int {\seq_count:N \l_words_seq}
%calculate average sentence length
\fp_set:Nn \l_avg_sent_length_fp {\l_word_count_int/\l_sentence_count_int}
%complex words = 3+ syllables...I fudged this.
\seq_map_inline:Nn \l_words_seq
{
\int_compare:nT {\tl_count:n {##1} >= 9}
{\int_incr:N \l_complex_count_nine_int}
\int_compare:nT {\tl_count:n {##1} >= 8}
{\int_incr:N \l_complex_count_eight_int}
}
% this is a fudge to estimate the number of 3+ syllable words
\fp_set:Nn \l_complex_count_fp {(\l_complex_count_nine_int+\l_complex_count_eight_int)/2}
% find the % of words that are "complex"
\fp_set:Nn \l_percent_comp_fp {\l_complex_count_fp/\l_word_count_int}
% apply fog index formula
\fp_set:Nn \l_block_score_fp {.4*(\l_avg_sent_length_fp+100*\l_percent_comp_fp)}
% assume that 35 is the maximum likely fog index, and calculate what percent of this
% number the current paragraph is.
\fp_set:Nn \l_shade_level_fp {\l_block_score_fp/35*100}
\begin{mdframed}
[hidealllines=true,
% the percentage of black = percent of "max" fog index.
backgroundcolor=black!\fp_to_int:N \l_shade_level_fp,
innerleftmargin=3pt,
innerrightmargin=3pt,
leftmargin=-3pt,
rightmargin=-3pt]
#1
\end{mdframed}
% uncomment below to display fog index below each paragraph
%\fp_use:N \l_block_score_fp
}
\NewDocumentCommand {\colorparagraphs}
{>{ \SplitList {\par}} +m}
{\tl_map_inline:nn {#1} {\colorparagraph {##1}\par}}
\ExplSyntaxOff
\begin{document}
%text from http://www.impact-information.com/impactinfo/newsletter/plwork08.htm
\colorparagraphs{
Cardiac resynchronization therapy signals a new era in device-based solutions for this condition and 750,000 of the estimated five million Americans with heart failure could potentially benefit from it. Typically a late manifestation of other cardiovascular diseases, including coronary artery disease, hypertension and valvular disease, heart failure is responsible for more hospitalizations than all forms of cancer combined. As the only major cardiac disorder increasing in prevalence, it is estimated that 550,000 cases of heart failure are diagnosed each year. Approximately \$40 billion is spent to manage the condition in the United States each year.
Here is the same information written at the 7th-grade level:
New pacemakers like this regulate the heart beat. They can help 750,000 of the five million Americans who suffer from heart failure. Heart failure is a growing problem. It results from high blood pressure and disease in the arteries and valves of the heart. This year, 550,000 people suffered heart failure. More people went to the hospital with heart failure than all forms of cancer combined. In the U.S., we spend about \$40 billion a year on it.
}
\end{document}
第一篇课文是针对研究生水平编写的,第二篇课文是针对七年级水平编写的。计算得出的迷雾指数分别为 ~18.5 和 ~7。下图来自我测试过的其他课文。这些课文的迷雾指数分别为 18、7、23 和 11。