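Is there a command line program that takes a file containing English text, analyzes the text, and outputs its readability scores?
For example, if one feeds the program a text, the program should output the Flesch-Kincaid grade level, McLaughlin's SMOG grading, etc.
I believe such a program exists in the official repositories, but I can't remember its name. It is also possible that I am misremembering.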
Answer 1
The diction package contains a tool named style:
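Style analyses the surface characteristics of the writing style of a document. It prints various readability grades, length of words, sentences and paragraphs. It can further locate sentences with certain characteristics.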
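For example, if I evaluate the body of your question (saved in the file flux_question) and print the sentences whose readability index (ARI) is greater than 10: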
$ style -r 10 flux_question
flux_question:1: Is there a command line program that takes a file containing English text, analyzes the text, and outputs its readability scores?
flux_question:2: For example, if one feeds the program a text, the program should output the Flesch-Kincaid grade level, McLaughlin's SMOG grading, etc.
readability grades:
Kincaid: 10.2
ARI: 10.8
Coleman-Liau: 12.5
Flesch Index: 51.1/100
Fog Index: 12.0
Lix: 48.6 = school year 9
SMOG-Grading: 11.2
sentence info:
333 characters
65 words, average length 5.12 characters = 1.65 syllables
4 sentences, average length 16.2 words
25% (1) short sentences (at most 11 words)
0% (0) long sentences (at least 26 words)
1 paragraphs, average length 4.0 sentences
25% (1) questions
25% (1) passive sentences
longest sent 21 wds at sent 2; shortest sent 8 wds at sent 4
word usage:
verb types:
to be (1) auxiliary (2)
types as % of total:
conjunctions 5% (3) pronouns 9% (6) prepositions 2% (1)
nominalizations 0% (0)
sentence beginnings:
pronoun (1) interrogative pronoun (0) article (0)
subordinating conjunction (0) conjunction (0) preposition (0)
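To filter the output you can use, for example, head -n8 to get only the grades (they come first when style is run without -r), or grep 'Flesch\|SMOG' to print only the Flesch index and the SMOG grade: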
$ style flux_question | grep 'Flesch\|SMOG'
Flesch Index: 51.1/100
SMOG-Grading: 11.2
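If you need a single bare number rather than the whole line (for example, to compare a grade against a threshold in a script), a small awk filter can do it. This is only a sketch: grade is a hypothetical helper name, not part of the diction package, and it assumes the "label: value" layout shown in the output above. The demo line feeds it three lines copied from that output, so it runs even without style installed.

```shell
#!/bin/sh
# grade NAME: read style output on stdin and print the number reported for NAME.
# "grade" is a hypothetical helper, not something shipped with diction.
grade() {
    awk -F': *' -v name="$1" '
        # Select the line whose label ends with NAME, e.g. "Flesch Index".
        $1 ~ (name "$") {
            # Keep only the leading number: drops "/100" or " = school year 9".
            split($2, parts, "[ /]")
            print parts[1]
        }'
}

# Demo on lines copied from the output above.
printf '%s\n' 'Kincaid: 10.2' 'Flesch Index: 51.1/100' 'Lix: 48.6 = school year 9' | grade Lix
```

With the diction package installed, the same helper would be used as style flux_question | grade Kincaid.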