How to find the number of words in a text file, excluding one user-given word

Answer 1

With GNU grep:

grep -Eo '\S+' < file | grep -vcxF stopword

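This counts (-c) the words (a word being, as for wc -w, a sequence of non-space characters (\S+), at least on valid text) that are not (-v) exactly (-x) the fixed string (-F) stopword.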
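Since the word to exclude is supplied by the user, the pipeline can be wrapped in a small function. This is only a sketch: the name count_words_excluding and its argument order are made up for illustration, and it still assumes GNU grep for -o and \S.

```shell
# Hypothetical helper: count_words_excluding WORD FILE
count_words_excluding() {
    # First grep: print every run of non-space characters on its own line.
    # Second grep: count (-c) the lines that are not (-v) exactly (-x)
    # the fixed string (-F) WORD; -e keeps a WORD starting with "-" from
    # being parsed as an option.
    grep -Eo '\S+' < "$2" | grep -vcxF -e "$1"
}
```

It would be called as count_words_excluding stopword file. Note that grep -c exits with status 1 when the count is 0, which matters if the script runs under set -e.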

Answer 2

The number of words in input minus the number of stopwords (using GNU grep's -o, since you tagged Linux):

echo $(( $(wc -w < input) - $( grep -o stopword input | wc -l ) ))

Sample input:

I have the large set of the text file. In that, each article is separated by 15 stopwords. I want to find out the total number of words count in that file excluding the stopword.
stopword stopword stopword stopword stopword stopword stopword stopword stopword stopword stopword stopword stopword stopword stopword
I have the large set of the text file. In that, each article is separated by 15 stopwords. I want to find out the total number of words count in that file excluding the stopword.

Output:

$ echo $(( $(wc -w < input) - $( grep -o stopword input | wc -l ) ))
66
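One thing to keep in mind: grep -o stopword matches the string anywhere, including inside longer tokens. The sample input has 85 whitespace-separated words and 19 matches (the 15 separators plus "stopwords." and "stopword." in each of the two prose lines), hence 66; excluding only tokens that are exactly stopword would give 70. A quick way to see which tokens are affected, as a sketch (the helper name near_matches is hypothetical):

```shell
# Hypothetical helper: near_matches WORD FILE
# Lists tokens that contain WORD without being exactly WORD.
near_matches() {
    # Put each whitespace-separated token on its own line, keep the ones
    # containing WORD as a fixed string, then drop the exact matches.
    tr -s '[:space:]' '\n' < "$2" | grep -F -e "$1" | grep -vxF -e "$1"
}
```

Running near_matches stopword input on the sample shows the tokens that the subtraction removes in addition to the separators.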

Answer 3

awk '{ gsub("stopword",""); words+=NF }; END { print words; }' /text/file

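This counts everything awk treats as a field, even things that are not semantically words, such as:

a standalone hyphen
a period preceded by a space (a sentence ended wrongly . Next sentence)
numbers in headings (1. Introduction)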
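The command above hard-codes the word and passes it to gsub, which treats it as a regular expression. A minimal sketch of a variant, assuming the word should instead be passed as a parameter and compared literally against each whole field; the function name count_fields_excluding is hypothetical, and a "word" is still whatever awk splits off as a field:

```shell
# Hypothetical helper: count_fields_excluding WORD FILE
count_fields_excluding() {
    # Caveat: awk interprets backslash escapes in -v values, so a WORD
    # containing backslashes needs extra care.
    awk -v sw="$1" '
        {
            for (i = 1; i <= NF; i++)
                # Concatenating "" forces a string comparison, so
                # numeric-looking fields are not compared as numbers.
                if (($i "") != (sw ""))
                    n++
        }
        END { print n + 0 }   # n + 0 prints 0 instead of an empty line
    ' "$2"
}
```

The same caveats about hyphens, stray periods, and heading numbers apply, because field splitting is unchanged.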
