Why doesn't this command sort by uniq count?

My log contains lines like the following:

2015/11/02-07:55:39.735 INFO failed with ERR_AUTHORIZATION_REQUIRED.  (10.10.10.11:61618) is not a trusted source.
2015/11/02-07:55:40.515 INFO failed with ERR_AUTHORIZATION_REQUIRED.  (10.10.10.11:51836) is not a trusted source.
2015/11/02-07:55:39.735 INFO failed with ERR_AUTHORIZATION_REQUIRED.  (10.10.10.10:61615) is not a trusted source.
2015/11/02-07:55:40.515 INFO failed with ERR_AUTHORIZATION_REQUIRED.  (10.10.10.10:51876) is not a trusted source.
2015/11/02-07:55:39.735 INFO failed with ERR_AUTHORIZATION_REQUIRED.  (10.10.10.10:61614) is not a trusted source.
2015/11/02-07:55:39.735 INFO failed with ERR_AUTHORIZATION_REQUIRED.  (10.10.10.15:61614) is not a trusted source.
2015/11/02-07:55:39.735 INFO failed with ERR_AUTHORIZATION_REQUIRED.  (10.10.10.15:61618) is not a trusted source.
2015/11/02-07:55:39.735 INFO failed with ERR_AUTHORIZATION_REQUIRED.  (10.10.10.15:61613) is not a trusted source.

So I tried the following command to get a count for each unique IP, sorted by that count:

grep ERR_AUTHORIZATION_REQUIRED file.log | awk '{print $6}' | cut -s -d ':' -f1 | tr -d '(' | sort | uniq -c

The output I get looks like this:

3 10.10.10.10
2 10.10.10.11
3 10.10.10.15

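So it looks as if the IPs are sorted before uniq -c is applied (which makes sense given the command), and the result is ordered by IP rather than by count. But if I swap the uniq and sort commands, each IP is printed with a count of 1.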

Answer 1

From the uniq man page:

DESCRIPTION
     Discard all but one of successive identical lines from INPUT (or standard input), writing to OUTPUT (or standard output).

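The key word here is "successive". uniq does not search the whole stream for duplicates; it only compares each line with the one immediately before it. Sorting first forces all duplicates to sit next to each other so that they can be collapsed (and counted). To order the result by count, the output of uniq -c has to be sorted a second time, numerically.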
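The behavior can be checked with a small sketch. The list of IPs below is hypothetical sample data: it uses the same addresses as the question, but interleaved so that no two identical lines are adjacent, which is presumably what the real log looks like. The variable names are illustrative only.

```shell
# Hypothetical sample: the IPs from the question, interleaved so that
# no two identical lines are adjacent.
ips='10.10.10.11
10.10.10.10
10.10.10.15
10.10.10.10
10.10.10.11
10.10.10.15
10.10.10.10
10.10.10.15'

# Without a prior sort, uniq never sees two identical lines in a row,
# so every line is reported with a count of 1.
unsorted_counts=$(printf '%s\n' "$ips" | uniq -c)

# sort groups the duplicates, uniq -c counts each group, and the
# second sort (-n numeric, -r reversed) orders the groups by count.
sorted_counts=$(printf '%s\n' "$ips" | sort | uniq -c | sort -rn)

printf 'uniq -c alone:\n%s\n\n' "$unsorted_counts"
printf 'sort | uniq -c | sort -rn:\n%s\n' "$sorted_counts"
```

Applied to the command in the question, this amounts to appending `| sort -rn` after `uniq -c`. IPs with equal counts may appear in either order relative to each other unless a secondary sort key is added.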
