我正在尝试从 Nginx 访问日志中打印用户代理最频繁访问的列表。
到目前为止我已经想出了这个:
cat /var/log/nginx/access.log | awk -F\" '{print $6}' | sort | uniq -c | sort -nr | head -20
这确实为我提供了按访问该网站的次数排序的前 20 个用户代理。
我想进一步修改它,只计算过去 30 天内的访问量,但我正在努力获得一个有效的命令。
cat from=$(date -d'now-30 days' +[%d/%b:%H:%M:%S) to==$(date +[%d/%b:%H:%M:%S) /var/log/nginx/access.log | awk -F\" '{print $6}' | sort | uniq -c | sort -nr | head -20
这样做的目的是识别经常访问该网站的搜索引擎爬虫/机器人,因此如果您可以为此推荐任何其他内容,请告诉我。
我从下面的日志文件中摘取了几行示例:-
157.55.39.46 - - [11/Jun/2018:13:23:20 +0100] "GET /accessories/antique-gold--pine--wall-mounted?width=1308%2C1334 HTTP/1.1 "200 27726 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"[RT:3.041] [C:995725]
157.55.39.46 - - [11/Jun/2018:13:23:24 +0100] "GET /accessories/antique-gold--chrome--pine--wall-mounted?limit=25&mode=list HTTP/1.1 "200 34206 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"[RT:3.412] [C:995725]
54.36.148.11 - - [11/Jun/2018:13:23:29 +0100] "GET /bathroom-mirrors/mirrors/700 HTTP/1.1 "404 10570 "-" "Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)" "-"[RT:0.393] [C:995737]
目前,该命令打印以下内容:-
8603 Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
4051 Mozilla/5.0 (compatible; SemrushBot/2~bl; +http://www.semrush.com/bot.html)
1707 Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36
1585 Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.79 Safari/537.36
1519 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36
1368 Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
1185 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36
993 Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)
903 Mozilla/5.0 (iPhone; CPU iPhone OS 11_4 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 Mobile/15E148 Safari/604.1
658 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36
648 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1 Safari/605.1.15
572 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0
435 Mozilla/5.0 (iPhone; CPU iPhone OS 11_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 Mobile/15E148 Safari/604.1
336 Mozilla/5.0+(compatible; UptimeRobot/2.0; http://www.uptimerobot.com/)
324 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0
324 Mozilla/5.0 (iPad; CPU OS 10_3_3 like Mac OS X) AppleWebKit/603.1.30 (KHTML, like Gecko) CriOS/67.0.3396.69 Mobile/14G60 Safari/602.1
319 Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0
309 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36
295 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.79 Safari/537.36
280 Mozilla/5.0 (iPad; CPU OS 11_4 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 Mobile/15E148 Safari/604.1