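I have Lighttpd installed on an Ubuntu server. I just checked the lighttpd access log for one particular domain. The domain serves only a very simple index.html file that basically says "coming soon". Below are the 10 most recent entries. I don't quite understand them: why would search engine bots request these strange subdomains and URLs? The bots I have seen doing this are mail.ru, bing, and baidu. Google and Yahoo do not appear in the log. I have, of course, changed the domain to example.com to protect it.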
217.69.133.239 power-steering-pump-ford.example.com - [31/Dec/2014:05:17:37 -0500] "GET /robots.txt HTTP/1.1" 404 345 "-" "Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots)"
217.69.133.240 power-steering-pump-ford.example.com - [31/Dec/2014:05:17:39 -0500] "GET /bedroom-boy-furniture-quality.html/ HTTP/1.1" 404 345 "-" "Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots)"
217.69.133.238 power-steering-pump-ford.example.com - [31/Dec/2014:05:17:44 -0500] "GET /10-car-hottest-top.html/ HTTP/1.1" 404 345 "-" "Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots)"
157.55.39.173 best-mixed-drink-recipes.example.com - [31/Dec/2014:05:26:43 -0500] "GET / HTTP/1.1" 200 187 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
217.69.133.234 cannon-printer-model-mp450.example.com - [31/Dec/2014:05:31:49 -0500] "GET /robots.txt HTTP/1.1" 404 345 "-" "Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots)"
217.69.133.240 cannon-printer-model-mp450.example.com - [31/Dec/2014:05:31:50 -0500] "GET / HTTP/1.1" 200 187 "-" "Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots)"
217.69.133.240 smart-car-bike-rack.example.com - [31/Dec/2014:05:31:52 -0500] "GET /robots.txt HTTP/1.1" 404 345 "-" "Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots)"
217.69.133.238 smart-car-bike-rack.example.com - [31/Dec/2014:05:31:54 -0500] "GET / HTTP/1.1" 200 187 "-" "Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots)"
202.46.53.179 winter-clothing-for-kids.example.com - [31/Dec/2014:05:52:05 -0500] "GET / HTTP/1.1" 200 230 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
180.76.4.195 winter-clothing-for-kids.example.com - [31/Dec/2014:05:52:47 -0500] "GET / HTTP/1.1" 200 230 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
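Answer 1
This appears to be caused simply by backlinks left over from when the domain belonged to a previous owner. Because my server is configured to return 200 (no error) for any subdomain, the problem is compounded: the bots see every stale subdomain as a live site.
To fix this, I will change the configuration so that the bad subdomains return 404, and possibly report the bad links to the search engines that indexed them.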
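Before changing the configuration, it may help to see which hostnames are actually being requested. Here is a minimal Python sketch, assuming the access log uses the layout shown in the question (client IP, requested host, a dash, timestamp, request, status, size, referer, user agent). `LOG_PATTERN`, `VALID_HOSTS`, and `unexpected_hosts` are names made up for illustration, and the allowlist contents are an assumption about which hostnames should really serve content.

```python
import re
from collections import Counter

# Matches the log layout shown in the question:
# client-ip requested-host - [timestamp] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) (?P<host>\S+) \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical allowlist: the hostnames that are meant to serve content.
VALID_HOSTS = {"example.com", "www.example.com"}


def unexpected_hosts(lines, valid_hosts=VALID_HOSTS):
    """Count requests per requested hostname that is not in the allowlist."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        host = match.group("host").lower()
        if host not in valid_hosts:
            counts[host] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `unexpected_hosts()` and iterating over `.most_common()` on the result lists the busiest bogus subdomains first, which are probably the ones worth reporting.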
Sorry for using ServerFault as a rubber duck, and thanks for the downvotes.