Referrer in access.log is a directory


In the following log entries, the referrer appears to be a folder.

112.200.208.5 - - [29/Jul/2013:20:43:14 +0800] "GET /sites/default/files/download/argie/pos-code.zip HTTP/1.1" 206 294677 "http://www.mysite.com/sites/default/files/download/argie/" "Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20100101 Firefox/22.0"
61.3.158.113 - - [29/Jul/2013:20:43:14 +0800] "GET /sites/default/files/download/lnosKHEN/payroll_system_-_lnoskhen_0.zip HTTP/1.1" 206 10806 "http://www.mysite.com/download-code" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.25 Safari/534.3"
112.200.208.5 - - [29/Jul/2013:20:43:15 +0800] "GET /sites/default/files/download/argie/pos-code.zip HTTP/1.1" 206 21465 "http://www.mysite.com/sites/default/files/download/argie/" "Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20100101 Firefox/22.0"
112.200.208.5 - - [29/Jul/2013:20:43:16 +0800] "GET /sites/default/files/download/argie/pos-code.zip HTTP/1.1" 206 469304 "http://www.mysite.com/sites/default/files/download/argie/" "Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20100101 Firefox/22.0"
112.200.208.5 - - [29/Jul/2013:20:43:17 +0800] "GET /sites/default/files/download/argie/pos-code.zip HTTP/1.1" 206 238639 "http://www.mysite.com/sites/default/files/download/argie/" "Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20100101 Firefox/22.0"
112.200.208.5 - - [29/Jul/2013:20:43:18 +0800] "GET /sites/default/files/download/argie/pos-code.zip HTTP/1.1" 206 267724 "http://www.mysite.com/sites/default/files/download/argie/" "Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20100101 Firefox/22.0"
39.41.211.234 - - [29/Jul/2013:20:43:22 +0800] "GET /sites/default/files/download/john.lemar/zest-project.zip HTTP/1.1" 206 23361 "http://www.mysite.com/sites/default/files/download/john.lemar/" "Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1"
39.41.211.234 - - [29/Jul/2013:20:43:23 +0800] "GET /sites/default/files/download/john.lemar/zest-project.zip HTTP/1.1" 200 632601 "http://www.mysite.com/sites/default/files/download/john.lemar/" "Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1"
39.41.211.234 - - [29/Jul/2013:20:43:24 +0800] "GET /sites/default/files/download/john.lemar/zest-project.zip HTTP/1.1" 206 285171 "http://www.mysite.com/sites/default/files/download/john.lemar/" "Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1"
39.41.211.234 - - [29/Jul/2013:20:43:24 +0800] "GET /sites/default/files/download/john.lemar/zest-project.zip HTTP/1.1" 206 138366 "http://www.mysite.com/sites/default/files/download/john.lemar/" "Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1"
39.41.211.234 - - [29/Jul/2013:20:43:25 +0800] "GET /sites/default/files/download/john.lemar/zest-project.zip HTTP/1.1" 206 104108 "http://www.mysite.com/sites/default/files/download/john.lemar/" "Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1"
39.41.211.234 - - [29/Jul/2013:20:43:25 +0800] "GET /sites/default/files/download/john.lemar/zest-project.zip HTTP/1.1" 206 52055 "http://www.mysite.com/sites/default/files/download/john.lemar/" "Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1"
39.41.211.234 - - [29/Jul/2013:20:43:25 +0800] "GET /sites/default/files/download/john.lemar/zest-project.zip HTTP/1.1" 206 63038 "http://www.mysite.com/sites/default/files/download/john.lemar/" "Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1"
39.41.211.234 - - [29/Jul/2013:20:43:27 +0800] "GET /sites/default/files/download/john.lemar/zest-project.zip HTTP/1.1" 206 32452 "http://www.mysite.com/sites/default/files/download/john.lemar/" "Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1"
112.200.208.5 - - [29/Jul/2013:20:43:33 +0800] "GET /sites/default/files/download/argie/pos-code.zip HTTP/1.1" 206 215059 "http://www.mysite.com/sites/default/files/download/argie/" "Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20100101 Firefox/22.0"

I believe the only valid download is this line:

61.3.158.113 - - [29/Jul/2013:20:43:14 +0800] "GET /sites/default/files/download/lnosKHEN/payroll_system_-_lnoskhen_0.zip HTTP/1.1" 206 10806 "http://www.mysite.com/download-code" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) 

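because I have set things up so that all downloads come from this URL:

http://www.mysite.com/download-code

So why does the referrer appear to be a folder?

Take this line, for example: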

112.200.208.5 - - [29/Jul/2013:20:43:33 +0800] "GET /sites/default/files/download/argie/pos-code.zip HTTP/1.1" 206 215059 "http://www.mysite.com/sites/default/files/download/argie/" "Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20100101 Firefox/22.0"

The referrer is:

http://www.mysite.com/sites/default/files/download/argie/

and this part:

/sites/default/files/download/argie/

is a folder.

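Even if this is a web crawler, is it possible for it to access folders on my site?

When I manually enter the following:

http://www.mysite.com/sites/default/files/download/argie/

it just returns "Page not found". That is why I am wondering how it became the referrer.

By the way, I am using nginx.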

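Answer 1

You should not pay too much attention to the referer. The client can set the referer to any value it likes. It is just a header in the request.

For example: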

GET /sites/default/files/download/argie/pos-code.zip HTTP/1.1
Host: www.mysite.com
Referer: http://example.org/JUST/SOME/REFERRER
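To illustrate how little effort this takes on the client side, here is a minimal Python sketch that builds (but does not send) the same request with a Referer of the caller's choosing. The helper name `build_request` is hypothetical; `urllib.request.Request` is from the standard library.

```python
from urllib.request import Request


def build_request(url: str, referer: str) -> Request:
    """Build a GET request carrying whatever Referer the caller chooses.

    Nothing is sent over the network here; the point is only that the
    header value is ordinary client-supplied data.
    """
    return Request(url, headers={"Referer": referer})


req = build_request(
    "http://www.mysite.com/sites/default/files/download/argie/pos-code.zip",
    "http://example.org/JUST/SOME/REFERRER",
)
print(req.get_method(), req.full_url)
print("Referer:", req.get_header("Referer"))
```

The server has no means of verifying this value, so it is logged exactly as the client sent it.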

So my guess is that the crawler simply cut off the end of the path and set the result as the referrer. I would not worry about it.
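If you want to see how many log entries fit that guess, a rough sketch like the following can flag lines whose referrer is just the requested path with the file name removed. It assumes the combined log format shown in the question; the regular expression and the function name `referer_is_parent_dir` are illustrative, not part of nginx.

```python
import re
from urllib.parse import urlsplit

# Matches the request, status, size and referrer fields of a
# combined-format access log line (an assumption based on the sample log).
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)"'
)


def referer_is_parent_dir(line: str) -> bool:
    """Return True if the referrer path equals the request path minus the file name."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return False
    # Cut off everything after the last slash, keeping the trailing slash.
    parent = match.group("path").rsplit("/", 1)[0] + "/"
    return urlsplit(match.group("referer")).path == parent


folder_line = (
    '112.200.208.5 - - [29/Jul/2013:20:43:33 +0800] '
    '"GET /sites/default/files/download/argie/pos-code.zip HTTP/1.1" 206 215059 '
    '"http://www.mysite.com/sites/default/files/download/argie/" '
    '"Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20100101 Firefox/22.0"'
)
page_line = (
    '61.3.158.113 - - [29/Jul/2013:20:43:14 +0800] '
    '"GET /sites/default/files/download/lnosKHEN/payroll_system_-_lnoskhen_0.zip HTTP/1.1" '
    '206 10806 "http://www.mysite.com/download-code" "Mozilla/5.0"'
)

print(referer_is_parent_dir(folder_line))
print(referer_is_parent_dir(page_line))
```

A match only shows that the referrer has this shape; it says nothing about whether the client ever requested the folder URL itself.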
