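The command I am running: wget www.fivestarmazda.com/index.htm
It works on an Ubuntu 14.10 machine hosted by Digital Ocean.
It works in the Chrome browser.
It does not work in an Ubuntu 13.10 environment hosted by Rackspace. There I keep getting a 403 Forbidden error. Does anyone know why? In all of these environments I can run wget http://www.google.com without problems.
Full debug output from wget: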
DEBUG output created by Wget 1.14 on linux-gnu.
URI encoding = ‘UTF-8’
--2015-03-11 10:14:36-- http://www.fivestarmazda.com/index.htm
Resolving www.fivestarmazda.com (www.fivestarmazda.com)... 23.64.122.224
Caching www.fivestarmazda.com => 23.64.122.224
Connecting to www.fivestarmazda.com (www.fivestarmazda.com)|23.64.122.224|:80... connected.
Created socket 3.
Releasing 0x0000000001eea330 (new refcount 1).
---request begin---
GET /index.htm HTTP/1.1
User-Agent: Wget/1.14 (linux-gnu)
Accept: */*
Host: www.fivestarmazda.com
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 403 Forbidden
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Content-Length: 964
X-DDC-Arch-Trace: ,HttpResponse
Date: Wed, 11 Mar 2015 14:14:46 GMT
Connection: keep-alive
---response end---
403 Forbidden
Registered socket 3 for persistent reuse.
URI content encoding = ‘utf-8’
Skipping 964 bytes of body: [<html><head><title>Apache Tomcat/6.0.20 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} Skipping 452 bytes of body: [P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 403 - </h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u></u></p><p><b>description</b> <u>Access to the specified resource () has been forbidden.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/6.0.20</h3></body></html>] done.
2015-03-11 10:14:36 ERROR 403: Forbidden.
Answer 1
Not sure if anyone cares, but from Canada:
$ wget www.fivestarmazda.com/index.htm
--2018-03-14 17:04:34-- http://www.fivestarmazda.com/index.htm
Resolving www.fivestarmazda.com (www.fivestarmazda.com)... 151.101.52.247
Connecting to www.fivestarmazda.com (www.fivestarmazda.com)|151.101.52.247|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.fivestarmazda.com/index.htm [following]
--2018-03-14 17:04:34-- https://www.fivestarmazda.com/index.htm
Connecting to www.fivestarmazda.com (www.fivestarmazda.com)|151.101.52.247|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: / [following]
--2018-03-14 17:04:35-- https://www.fivestarmazda.com/
Reusing existing connection to www.fivestarmazda.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 283589 (277K) [text/html]
Saving to: ‘index.htm’
index.htm 100%[==========================>] 276.94K 1.78MB/s in 0.2s
2018-03-14 17:04:35 (1.78 MB/s) - ‘index.htm’ saved [283589/283589]
Whoops, there goes another 280KB.
Answer 2
Not sure if this is still needed, but simply "spoofing" the user agent often helps. For example:
wget --user-agent="Mozilla/4.0 (Windows; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" /HERE GOES YOUR URL/
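To check whether the User-Agent is really what triggers the 403, one option is to request the same URL twice and compare the status codes. The following is only a rough sketch: the helper names last_status and probe are made up for illustration, and the second User-Agent string in the usage comment is just an example browser-like value, not one the site is known to accept. It relies on wget's --server-response option, which prints the response headers, and on -O /dev/null to discard the body.

```shell
#!/bin/sh
# Hypothetical helper: read wget's header output on stdin and print the
# status code of the last "HTTP/x.y NNN ..." line. When redirects are
# followed, the last status line belongs to the final response.
last_status() {
    awk '$1 ~ /^HTTP\// { code = $2 } END { print code }'
}

# Hypothetical helper: fetch URL ($1) with the User-Agent ($2), discard
# the body, and print only the final HTTP status code.
# wget writes its log and the headers to stderr, hence 2>&1.
probe() {
    wget -O /dev/null --server-response --user-agent="$2" "$1" 2>&1 | last_status
}

# Usage (needs network access; results depend on the server and on
# where the request comes from):
# probe http://www.fivestarmazda.com/index.htm "Wget/1.14 (linux-gnu)"
# probe http://www.fivestarmazda.com/index.htm "Mozilla/5.0 (X11; Linux x86_64)"
```

If both calls print 403 on the affected machine, the User-Agent is probably not the deciding factor there, and something else about that environment would be the next thing to look at.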