While answering a question on another site, I tried to fetch data from this URL and noticed a strange difference between curl and wget:
https://www.uniprot.org/uniprot/A2Z669.fasta
For some reason, curl silently fails to download anything, while wget correctly fetches the file A2Z669.fasta:
$ ls -la
total 300
drwxr-xr-x 2 terdon terdon 266240 Dec 11 12:22 .
drwxr-xr-x 202 terdon terdon 32768 Dec 10 17:31 ..
$ curl https://www.uniprot.org/uniprot/A2Z669.fasta
$ ls -la
total 300
drwxr-xr-x 2 terdon terdon 266240 Dec 11 12:22 .
drwxr-xr-x 202 terdon terdon 32768 Dec 10 17:31 ..
Explicitly setting an output file doesn't help; it just creates an empty file:
$ curl -o file "https://www.uniprot.org/uniprot/A2Z669.fasta"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
$ ls -la
total 300
drwxr-xr-x 2 terdon terdon 266240 Dec 11 12:25 .
drwxr-xr-x 202 terdon terdon 32768 Dec 10 17:31 ..
-rw-r--r-- 1 terdon terdon 0 Dec 11 12:25 file
$ cat file
$
wget, however, works fine:
$ wget https://www.uniprot.org/uniprot/A2Z669.fasta
--2023-12-11 12:24:42-- https://www.uniprot.org/uniprot/A2Z669.fasta
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving www.uniprot.org (www.uniprot.org)... 193.62.193.81
Connecting to www.uniprot.org (www.uniprot.org)|193.62.193.81|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://rest.uniprot.org/uniprot/A2Z669.fasta [following]
--2023-12-11 12:24:42-- https://rest.uniprot.org/uniprot/A2Z669.fasta
Resolving rest.uniprot.org (rest.uniprot.org)... 193.62.193.81
Connecting to rest.uniprot.org (rest.uniprot.org)|193.62.193.81|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://rest.uniprot.org/uniprotkb/A2Z669.fasta [following]
--2023-12-11 12:24:43-- https://rest.uniprot.org/uniprotkb/A2Z669.fasta
Reusing existing connection to rest.uniprot.org:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘A2Z669.fasta’
A2Z669.fasta [ <=> ] 314 --.-KB/s in 0s
2023-12-11 12:24:43 (6.65 MB/s) - ‘A2Z669.fasta’ saved [314]
$ ls -la
total 304
drwxr-xr-x 2 terdon terdon 266240 Dec 11 12:24 .
drwxr-xr-x 202 terdon terdon 32768 Dec 10 17:31 ..
-rw-r--r-- 1 terdon terdon 314 Dec 11 12:24 A2Z669.fasta
It doesn't seem to be specific to that particular file either. I tried another URL from the same REST API (https://www.uniprot.org/uniprot/P05067.fasta) and got the same behavior.
I am running this on an Arch system:
$ wget --version | head -n1
GNU Wget 1.21.4 built on linux-gnu.
$ curl --version | head -n1
curl 8.4.0 (x86_64-pc-linux-gnu) libcurl/8.4.0 OpenSSL/3.1.4 zlib/1.3 brotli/1.1.0 zstd/1.5.5 libidn2/2.3.4 libpsl/0.21.2 (+libidn2/2.3.4) libssh2/1.11.0 nghttp2/1.58.0
What is going on here? What makes wget work when curl fails?
Answer 1
wget follows redirects by default, but curl does not. If you add -L, curl works fine:
curl -OL https://www.uniprot.org/uniprot/A2Z669.fasta
(-O tells curl to write to a file named after the remote file instead of to standard output, matching wget's default behavior.)
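To see why the plain curl call printed nothing, it can help to look at the response status instead of the body. The following is only a minimal sketch, assuming a POSIX shell and a reasonably recent curl. The function names show_redirect and fetch_following are made up for illustration; -w with %{http_code} and %{redirect_url} are curl's write-out variables.

```shell
# Print the HTTP status code and the URL the server wants to redirect to,
# WITHOUT following the redirect. The body is discarded.
# (show_redirect is a hypothetical helper name.)
show_redirect() {
    curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$1"
}

# Download to a file named after the remote file, following redirects.
# -f: fail on HTTP errors (4xx/5xx), -sS: quiet but still show errors,
# -L: follow redirects, -O: use the remote file name.
# (fetch_following is a hypothetical helper name.)
fetch_following() {
    curl -fsSL -O "$1"
}
```

Going by the wget log in the question, show_redirect https://www.uniprot.org/uniprot/A2Z669.fasta would be expected to report a 301 pointing at rest.uniprot.org, though the server's behavior may have changed since. A 301 with an empty body is not an error from curl's point of view, which is why the original command exited quietly with nothing written.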