wget fetches a file as neol-converted text: how do I get the correct file?

Question

Answer 1

A web server describes the response body in the response headers.

To look at the headers only, without downloading the body, we can run:

$ wget --spider --server-response https://www.gutenberg.org/cache/epub/100/pg100.txt  
Spider mode enabled. Check if remote file exists.
--2019-10-14 09:13:55--  https://www.gutenberg.org/cache/epub/100/pg100.txt
Resolving www.gutenberg.org (www.gutenberg.org)... 152.19.134.47
Connecting to www.gutenberg.org (www.gutenberg.org)|152.19.134.47|:443... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Server: Apache
  Content-Location: pg100.txt.utf8.gzip
  Vary: negotiate
  TCN: choice
  Last-Modified: Sun, 01 Oct 2017 05:16:47 GMT
  X-Frame-Options: sameorigin
  X-Connection: Close
  Content-Type: text/plain; charset=utf-8
  Content-Encoding: gzip
  X-Powered-By: 1
  Content-Length: 2023394
  Date: Mon, 14 Oct 2019 13:13:55 GMT
  X-Varnish: 1859043781 1856607983
  Age: 104391
  Via: 1.1 varnish
Length: 2023394 (1.9M) [text/plain]
Remote file exists.
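
To pick the relevant header out of that output without reading it by eye, a small filter can help. This is only a sketch: `content_encoding` is a hypothetical helper name, not part of wget, and it assumes the headers arrive on standard input in the indented form shown above (wget writes them to stderr, hence the `2>&1` in the usage comment).

```shell
# Print the value of the Content-Encoding header found on stdin, if any.
# Hypothetical helper for illustration; prints nothing when the header is absent.
content_encoding() {
  # Header names are case-insensitive, so compare in lowercase.
  awk -F': *' 'tolower($1) ~ /^[ \t]*content-encoding$/ { print tolower($2); exit }' |
    tr -d '\r'   # drop a trailing carriage return, if present
}

# Usage (needs network access, so it is left as a comment):
#   wget --spider --server-response https://www.gutenberg.org/cache/epub/100/pg100.txt 2>&1 | content_encoding
```

For the response shown above, this should print the value of the `Content-Encoding: gzip` line.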

Once we see that the content is actually gzip-compressed (Content-Encoding: gzip), we can decompress it with gunzip:

$ wget -O - https://www.gutenberg.org/cache/epub/100/pg100.txt | gunzip -c > pg100.txt

A modern browser performs this decompression automatically, which is why the same page displays correctly there.
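
The gunzip pipeline fails if the server ever sends the file uncompressed, since gunzip rejects input that is not in gzip format. A more defensive variant, sketched below, decompresses only when the data starts with the gzip magic bytes (1f 8b). `maybe_gunzip` is a hypothetical helper name, and the sketch assumes `mktemp`, `head -c`, and `od` are available, as they are on typical Linux systems.

```shell
# Decompress stdin if it is gzip data; otherwise pass it through unchanged.
# Hypothetical helper for illustration, not part of wget or gzip.
maybe_gunzip() {
  tmp=$(mktemp) || return 1
  cat > "$tmp"                      # buffer stdin so it can be inspected first
  # Read the first two bytes as hex; gzip streams start with 1f 8b.
  magic=$(head -c 2 "$tmp" | od -An -tx1 | tr -d ' \n')
  if [ "$magic" = "1f8b" ]; then
    gunzip -c < "$tmp"
  else
    cat "$tmp"
  fi
  rm -f "$tmp"
}

# Usage (needs network access, so it is left as a comment):
#   wget -q -O - https://www.gutenberg.org/cache/epub/100/pg100.txt | maybe_gunzip > pg100.txt
```

Buffering to a temporary file keeps the sketch simple at the cost of extra disk I/O, which should be acceptable for a file of this size (about 2 MB compressed).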
