Using Wget to download all folders, subfolders, and files

Question 1

Answer

I'm assuming you haven't tried this yet:

wget -r --no-parent http://www.mysite.com/Pictures/

Or, to retrieve the content without keeping the "index.html" files:

wget -r --no-parent --reject "index.html*" http://www.mysite.com/Pictures/
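As a rough sketch of how the command above could be extended, the helper below only prints a wget command instead of running it, so it can be reviewed first. The function name `mirror_cmd`, the local directory `./Pictures`, and the choice of `-nH`, `--cut-dirs`, and `-P` are assumptions added for illustration; they are not part of the original answer, although all three are standard wget options.

```shell
# mirror_cmd is a hypothetical helper name, not part of wget.
# Arguments: $1 = directory URL, $2 = number of leading remote path
# components to drop, $3 = local directory to save into.
mirror_cmd() {
  # -r            recurse into subdirectories
  # --no-parent   never ascend above the starting directory
  # -nH           do not create a local directory named after the host
  # --cut-dirs=N  drop N leading path components from saved paths
  # --reject      do not keep the generated directory listing pages
  # -P            save everything under the given local directory
  printf 'wget -r --no-parent -nH --cut-dirs=%s --reject "index.html*" -P %s %s\n' \
    "$2" "$3" "$1"
}

# Print the command for the URL used in this answer.
mirror_cmd "http://www.mysite.com/Pictures/" 1 "./Pictures"
```

With these assumed values, a remote file such as /Pictures/a/b.jpg would be saved as ./Pictures/a/b.jpg rather than under www.mysite.com/Pictures/. The printed line can be copied into a terminal once it looks right.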

Question 2

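I use wget -rkpN -e robots=off http://www.example.com/

-r means recursive.

-k means convert links, so the links in the downloaded pages point to the local copies instead of example.com/bla.

-p means get all page requisites, so the images and JavaScript files the site needs to work properly are fetched too.

-N turns on timestamping, so a file is skipped when the local copy is at least as new as the one on the remote site.

-e executes the argument that follows it as if it were a .wgetrc command; it has to be there for robots=off to take effect.

robots=off means ignore the robots.txt file.

I also had -c in this command, so if the connection dropped, rerunning the command would continue from where it left off. I figured -N would go well with -c.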
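For readability, the same flags can be spelled with wget's long option names. The sketch below is only an illustration: `site_mirror_cmd` is a hypothetical helper name, and it prints the command rather than running it, so nothing is downloaded until the printed line is executed by hand.

```shell
# site_mirror_cmd is a hypothetical helper name, not part of wget.
# It prints the long-option equivalent of: wget -rkpN -c -e robots=off URL
#   --recursive        same as -r
#   --convert-links    same as -k
#   --page-requisites  same as -p
#   --timestamping     same as -N
#   --continue         same as -c
#   --execute          same as -e
site_mirror_cmd() {
  echo wget \
    --recursive \
    --convert-links \
    --page-requisites \
    --timestamping \
    --continue \
    --execute robots=off \
    "$1"
}

# Print the command for the example URL used above.
site_mirror_cmd "http://www.example.com/"
```

The long form is easier to review in a script, while the short form from the post is quicker to type interactively.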

Answer