I am using:
wget -r -A pdf https://labraj.uni-mb.si
But instead of the PDFs, I get this output in the console:
--2013-03-23 15:11:03-- https://labraj.uni-mb.si/
Resolving labraj.uni-mb.si (labraj.uni-mb.si)... 164.8.230.26
Connecting to labraj.uni-mb.si (labraj.uni-mb.si)|164.8.230.26|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://labraj.uni-mb.si/Splo%C5%A1ne_informacije [following]
--2013-03-23 15:11:09-- https://labraj.uni-mb.si/Splo%C5%A1ne_informacije
Reusing existing connection to labraj.uni-mb.si:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
labraj.uni-mb.si: Permission deniedlabraj.uni-mb.si/index.html: No such file or directory
Cannot write to `labraj.uni-mb.si/index.html' (Success).
How can I do this correctly?
Answer 1
You can use this example, but I don't think it will grab files from subfolders.
From http://www.thegeekstuff.com/2009/09/the-ultimate-wget-download-guide-with-15-awesome-examples/
14. Download Only Certain File Types Using wget -r -A
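You can use this in the following situations:
Download all images from a website
Download all videos from a website
Download all PDF files from a website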
$ wget -r -A.pdf http://url-to-webpage-with-pdfs/
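The `Permission denied` and `Cannot write to` lines in the question's output suggest that wget could not create the `labraj.uni-mb.si` directory in the current working directory. Here is a minimal sketch, assuming that is the cause: save into a directory you can write to with `-P` (`--directory-prefix`). The `pdf-mirror` directory name and the `fetch_pdfs` function name are made up for illustration.

```shell
#!/bin/sh
# Hypothetical target directory; any location you can write to works.
dest="${TMPDIR:-/tmp}/pdf-mirror"
mkdir -p "$dest"

# Confirm the directory is writable before starting a long crawl.
if [ -w "$dest" ]; then
    status="writable"
else
    status="not writable"
fi
echo "$dest is $status"

# Recursive download that keeps only PDFs, saved under $dest.
# Wrapped in a function so nothing is fetched until you call it.
fetch_pdfs() {
    wget -r -A pdf -P "$dest" https://labraj.uni-mb.si/
}
```

Call `fetch_pdfs` to start the download. With `-A`, wget still fetches the HTML pages to find links and deletes them afterwards, which is why it tries to write `index.html` first. By default `-r` follows links up to five levels deep, so PDFs linked from subpages should be reached as well.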
You may also need to change the user agent so that wget does not show up as a spider.
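As a rough sketch of the user-agent change, wget accepts `--user-agent` (short form `-U`). The agent string and the `fetch_pdfs_as_browser` function name below are assumptions for illustration; substitute the string your own browser sends.

```shell
#!/bin/sh
# Example user-agent string (an assumption, not taken from the original post).
ua="Mozilla/5.0 (X11; Linux x86_64)"

# Same recursive PDF download, but identifying as a browser instead of Wget.
# Wrapped in a function so nothing is fetched until you call it.
fetch_pdfs_as_browser() {
    wget -r -A pdf --user-agent="$ua" https://labraj.uni-mb.si/
}
```

Call `fetch_pdfs_as_browser` to run it. Whether the server treats the request differently depends on how it is configured.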