wget recursive folder/file download only returns multiple index.html files

2024-11-15

wget --no-parent --recursive --level=4 https://www.example.com/jsons/

I'm using the command above to try to download every file in the subfolders of that URL.

The folder structure looks like this:

https://www.example.com/jsons/XX/(multiple files)

where 'XX' stands for various two-character folder names.

It creates all of the folders, but only downloads an index.html file into each one. None of the other files (.json in this case) are downloaded.

If I point it at one specific 'XX' folder, it downloads all of that folder's files as expected, including the .json files:

wget --no-parent --recursive --level=4 https://www.example.com/jsons/BS/
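
Since the per-folder command does work, one possible stopgap is to loop it over the known folder names. The following is only a minimal sketch under stated assumptions: it assumes a POSIX shell (for example Git Bash or WSL, not the Windows cmd window mentioned in this post), and the folder list is hypothetical apart from BS and BL, the only names that appear here. It prints the commands as a dry run and does not call wget itself.

```shell
#!/bin/sh
# Sketch only: prints one wget command per folder (a dry run).
# BASE_URL is the example URL from the post. The folder list is an
# assumption: BS and BL are the only folder names that appear in the post.
BASE_URL="https://www.example.com/jsons"

# Build the per-folder command that is reported to work.
build_cmd() {
    printf 'wget --no-parent --recursive --level=4 %s/%s/\n' "$BASE_URL" "$1"
}

for dir in BS BL; do
    build_cmd "$dir"
done
```

To actually download, the printed lines could be piped to sh after reviewing them. This works around the symptom only; it does not explain why recursion from the parent folder stops at index.html.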

I'm using wget 1.21.4 and running it in a Windows cmd window.

Here is a snippet of wget's verbose output for a single folder:

--2023-07-05 20:23:52-- https://www.example.com/jsons/BL/
Reusing existing connection to www.example.com:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: 'www.example.com/jsons/BL/index.html'

www.example.com/jsons [ <=> ] 8.95K --.-KB/s in 0.001s

Any ideas?
