Downloading files from a website (game mods)

Question

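https://www.transportfever.net/filebase/index.php?filebase/80-transport-fever-2/ provides links to the latest files. You can download the site's html document with curl, pipe the output to grep to extract the download link (done in a simple manner below), and use a command substitution to pass this link to a second curl command.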

curl -OJ \
    $(curl -fs \
    'https://www.transportfever.net/filebase/index.php?filebase/80-transport-fever-2/' | \
    grep -om1 '[^"]*entry-download/[^"]*')

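Hopefully this gives you something to build on.

Options used for grep:

-o / --only-matching: output only the matched text, rather than the whole line containing it
-m 1 / --max-count=1: stop reading the input after the first line that contains a match
the pattern to match, [^"]*entry-download/[^"]*: the download links all appear to be given as href="https://www.transportfever.net/filebase/index.php?entry-download/<number><...>", so the pattern above seems sufficient: zero or more characters other than a double quote ("), followed by entry-download/, followed by zero or more characters other than a double quote (")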
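To see what the pattern extracts without touching the network, it can be run against a sample line. This is only a sketch: the html fragment below is invented for illustration (the entry number and mod name are hypothetical), and only the shape of the href follows the description above.

```shell
# hypothetical html fragment; the entry number and name are invented
html='<a class="button" href="https://www.transportfever.net/filebase/index.php?entry-download/1234-example-mod/">Download</a>'

# -o prints only the matched text, -m1 stops after the first matching line
link=$(printf '%s\n' "$html" | grep -om1 '[^"]*entry-download/[^"]*')
printf '%s\n' "$link"
```

Because neither side of the pattern can cross a double quote, the match is exactly the contents of the href attribute.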

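Options used for curl (first pass, inside the substitution):

-f / --fail: output nothing if we receive a 4xx/5xx http response; the request failed, and we don't want grep reading an html document that only tells us so
-s / --silent: this is the first pass, so we don't want to see a progress bar or anything else

Options for the second curl pass. These download links use a content-disposition header to tell us the filename, so:

-O / --remote-name: save the file under the same name as the remote file
-J / --remote-header-name: allow -O to use the server-specified Content-Disposition filename instead of extracting a filename from the url

There are actually multiple entry-download/ links. To download all of them, we can remove -m1 from grep and adjust the second curl to use --remote-name-all, like so: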

curl --remote-name-all -J \
    $(curl -fs \
    'https://www.transportfever.net/filebase/index.php?filebase/80-transport-fever-2/' | \
    grep -o '[^"]*entry-download/[^"]*')

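Checking for file conflicts:

If we want to know the filename described by the content-disposition header ahead of time, an extra step is needed. We can use curl to send a head request: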

# get first url from the page, storing it to
# the parameter 'url' so we can use it again later
url=$(curl -fs \
    'https://www.transportfever.net/filebase/index.php?filebase/80-transport-fever-2/' | \
    grep -om1 '[^" ]*entry-download/[^" ]*')

# head request to determine filename
filename=$(curl -Is "$url" | grep -iom1 '^content-disposition:.*filename="[^"]*' | grep -o '[^"]*$')

# 'if' statement using the 'test' / '[' command as the condition
if test -e "$filename"; then
    echo "$filename exists!"
else
    # a file named $filename doesn't exist,
    # so we'll download it
    curl -o "$filename" "$url"
fi
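The filename extraction can be tried offline against canned headers. This is a sketch under assumptions: the headers below are made up (the file name is hypothetical). They use the CR LF line endings that http/1.1 headers have on the wire, and because the pattern stops before the closing quote, the trailing CR never ends up in the filename.

```shell
# hypothetical response headers, with CR LF line endings
headers=$(printf 'HTTP/1.1 200 OK\r\nContent-Type: application/zip\r\nContent-Disposition: attachment; filename="example_mod_1.zip"\r\n\r\n')

# first grep keeps everything up to (not including) the closing quote,
# second grep keeps the trailing run of non-quote characters
filename=$(printf '%s\n' "$headers" |
    grep -iom1 '^content-disposition:.*filename="[^"]*' |
    grep -o '[^"]*$')
printf '%s\n' "$filename"
```

If a server sends the filename unquoted, this pattern matches nothing and filename ends up empty, so it is worth testing for an empty value before using it.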

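This is a simple example that checks for a conflicting file before attempting the download.
It isn't really necessary, since curl -J won't overwrite an existing file, but I suspect you'll want to check for the existence of "$filename" (perhaps without the .zip: "${filename%.zip}") in some other directory, or in some text file.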
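A minimal sketch of such a check, assuming a layout that is not part of the original answer: a hypothetical mods directory and a hypothetical installed.txt list, both created in a temporary directory so the example is self-contained.

```shell
filename='example_mod_1.zip'      # hypothetical filename
name=${filename%.zip}             # strip a trailing .zip, if present

workdir=$(mktemp -d)              # stand-in for a real mods location
mkdir "$workdir/mods"
printf '%s\n' 'example_mod_1' 'another_mod_2' > "$workdir/installed.txt"

# -F: fixed string, -x: the whole line must match, -q: no output
if [ -e "$workdir/mods/$name" ] || grep -Fxq -- "$name" "$workdir/installed.txt"; then
    status=present
else
    status=missing
fi
printf '%s: %s\n' "$name" "$status"
rm -r "$workdir"
```

Here the name is found in the text file, so the download would be skipped.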

Building on the above, if you want to do this for all of the extracted entry-download/ urls:

# extract all urls, placing them in an array parameter 'urls'
urls=( $(curl -fs \
    'https://www.transportfever.net/filebase/index.php?filebase/80-transport-fever-2/' | \
    grep -o '[^" ]*entry-download/[^" ]*') )

# loop over extracted urls
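for i in "${urls[@]}"; do
    # do filename extraction for "$i" (same head request as above)
    filename=$(curl -Is "$i" | grep -iom1 '^content-disposition:.*filename="[^"]*' | grep -o '[^"]*$')
    # use filename to determine if you want to download "$i";
    # here: skip urls with no filename or with an existing file
    if [ -n "$filename" ] && [ ! -e "$filename" ]; then
        curl -o "$filename" "$i"
    fi
done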
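One caveat with the array assignment above: the unquoted command substitution is word-split and then subject to filename globbing, and these urls contain a ? character. A sketch of an alternative that reads one url per line, assuming the list is already in a variable; the two urls are hypothetical stand-ins for the grep output.

```shell
# hypothetical extracted urls, one per line
extracted='https://www.transportfever.net/filebase/index.php?entry-download/1234-example-mod/
https://www.transportfever.net/filebase/index.php?entry-download/5678-another-mod/'

count=0
# the here-document keeps the loop in the current shell,
# so count is still set after the loop
while IFS= read -r url; do
    count=$((count + 1))
    printf 'would process: %s\n' "$url"
done <<EOF
$extracted
EOF
printf '%s urls\n' "$count"
```

The loop body is where the filename extraction and the conflict check would go.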
