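I want my machine to download some files automatically. It does not need to be very efficient, so I decided to use a bash script for this.
So far it has worked with almost no problems when I hardcode the URL. But I want to retrieve the files in an irregular order, and I thought I would use simple variables. How do I put a random number into my variable name?
My approach: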
data_link0="https://example.com/target1.html"
data_link1="https://example.com/target2.html"
data_link2="https://example.com/target3.html"
data_link3="https://example.com/target4.html"
useragent0="Mozilla/5.0 (iPhone; CPU iPhone OS 10_0_1 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/14A403 Safari/602.1"
useragent1="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/604.5.6 (KHTML, like Gecko) Version/11.0.3 Safari/604.5.6"
useragent3="Mozilla/5.0 (Windows 7; ) Gecko/geckotrail Firefox/firefoxversion"
wget --user-agent="$user_agent[$((RANDOM % 3))]" "$datei_link$((RANDOM % 3))"
Unfortunately, this does not work.
Answer 1
Since you need to retrieve all of the URLs, a better way is to use shuf (GNU coreutils), or sort -R (also coreutils):
shuf file | xargs wget
The file:
$ cat file
"https://example.com/target1.html"
"https://example.com/target2.html"
"https://example.com/target3.html"
"https://example.com/target4.html"
man 1 shuf
NAME
shuf - generate random permutations
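As a quick check of what shuf does, the sketch below shuffles a few lines and confirms that the output holds the same lines, only reordered. The sample values and the variable names are made up for illustration, and GNU coreutils shuf is assumed to be installed.

```shell
# Hypothetical sample input: four lines, one item per line.
input=$(printf '%s\n' target1 target2 target3 target4)

# shuf reads stdin and prints the same lines in random order.
shuffled=$(printf '%s\n' "$input" | shuf)
printf '%s\n' "$shuffled"

# Sorting both sides shows that nothing was added or lost.
sorted_in=$(printf '%s\n' "$input" | sort)
sorted_out=$(printf '%s\n' "$shuffled" | sort)
```

Running it several times should print the four lines in varying orders.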
New comment, new requirement, new code:
(a random user agent is needed)
$ cat uas
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.52 Safari/537.36
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.52 Safari/537.36 OPR/15.0.1147.100
Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; AS; rv:11.0) like Gecko
Code:
shuf file | while IFS= read -r url; do
    url=${url//\"/}   # read keeps the literal quotes stored in the file; strip them
    wget --user-agent="$(shuf -n1 uas)" "$url"
done
If you want to keep your own approach (one URL):
data_link=(
"https://example.com/target1.html"
"https://example.com/target2.html"
"https://example.com/target3.html"
"https://example.com/target4.html"
)
user_agent=(
"Mozilla/5.0 (iPhone; CPU iPhone OS 10_0_1 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/14A403 Safari/602.1"
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/604.5.6 (KHTML, like Gecko) Version/11.0.3 Safari/604.5.6"
"Mozilla/5.0 (Windows 7; ) Gecko/geckotrail Firefox/firefoxversion"
)
wget --user-agent="${user_agent[RANDOM % ${#user_agent[@]} ]}" "${data_link[RANDOM % ${#data_link[@]}]}"
Your approach for all URLs, fetched in random order, each with a random user agent:
for i in $(seq 0 $((${#data_link[@]} -1)) | shuf); do
wget -U "${user_agent[RANDOM % ${#user_agent[@]}]}" "${data_link[i]}"
done
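A possible variant of that loop skips the index arithmetic: shuf -e shuffles its arguments directly, so the array elements can be passed in as they are. This is only a dry-run sketch that prints the wget commands instead of running them. The `planned` array is a hypothetical name introduced for illustration, the two user agents are taken from the uas file above, and GNU shuf is assumed.

```shell
#!/usr/bin/env bash
data_link=(
  "https://example.com/target1.html"
  "https://example.com/target2.html"
  "https://example.com/target3.html"
  "https://example.com/target4.html"
)
user_agent=(
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.52 Safari/537.36"
  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; AS; rv:11.0) like Gecko"
)

planned=()   # hypothetical: collects the commands instead of running them
while IFS= read -r url; do
  # Pick a random user agent for each URL.
  ua=${user_agent[RANDOM % ${#user_agent[@]}]}
  planned+=("wget -U '$ua' '$url'")
done < <(shuf -e "${data_link[@]}")   # one shuffled URL per line

printf '%s\n' "${planned[@]}"
```

Once the printed commands look right, the `planned+=` line can be replaced with the real wget call.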
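Answer 2
Instead of defining a separate variable for each string, define an array. Use ${ar[123]} to access element 123 of the array ar, and ${#ar[@]} to get the size of the array.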
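Both expansions can be tried on a small array first. The array `ar` and its contents are hypothetical sample data, and the loop is just a sanity check that a random number taken modulo the array size is always a valid index.

```shell
#!/usr/bin/env bash
ar=(alpha beta gamma)   # hypothetical sample array

echo "${ar[1]}"      # prints: beta (indices start at 0)
echo "${#ar[@]}"     # prints: 3

# RANDOM % size is always in 0..size-1, so it never points past the end.
in_range=1
for _ in {1..1000}; do
  idx=$((RANDOM % ${#ar[@]}))
  if (( idx < 0 || idx >= ${#ar[@]} )); then
    in_range=0
  fi
done
echo "all indices valid: $in_range"
```

Using the array size rather than a fixed number also avoids the issue in the question, where four URLs are defined but RANDOM % 3 can only yield 0 to 2.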
data_link=(
"https://example.com/target1.html"
"https://example.com/target2.html"
"https://example.com/target3.html"
"https://example.com/target4.html"
)
user_agent=(
"Mozilla/5.0 (iPhone; CPU iPhone OS 10_0_1 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/14A403 Safari/602.1"
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/604.5.6 (KHTML, like Gecko) Version/11.0.3 Safari/604.5.6"
"Mozilla/5.0 (Windows 7; ) Gecko/geckotrail Firefox/firefoxversion"
)
wget --user-agent="${user_agent[RANDOM % ${#user_agent[@]}]}" "${data_link[RANDOM % ${#data_link[@]}]}"
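If the numbered variables from the question have to stay, bash can also build a variable name at run time and dereference it with indirect expansion, ${!name}. Neither answer uses this, so treat it as a sketch of an alternative; the names `name` and `picked` are made up for illustration, and an array is usually the cleaner choice.

```shell
#!/usr/bin/env bash
data_link0="https://example.com/target1.html"
data_link1="https://example.com/target2.html"
data_link2="https://example.com/target3.html"
data_link3="https://example.com/target4.html"

# Build the variable name first, then expand it indirectly.
name="data_link$((RANDOM % 4))"   # one of data_link0 .. data_link3
picked=${!name}                   # value of the variable whose name is in $name

echo "$name -> $picked"
```

The same pattern would work for the user agent variables, as long as their numbering has no gaps.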