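What I'm trying to do is fetch a website's HTML and write it to a file. After that, I want to loop, fetch it again, and write it to another file. Once the second file is written, I want to compare the two files to see whether anything has changed. This is what I have so far, but it doesn't work: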
#!/bin/bash
echo "Hopefully this will do everything at once!"
while true;
do
wget -q -O - http://website.com > websitebaseline.txt
if -e websitebaseline.txt
then
wget -q -O - http://www.website.com > websitechange.txt
echo "Update to websitechange.txt has been made"
if !-e websitebaseline.txt
then
wget -q -O - http://www.website.com > webbaseline.txt
echo "Baseline has been created"
if -e websitebaseline.txt websitechange.txt
then diff -y websitebaseline.txt websitechange.txt --supress-common-lines > Changeinsite.txt
if !-e websitebaseline.txt
then
wget -q -O - http://www.website.com > websitebaseline.txt echo "Baseline has been created"
elif !-e websitechange.txt
then
wget -q- O - http://websitename.com > websitenamechange.txt
echo "Update has been made"
sleep 100;
done
Answer 1
You're overcomplicating things.
#!/bin/bash
left=$(mktemp)
right=$(mktemp)
url="http://url.example.com/"
trap 'rm -f "$left" "$right"' EXIT
for file in "$left" "$right"; do
wget -q -O "$file" "$url"
done
if ! diff "$left" "$right" > /dev/null 2>&1; then
echo "Changes detected in successive retrievals of '$url'."
fi
A similar mechanism can be used to log changes progressively over time:
left=$(mktemp)
right=$(mktemp)
url="http://url.example.com/"
trap 'rm -f "$left" "$right"' EXIT
# Establish the "baseline":
wget -q -O "$left" "$url"
# Okay, now check for updates forever:
while sleep 30; do
wget -q -O "$right" "$url"
if ! diff "$left" "$right" > /dev/null 2>&1; then
echo "$(date) - Changes detected in '$url'."
cp "$right" "$left"
fi
done
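The original question also wanted the differences themselves saved to a file. Here is a minimal sketch of that step, assuming GNU diff is available and that both files exist. The helper name `record_changes` and the log format are hypothetical, not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper: append a side-by-side diff of two files to a log
# when they differ. Assumes both files exist.
# Returns 0 if a change was recorded, 1 if the files are identical.
record_changes() {
    local old=$1 new=$2 log=$3
    if cmp -s "$old" "$new"; then
        return 1    # identical, nothing to record
    fi
    {
        echo "$(date) - change detected"
        # diff exits with status 1 when the files differ, which is expected here
        diff -y --suppress-common-lines "$old" "$new" || true
    } >> "$log"
    return 0
}
```

Inside the loop above, this could be called as `record_changes "$left" "$right" Changeinsite.txt` before copying "$right" over "$left", so each detected change is appended to the log with a timestamp.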
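Answer 2
The simplest way to detect a difference between two files, however small the change, is to compare their checksums. To demonstrate, I just use the "md5sum" command to generate an MD5 hash of each request.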
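#!/bin/bash
# First request: download the page and record its checksum.
wget -q website.com -O site.txt
baseline=$(md5sum site.txt)
echo "first request checksum: $baseline"
rm site.txt
# Second request: download the page again and record the new checksum.
wget -q website.com -O site.txt
change=$(md5sum site.txt)
echo "second request checksum: $change"
rm site.txt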
The script prints the MD5 hash of each request, so you can easily see whether the hashes are the same or different.
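Rather than comparing the hashes by eye, the comparison can be done in the script. This is a rough sketch, assuming the two downloads are kept in separate files. The helper names `checksum_of` and `content_changed` are made up for illustration.

```shell
#!/bin/bash
# Print only the MD5 hash of a file (md5sum also prints the file name).
checksum_of() {
    md5sum "$1" | cut -d ' ' -f 1
}

# Hypothetical helper: succeeds (exit status 0) when the two files
# have different checksums, fails when the checksums match.
content_changed() {
    [ "$(checksum_of "$1")" != "$(checksum_of "$2")" ]
}
```

For example, after saving the two requests as first.txt and second.txt (hypothetical file names), `if content_changed first.txt second.txt; then echo "The site changed"; fi` reports a change only when the hashes differ.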