Can I calculate the total size of a publicly accessible website?

Answer 1

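Based on a similar Q&A, "Get the size of a file before downloading it with wget?", I wrote a bash wrapper script that does exactly what you need. :)

The latest version of the code is available on GitHub: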

https://github.com/mariomaric/website-size

#!/bin/bash
# Info: https://github.com/mariomaric/website-size#readme

# Prepare wget logfile
log=/tmp/wget-website-size-log

# Do the spider magic
echo "### Crawling ${!#} website... ###"
sleep 2s
echo "### This will take some time to finish, please wait. ###"

wget \
  --recursive --level=inf \
  --spider --server-response \
  --no-directories \
  --output-file="$log" "$@"

echo "Finished with crawling!"
sleep 1s

# Check if prepared logfile is used
if [ -f "$log" ]; then
    # Calculate and print estimated website size
    # (header names are case-insensitive, so match with -i)
    echo "Estimated size: $(\
        grep -i "Content-Length:" "$log" | \
        awk '{sum+=$2} END {printf("%.0f", sum / 1024 / 1024)}'\
    ) MiB"

    # Delete wget log file
    rm "$log"
else
    echo "Unable to calculate estimated size."
fi

exit

This answer was also a great help: "Shell command to sum integers, one per line?"
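To see the summing step on its own, here is a minimal sketch that runs without any network access. The helper name `sum_content_length` and the sample log are made up for illustration; a real wget log will contain many more lines, and the total is only an estimate of what the server reports.

```shell
#!/bin/bash
# sum_content_length is a hypothetical helper, not part of the original script.
# It adds up every Content-Length header found in a wget --server-response log
# and prints the total in bytes.
sum_content_length() {
    # -i: HTTP header names are case-insensitive.
    # awk sums the second field (the value that follows "Content-Length:").
    grep -i 'Content-Length:' "$1" | awk '{sum += $2} END {printf "%.0f\n", sum}'
}

# Made-up sample log, for illustration only (not real wget output).
sample=$(mktemp)
cat > "$sample" <<'EOF'
  HTTP/1.1 200 OK
  Content-Type: text/html
  Content-Length: 1024
  HTTP/1.1 200 OK
  content-length: 2048
EOF

bytes=$(sum_content_length "$sample")
echo "Total: $bytes bytes"
rm "$sample"
```

With the sample above this prints `Total: 3072 bytes`. Dividing by 1024 twice, as the script does, turns the byte count into MiB.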
