How can I easily update an md5sum list?

Answer 1

If this is going to be an ongoing process, you will need two files: the old list and the new one (which becomes the old one next time).

#!/bin/sh
# change to the directory given as the first argument, or stay in the current one
cd "${1:-.}" || exit 1  # exit if the directory cannot be entered
# get the md5 values for all files in the tree, skipping the two state files
find . -type f ! -name .md5sum.last ! -name .md5sum.tmp -exec md5sum {} + | LC_ALL=C sort > .md5sum.tmp
# if called before, print only the lines that are new in the current list
if [ -f .md5sum.last ]; then
    LC_ALL=C comm -13 .md5sum.last .md5sum.tmp
else  # otherwise show all the output
    cat .md5sum.tmp
fi
# replace the older list with the current one for next time
mv .md5sum.tmp .md5sum.last

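The sort and the comm -13 are the key. The sort is obvious, but comm (short for "common") shows the lines found only in the first file (column 1), only in the second file (column 2), or in both files (column 3). The -13 option means "suppress columns 1 and 3", which leaves only the lines that are in the new list but not in the old one: files that were added or whose checksum changed. Unfortunately, if you cannot trust the timestamps on the files, every file has to be hashed again on each run, so this will be a very intensive process for large directory trees.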
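
To make the column behaviour concrete, here is a small sketch with made-up data. The two lists, their fake hashes, and the variable names are assumptions for illustration only; both lists are already sorted, which comm requires.

```shell
#!/bin/sh
# Hypothetical sample data: two tiny, already sorted "checksum lists".
demo=$(mktemp -d) || exit 1
printf '%s\n' 'aaa  ./kept.txt' 'bbb  ./edited.txt' 'ccc  ./removed.txt' > "$demo/old"
printf '%s\n' 'aaa  ./kept.txt' 'ddd  ./added.txt' 'eee  ./edited.txt' > "$demo/new"

# Keep column 2 only: lines unique to the new list (added or changed files).
only_new=$(LC_ALL=C comm -13 "$demo/old" "$demo/new")
# Keep column 1 only: lines unique to the old list (removed files, or old hashes).
only_old=$(LC_ALL=C comm -23 "$demo/old" "$demo/new")

printf 'new or changed:\n%s\n' "$only_new"
printf 'gone or superseded:\n%s\n' "$only_old"
rm -rf "$demo"
```

The unchanged kept.txt line appears in neither result, while edited.txt shows up in both: once with its new hash and once with its old one.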


Answer 2

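I think the simplest approach is to store the checksum of a file _my_file_ in a file _my_file_.md5, instead of keeping all checksums in a single file. That way it is much easier to tell whether a given checksum has already been computed.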
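
A minimal sketch of that per-file layout might look like the following. The function name update_sidecars is hypothetical, and the sketch assumes file names without newlines and that a file's existing .md5 is still valid, so modified files are not detected.

```shell
#!/bin/sh
# update_sidecars DIR: write FILE.md5 next to every FILE that lacks one.
# Hypothetical helper, not part of md5sum itself.
update_sidecars() {
    find "$1" -type f ! -name '*.md5' | while IFS= read -r file; do
        # Skip files whose checksum has already been computed.
        [ -f "$file.md5" ] && continue
        md5sum "$file" > "$file.md5"
    done
}
```

Running update_sidecars _your_drive_path_ a second time only hashes files that appeared since the first run.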

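However, if you only ever add files to the flash drive (never modify them, perhaps delete them, but never add a file that was there before), you can do this: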

find _your_drive_path_ -type f |
  while IFS= read -r file; do
    # -F matches the path literally rather than as a regex; quoting survives spaces
    grep -qF -- "$file" _your_md5_file_ || md5sum "$file" >> _your_md5_file_
  done

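This greps your checksum file once per file, which is a lot. It could be optimized by sorting the list of files and keeping the checksum file sorted by file name, but if you don't need that optimization, why bother with the complexity...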
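
For reference, one possible shape of that optimization is sketched below. It swaps the per-file grep for a single comm pass over two sorted path lists, rather than keeping the checksum file itself sorted. The function name add_missing is hypothetical, and the sketch assumes md5sum's usual line layout (32 hex digits, two separator characters, then the path), file names without newlines or backslashes, and a checksum file stored outside the scanned directory.

```shell
#!/bin/sh
# add_missing DIR LIST: append checksums for files under DIR not yet named in LIST.
# Hypothetical helper; see the assumptions stated above.
add_missing() {
    dir=$1 list=$2
    tmp=$(mktemp -d) || return 1
    touch "$list"
    # Paths already recorded: drop the 34-character hash prefix.
    cut -c35- "$list" | LC_ALL=C sort > "$tmp/known"
    # Paths currently on the drive.
    find "$dir" -type f | LC_ALL=C sort > "$tmp/present"
    # One pass over both sorted lists instead of one grep per file.
    LC_ALL=C comm -13 "$tmp/known" "$tmp/present" |
        while IFS= read -r file; do
            md5sum "$file"
        done >> "$list"
    rm -rf "$tmp"
}
```

Comparing whole paths this way also avoids treating a path as present merely because it is a substring of another recorded path.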


Answer 3

If you can't trust the timestamps, there is really no way to process only the files that changed. Just rerun the original find command.

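I would save the new MD5SUM file to a temporary location, then diff the old and new files to see what changed before copying the updated file to the flash drive. You will probably need to sort both files to get a useful diff.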
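
As a rough sketch of that comparison, the helper below sorts both lists by path before diffing them. The name compare_lists is hypothetical, and sorting on the path column is one choice among several; sorting whole lines would also work as long as both files are sorted the same way.

```shell
#!/bin/sh
# compare_lists OLD NEW: show what changed between two md5sum lists.
# Hypothetical helper; sorts by path (field 2 onward) so matching files line up.
compare_lists() {
    tmp=$(mktemp -d) || return 1
    LC_ALL=C sort -k2 "$1" > "$tmp/old"
    LC_ALL=C sort -k2 "$2" > "$tmp/new"
    diff "$tmp/old" "$tmp/new"
    status=$?          # 0: identical, 1: differences found
    rm -rf "$tmp"
    return "$status"
}
```

Lines starting with < come from the old list and lines starting with > from the new one, so a changed file shows up as a pair with the same path and different hashes.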

