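I want to implement a cleanup function for roughly 20TB on my NAS using rsync on Linux, by excluding entire directories, and the contents of directories, that contain a ".protect" file.
I generate very large caches in subfolders, for example: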
cache/simulation_v001/reallybigfiles_*.bgeo
cache/simulation_v002/reallybigfiles_*.bgeo
cache/simulation_v003/reallybigfiles_*.bgeo
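If a file such as cache/simulation_v002/.protect exists,
then I would like to construct an rsync operation that moves all folders to a temporary/recycle location, excluding cache/simulation_v002/ and all of its contents.
I have done something like this before with Python, but I am curious whether the operation could be simplified with rsync or some other method.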
Answer 1
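Thanks to the hints from cas, I was able to create this workflow to solve the problem with a bash script. It is not ideal, since it would be better if it performed a true move for faster operation (I wish rsync had this ability). The script uses find to search below the current folder for .protect files, builds an exclude list, then runs rsync from the base volume to move all other folders into a trash folder, preserving the full path below it so that any mistake can be restored non-destructively.
Link to the current state, if this solution is in the git dev branch: https://github.com/firehawkvfx/openfirehawk-houdini-tools/blob/dev/scripts/modules/trashcan.sh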
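To make the exclude-list step concrete, here is a minimal sketch that can be tried on a throwaway tree before touching real data. It follows the same find-then-dirname idea as the script but is not a copy of it; the temp directory and the three simulation folders are made up for illustration.

```shell
#!/bin/sh
# Sketch only: build a toy cache tree in a temp directory and list the
# directories that contain a .protect marker. All paths here are hypothetical.
set -e

demo_root=$(mktemp -d)
mkdir -p "$demo_root/cache/simulation_v001" \
         "$demo_root/cache/simulation_v002" \
         "$demo_root/cache/simulation_v003"
touch "$demo_root/cache/simulation_v002/.protect"

cd "$demo_root/cache"

# For every .protect file, print its parent directory relative to the
# current folder (sed strips the leading "./" that find emits).
find . -name .protect -exec dirname {} \; | sed 's|^\./||' > exclude_list.txt

cat exclude_list.txt
```

On this toy tree the list contains the single entry simulation_v002, which is the kind of relative path that gets handed to rsync via --exclude-from.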
#!/bin/bash
# trash everything below the current path that does not have a .protect file in
# the folder. it should normally only be run from the folder such as
# 'job/seq/shot/cache' to trash all data below this path.
# see opmenu and firehawk_submit.py for tools to add protect files based on
# a top net tree for any given hip file.
argument="$1"
echo ""
if [[ -z "$argument" ]]; then
    echo "DRY RUN. To move files to trash, use argument -m after reviewing the exclude_list.txt and you are sure it lists everything you wish to protect from being moved to the trash."
    echo ""
    ARGS1='--remove-source-files'
    ARGS2='--dry-run'
else
    case "$argument" in
        -m|--move)
            echo "MOVING FILES TO TRASH."
            echo ""
            ARGS1='--remove-source-files'
            ARGS2=''
            ;;
        *)
            # raise_error is not defined in this listing, so print to stderr.
            # return succeeds when the script is sourced, exit when it is executed.
            echo "Unknown argument: ${argument}" >&2
            return 1 2>/dev/null || exit 1
            ;;
    esac
fi
current_dir=$(pwd)
echo "current dir $current_dir"
# first path component, e.g. /prod when run from /prod/job/seq/shot/cache
base_dir=$(pwd | cut -d/ -f1-2)
echo "base_dir $base_dir"
source=$(realpath --relative-to="$base_dir" "$current_dir")/
echo "source $source"
target=trash/
echo "target $target"
# ensure trash exists at base dir.
mkdir -p "$base_dir/$target"
echo ""
echo "Build exclude_list.txt contents with directories containing .protect files"
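find . -name .protect -print0 |
    while IFS= read -r -d '' line; do
        path=$(realpath --relative-to=. "$line")
        # quoted so that folder names containing spaces survive
        dirname "$path"
    done > exclude_list.txt
# also exclude the list itself so it is not moved to the trash
path_to_list=$(realpath --relative-to=. exclude_list.txt)
echo "$path_to_list" >> exclude_list.txt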
cat exclude_list.txt
cd "$base_dir"
# run this command from the drive root, eg /prod.
# $ARGS1 and $ARGS2 stay unquoted on purpose so an empty value expands to no argument.
rsync -a $ARGS1 --prune-empty-dirs --inplace --relative --exclude-from="$current_dir/exclude_list.txt" --include='*' --include='*/' "$source" "$target" $ARGS2 -v
cd "$current_dir"
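On the wish for a faster, true move: the following is a rough sketch of an alternative that uses mv instead of rsync, not part of the script above. It assumes the trash folder lives on the same filesystem as the cache, in which case mv is a rename and no file data is copied. It also assumes protection only needs to be decided per first-level subfolder (as in the simulation_v00N example from the question), which is coarser than the exclude list built by the script. The temp directory, the job/seq/shot/cache layout and the file names are made up for the demo.

```shell
#!/bin/sh
# Sketch only, not the author's script: move unprotected first-level
# subfolders into a trash folder with mv. Demo paths are hypothetical.
set -e

base_dir=$(mktemp -d)                  # stands in for the volume root, e.g. /prod
cache_dir="$base_dir/job/seq/shot/cache"
trash_dir="$base_dir/trash"

# Toy data: three simulation folders, one of them protected.
for v in v001 v002 v003; do
    mkdir -p "$cache_dir/simulation_$v"
    touch "$cache_dir/simulation_$v/reallybigfiles_0001.bgeo"
done
touch "$cache_dir/simulation_v002/.protect"

cd "$cache_dir"
# Path of the cache folder relative to the base, so the trash mirrors it.
rel_path=${cache_dir#"$base_dir"/}
mkdir -p "$trash_dir/$rel_path"

for dir in */; do
    dir=${dir%/}
    [ -d "$dir" ] || continue          # the glob matched nothing
    # Keep the folder if a .protect file exists anywhere beneath it.
    if find "$dir" -name .protect | grep -q .; then
        echo "keeping $dir"
        continue
    fi
    echo "trashing $dir"
    mv -- "$dir" "$trash_dir/$rel_path/"
done
```

Because the trash mirrors the original path, restoring a folder is a matter of moving it back, for example mv "$trash_dir/$rel_path/simulation_v001" "$cache_dir/". If the trash were on a different filesystem, mv would fall back to copy and delete, and the speed advantage over rsync would largely disappear.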