Delete the oldest files when a directory contains more than 7 files?

Answer 1

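Introduction

Let's review the problem: the task is to check whether the number of files in a specific directory exceeds a certain limit and, if so, to delete the oldest file among them. At first it might seem that we need to traverse the directory tree once to count the files, then traverse it again to find the last modification time of every file, sort them, and extract the oldest one for deletion. But since in this particular case the OP wants a file deleted if and only if the number of files is above 7, we can simply obtain the list of all files with their timestamps once and store it in a variable.

The catch with this approach is the danger posed by filenames. As mentioned in the comments, parsing the output of ls is never recommended, because the output may contain special characters that break a script. However, as some of you may know, on Unix-like systems (Ubuntu included) every file has an inode number associated with it. So building a list of newline-separated entries, each consisting of a timestamp (in seconds, for easy numeric sorting) plus an inode number, means we never have to parse filenames at all. The oldest file can be deleted by its inode number as well.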
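To make the idea concrete before the full script, here is a minimal sketch of the two building blocks, assuming GNU find (for -printf and -inum). The helper names oldest_inode and name_for_inode are hypothetical and do not appear in the script itself.

```shell
#!/bin/bash
# Sketch only; assumes GNU find. Helper names are hypothetical.

oldest_inode() {
    # Print the inode number of the oldest regular file directly inside $1.
    # Each line is "mtime-in-seconds<TAB>inode", so no filename is ever parsed.
    find "$1" -maxdepth 1 -type f -printf '%Ts\t%i\n' | sort -n | head -n 1 | cut -f 2
}

name_for_inode() {
    # Resolve inode $2 back to a file name inside directory $1.
    find "$1" -maxdepth 1 -type f -inum "$2" -printf '%f\n'
}

# Example (nothing is deleted):
#   inode=$(oldest_inode ~/bin/testdir)
#   name_for_inode ~/bin/testdir "$inode"
```

Because only numbers travel through sort, head, and cut, spaces or other odd characters in a filename cannot break the pipeline.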

The script provided below does exactly what is described above.

Script

Important: please read the comments, especially in the delete_oldest function.

#!/bin/bash
# Uncomment line below for debugging
#set -xv
delete_oldest(){
     # reads a line from stdin, extracts file inode number
     # and deletes file to which inode belongs
     # !!! VERY IMPORTANT !!!
     # The actual command to delete file is commented out.
     # Once you verify correct execution, feel free to remove
     # leading # to uncomment it
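     read -r timestamp file_inode
     # -maxdepth 1 keeps the lookup on the same directory level as get_files,
     # so a hard link with the same inode in a subdirectory is not matched
     find "$directory" -maxdepth 1 -type f -inum "$file_inode"  -printf "Deleted %f\n" 
     # find "$directory" -maxdepth 1 -type f -inum "$file_inode"  -printf "Deleted %f\n"  -delete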
}

get_files(){
    # Wrapper function around get files. Ensures we're working
    # with files and only on one specific level of directory tree
    find "$directory" -maxdepth 1 -type f  -printf "%Ts\t%i\n" 
}

filecount_above_limit(){
    # This function counts number of files obtained
    # by get_files function. Returns true if file
    # count is greater than what user specified as max
    # value 
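    # grep -c . counts non-empty lines, so an empty listing yields 0
    # (wc -l on a here-string reports 1 even when the variable is empty)
    num_files=$(grep -c . <<< "$file_inodes"  )
    if [ "$num_files" -gt "$max_files" ];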
    then
        return 0
    else
        return 1
    fi
}

exit_error(){
    # Print error string and quit
    printf ">>> Error: %s\n"  "$1" > /dev/stderr 
    exit 1
}

main(){
    # Entry point of the program. 
    local directory=$2
    local max_files=$1

    # If directory is not given
    if [ "x$directory" == "x"  ]; then
        directory="."
    fi

    # check arguments for errors
    [ $# -lt 1  ] && exit_error "Must at least have max number of files"
    printf "%d" "$max_files" &>/dev/null || exit_error "Argument 1 not numeric"
    readlink -e "$directory" || exit_error "Argument 2, path doesn't exist"

    # This is where actual work is being done
    # We traverse directory once, store files into variable.
    # If number of lines (representing file count) in that variable
    # is above max value, we sort numerically the inodes and pass them
    # to delete_oldest, which removes topmost entry from the sorted list
    # of lines.
    local file_inodes=$(get_files)
    if filecount_above_limit 
    then
        printf  "@@@ File count in %s is above %d." "$directory" $max_files
        printf "Will delete oldest\n"
        sort -k1 -n <<< "$file_inodes" | delete_oldest
    else
        printf "@@@ File count in %s is below %d."  "$directory" $max_files
        printf "Exiting normally\n"
    fi
}

main "$@"

Usage example

$ ./delete_oldest.sh 7 ~/bin/testdir                                                                                     
/home/xieerqi/bin/testdir
@@@ File count in /home/xieerqi/bin/testdir is below 7.Exiting normally
$ ./delete_oldest.sh 7 ~/bin                                                                                             
/home/xieerqi/bin
@@@ File count in /home/xieerqi/bin is above 7.Will delete oldest
Deleted typescript

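Additional discussion

This probably looks scary... and lengthy... and like it does too much. Maybe it does. In fact, everything can be squeezed into a single command line (this builds on an idea muru posted in chat for dealing with filenames; echo is used instead of rm for demonstration purposes):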

find /home/xieerqi/bin/testdir/ -maxdepth 1 -type f -printf "%T@ %p\0" | sort -nz | { f=$(awk  'BEGIN{RS=" "}NR==2{print;next}'  ); echo "$f" ; }

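However, there are several things I don't like about it:

It deletes the oldest file unconditionally, without checking the number of files in the directory
It deals with filenames directly (which forced me to use an awkward awk command that will probably break on filenames containing spaces)
Too many pipes (too much plumbing)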

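So, while my script looks huge for such a simple task, it performs many more checks and is designed to cope with complex filenames. An implementation in Perl or Python would probably be shorter and more idiomatic (I could certainly do that; I just happened to choose bash for this question).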
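The first two complaints about the one-liner (no count check, fragile filename handling) can also be addressed in a fairly compact form. What follows is only a sketch, assuming GNU find and sort plus bash 4.4 or later (for mapfile -d). The function name oldest_if_above is hypothetical, and it merely prints the candidate rather than deleting it.

```shell
#!/bin/bash
# Sketch only; assumes GNU find/sort and bash >= 4.4. Hypothetical helper name.

oldest_if_above() {
    # $1 = maximum number of files allowed, $2 = directory (default: .)
    local max_files=$1 directory=${2:-.}
    local -a entries

    # NUL-terminated "mtime path" records survive spaces and newlines in names;
    # sort -zn puts the oldest record first.
    mapfile -d '' -t entries < <(
        find "$directory" -maxdepth 1 -type f -printf '%T@ %p\0' | sort -zn
    )

    if (( ${#entries[@]} > max_files )); then
        # Strip the timestamp, i.e. everything up to the first space
        printf '%s\n' "${entries[0]#* }"
        return 0
    fi
    return 1
}

# Example (dry run): oldest_if_above 7 ~/bin/testdir
```

The function returns success only when the limit is exceeded, so a caller could write something like `f=$(oldest_if_above 7 "$dir") && echo rm -- "$f"` and swap echo for the real command after testing.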

Answer 2

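I think the answer by @Serg is good, and I am learning from him and from @muru. I wrote this answer because I wanted to explore and learn how to create a shellscript file based on the output of find's printing action (-printf in the script below), in order to sort the files by the time they were created or modified. Please suggest improvements and bug fixes (if necessary).

As you will notice, the programming style is very different. We can do things in many ways in Linux :-)

I made a bash shellscript to match the requirements of the OP, @beginner27_, but it is not too difficult to modify it for other similar purposes.

The test, captured in a screenshot in the original post, went as follows: 11 files were created and the script (located in ~/bin and with execute permission) was run. I had removed the # character from the line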

# bash "$cmd"

to get

bash "$cmd"

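On the first run the script discovered and printed the 11 files, the seven newest on a blue background and the four oldest on a red background. The four oldest files were removed. The script was then run a second time (only for demonstration). It discovered and printed the seven remaining files and was satisfied: "There is no old backup file to remove".

The crucial find command, which sorts the files according to time, looks like this: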

find "$bupdir" -type f -printf "%T+ %p\0"|sort -nrz > "$srtlst"

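Here is the script file. I saved it in ~/bin under the name rm_old_backups, but you can give it any name, as long as it does not clash with the name of an existing executable program.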

#!/bin/bash

keep=7  # set the number of files to keep

# variables and temporary files

inversvid="\0033[7m"
resetvid="\0033[0m"
redback="\0033[1;37;41m"
greenback="\0033[1;37;42m"
blueback="\0033[1;37;44m"

bupdir="$HOME/Backup"
cmd=$(mktemp)
srtlst=$(mktemp)
rmlist=$(mktemp)

# output to the screen

echo -e "$inversvid$0:
keep $keep backup files, remove the oldest files, if more than $keep are found $resetvid"

echo "Security fix: You must edit this script and remove the # character from
a line near the end of the script '# bash \"\$cmd\"' --> 'bash \"\$cmd\"'
otherwise the script will only show what it can do. Please test that it
works correctly before you remove that # character!"

# the crucial find command, that sorts the files according to time

find "$bupdir" -type f -printf "%T+ %p\0"|sort -nrz > "$srtlst"

# more output

echo -e "${inversvid}time-stamp                     file-name                               $resetvid"
echo -en "$blueback"
sed -nz -e 1,"$keep"p "$srtlst" | tr '\0' '\n'
echo -en "$resetvid"

echo -en "$redback"
sed -z -e 1,"$keep"d "$srtlst" | tr '\0' '\n' | tee "$rmlist"
echo -en "$resetvid"

# remove oldest files if more files than specified are found

if test -s "$rmlist"
then
 echo rm '"'$(sed -z -e 1,"$keep"d -e 's/[^ ]* //' -e 's/$/" "/' "$srtlst")'"'\
 | sed 's/" ""/"/' > "$cmd"
 cat "$cmd"

# uncomment the following line to really remove files 
# bash "$cmd"

 echo "The oldest backup files are removed"
else
 echo "There is no old backup file to remove"
fi

# remove temporary files

rm -- "$cmd" "$srtlst" "$rmlist"
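As a variation on the script above, the list of files to remove can stay NUL-delimited all the way to the removal step, which avoids having to quote filenames inside a generated command file. This is only a sketch, assuming GNU find, sort, sed (for -z), and xargs, and a keep value of at least 1. The function name list_old_backups is hypothetical.

```shell
#!/bin/bash
# Sketch only; assumes GNU find, sort, sed and xargs. Hypothetical helper name.

list_old_backups() {
    # $1 = backup directory, $2 = number of newest files to keep (>= 1).
    # Prints every other file, NUL-terminated, from newer to older.
    local dir=$1 keep=$2
    find "$dir" -type f -printf '%T@ %p\0' |
        sort -zrn |
        sed -z -e "1,${keep}d" -e 's/^[^ ]* //'
}

# Dry run: shows the rm command without executing it
#   list_old_backups "$HOME/Backup" 7 | xargs -0 -r echo rm --
```

Replacing `echo rm --` with `rm --` would perform the removal; as with the script itself, it is worth checking the dry-run output first.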
