Nested loops, if conditions, and background jobs in bash


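I have a bash extraction function that I'm trying to parallelize. Its job is to find and extract nested archives. Ideally, I'd like the if evaluation and all of its actions to be sent to the background. The problem is that the actions inside the if need to complete in order, so I can't just append an "&" to the commands in the if. Is there a way to wrap the entire if evaluation into a single background job and still have its commands execute sequentially?

Here is the current working function: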

extract () {
IFS=$'\n'
trap exit SIGINT SIGTERM
for ext in zip rar tar.gz tar.bz2 tbz tgz 7z tar; do
    while [ "`find . -type f -iname "*.$ext" | wc -l`" -gt 0 ]; do
        for z in `find . -type f -iname "*.$ext"`; do
            if [ ! -d "`echo "$z" | rev | cut -c$(expr ${#ext} + 2)- | rev`" ]; then
                echo "Extracting `basename "$z"` ..."
                mkdir -p `echo "$z" | rev | cut -c$(expr ${#ext} + 2)- | rev`
                if [[ "$z" =~ ^.*\.7z$ ]]; then 7z x "$z" -o"`echo "$z" | rev | cut -c$(expr ${#ext} + 2)- | rev`" > /dev/null
                elif [[ "$z" =~ ^.*\.zip$ ]]; then unzip -uoLq "$z" -d `echo "$z" | rev | cut -c$(expr ${#ext} + 2)- | rev` 2>&1 | grep -ive warning
                elif [[ "$z" =~ ^.*\.tar\.xz$ ]] || [[ "$z" =~ ^.*\.tar\.gz$ ]] || [[ "$z" =~ ^.*\.tar\.bz2$ ]] || [[ "$z" =~ ^.*\.tgz$ ]] || [[ "$z" =~ ^.*\.tbz$ ]] || [[ "$z" =~ ^.*\.tar$ ]] ; then tar -xaf "$z" -C `echo "$z" | rev | cut -c$(expr ${#ext} + 2)- | rev` 
                elif [[ "$z" =~ ^.*\.rar$ ]]; then unrar x -y -o+ "$z" `echo "$z" | rev | cut -c$(expr ${#ext} + 2)- | rev`
                fi
                rm -f "$z"
            else echo "Omitting `basename "$z"`, directory with that name already exists."; rm -f "$z"
            fi 
        done
    done
done 
}

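Also, I'm curious whether there is any way to perform the extraction without deleting the source archives. I currently delete them to prevent an infinite loop. As it stands, the function is reliable enough that it doesn't lose any data, but to be safe I'd like to avoid deleting anything.

Answer 1

Why run the same find command multiple times, twice for each extension? You can build a single find command that traverses the directory tree only once: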

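# GNU find; the dots are escaped so they match literally, and the variable is quoted so the shell passes the pattern through unchanged.
EXT_REGEX='.*\.(zip|rar|tar\.gz|tar\.bz2|tbz|tgz|7z|tar)$'
find . -regextype posix-egrep -iregex "$EXT_REGEX"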

Now you don't need nested loops at all, and certainly not the nested while loop that causes the infinite-loop problem.

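Second, your code breaks on filenames that contain spaces. You can fix that by setting

IFS=$'\n'

(this stops for z in ... from splitting the output on spaces, so only newlines separate the entries).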
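Changing IFS only controls how the output of find is split into words, so names containing newlines or glob characters can still trip up a for loop over a command substitution. A more robust pattern, sketched here under the assumption of GNU find and bash, is to have find emit NUL-terminated paths and read them one at a time. The function name list_archives is hypothetical, and it only prints what it finds rather than extracting anything.

```shell
# list_archives DIR: hypothetical helper that prints every archive found under DIR.
# Assumes GNU find (-regextype, -iregex, -print0) and bash (read -d '').
list_archives() {
    local dir=$1 z
    # -print0 ends each path with a NUL byte; read -d '' splits on that byte only,
    # so spaces and newlines inside file names are preserved.
    while IFS= read -r -d '' z; do
        printf 'found: %s\n' "$z"
    done < <(find "$dir" -type f -regextype posix-egrep \
                 -iregex '.*\.(zip|rar|tar\.gz|tar\.bz2|tbz|tgz|7z|tar)$' -print0)
}
```

Inside the loop, "$z" always holds exactly one path, so the extraction commands can use it directly as long as it stays quoted.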

Finally, if you add & at the end of each if/elif branch, they will run in parallel.
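If several steps for one archive have to stay in order, backgrounding each command separately is not enough. One option is to wrap the whole if block in a subshell and background that, so each archive becomes a single job whose commands still run one after another. The following is a minimal sketch of that idea, not the asker's function: process_one and run_all are hypothetical names, and the echo lines stand in for the real extraction commands.

```shell
# process_one NAME OUTDIR: hypothetical stand-in for the per-archive steps.
process_one() {
    local name=$1 outdir=$2
    mkdir -p "$outdir/$name"                     # step 1: create the target directory
    echo "extracted" > "$outdir/$name/status"    # step 2: placeholder for 7z/unzip/tar
    echo "done" >> "$outdir/$name/status"        # step 3: runs only after step 2 returns
}

# run_all OUTDIR NAME...: starts one background job per name, then waits for all of them.
run_all() {
    local outdir=$1 name
    shift
    for name in "$@"; do
        # The parentheses turn the whole if/else into one background job;
        # inside it, the commands still execute sequentially.
        (
            if [ ! -d "$outdir/$name" ]; then
                process_one "$name" "$outdir"
            else
                echo "Omitting $name, directory with that name already exists."
            fi
        ) &
    done
    wait   # block until every background job has finished
}
```

A brace group, { ...; } &, behaves the same way for this purpose, since bash runs any backgrounded list in a subshell.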

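By the way, what is all that echo "$z" | rev supposed to accomplish? Are you getting multi-line filenames somehow?

Answer 2

Thanks to the suggestions from @Useless and @Orion, I now have the function finished. It spawns all extractions in the background, no longer deletes the source files, and for me it is more than 25% faster than its predecessor. @Gilles pointed out that parallelization isn't for everyone, since it is rather storage-intensive. But it works better for me, and in case you can make use of this script, I'm providing it below: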
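As far as I can tell, the rev | cut | rev pipeline in the question just drops the last ${#ext} + 1 characters of the path, that is, the extension and its dot. Bash can do the same with substring expansion and no external processes. Here is a small sketch of that, with a hypothetical helper name strip_ext:

```shell
# strip_ext PATH EXT: hypothetical helper; prints PATH without its trailing ".EXT".
# It trims by length, so the letter case of the extension does not matter.
strip_ext() {
    local path=$1 ext=$2
    local keep=$(( ${#path} - ${#ext} - 1 ))   # drop the extension plus its dot
    printf '%s\n' "${path:0:keep}"
}
```

For example, strip_ext './docs/Backup.TAR.GZ' tar.gz prints ./docs/Backup, which could then serve as the destination directory.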

extract () { # Extracts all archives and any nested archives of specified directory into a new child directory named after the archive.
IFS=$'\n'
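trap 'rm -f "$skipfiles" ; exit' SIGINT SIGTERM # Single quotes defer the expansion of $skipfiles until the trap fires; the variable is not set yet at this point.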
shopt -s nocasematch # Allows case-insensitive regex matching
echo -e "\n=====Extracting files====="
skipfiles=`mktemp` ; echo -e '\e' > $skipfiles # This creates a temp file to keep track of files that are already processed. Because of how it is read by grep, it needs an initial search string to omit from the found files. I opted for a literal escape character because who would name a file with that?
while [ "`find "$1/" -type f -regextype posix-egrep -iregex '^.*\.(tar\.gz|tar\.bz2|tar\.xz|tar|tbz|tgz|zip|rar|7z)$' | grep -ivf $skipfiles | wc -l`" -gt 0 ]; do #The while loop ensures that nested archives will be extracted. Its find operation needs to be separate from the find for the for loop below because it will change.
    for z in `find "$1/" -type f -regextype posix-egrep -iregex '^.*\.(tar\.gz|tar\.bz2|tar\.xz|tar|tbz|tgz|zip|rar|7z)$' | grep -ivf $skipfiles`; do
        destdir=`echo "$z" | sed -r 's/\.(tar\.gz|tar\.bz2|tar\.xz|tar|tbz|tgz|zip|rar|7z)$//'` # This removes the extension from the source filename so we can extract the files to a new directory named after the archive.
        if [ ! -d "$destdir" ]; then
            echo "Extracting `basename "$z"` into `basename "$destdir"` ..."
            mkdir -p "$destdir"
            if [[ "$z" =~ ^.*\.7z$ ]]; then 7z x "$z" -o"$destdir" > /dev/null & 
            elif [[ "$z" =~ ^.*\.rar$ ]]; then unrar x -y -o+ "$z" "$destdir" &
            elif [[ "$z" =~ ^.*\.zip$ ]]; then unzip -uoLq "$z" -d "$destdir" 2>/dev/null &
            elif [[ "$z" =~ ^.*\.(tar\.gz|tar\.bz2|tar\.xz|tar|tbz|tgz)$ ]] ; then tar -xaf "$z" -C "$destdir" &
            fi
            echo `basename "$z"` >> $skipfiles # This adds the name of the extracted file to the omission list for the next pass.
        else echo "Omitting `basename "$z"`, directory with that name already exists."; echo `basename "$z"` >> $skipfiles # Same as last line
        fi
    done
    wait # This will wait for all files in this pass to finish extracting before the next one.
done
rm "$skipfiles" # Removes temporary file
}
