Utility to find duplicate folders (not files)

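I know that rdfind can find duplicate files in two directories. But I need a similar utility that finds duplicate folders in two main directories, that is, folders with the same name and the same path relative to the main directory. Is there a utility that performs this simple task?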

**Example:**
$ tree
.
├── maindir1
│   ├── dir space
│   │   ├── dir1
│   │   └── dir2
│   ├── dir1
│   ├── dir2
│   │   └── new\012line
│   ├── dir3
│   │   └── dir5
│   └── dir4
│       └── dir6
├── maindir2
│   ├── dir space
│   │   └── dir2
│   ├── dir1
│   ├── dir2
│   │   └── new\012line
│   ├── dir5
│   │   └── dir6
│   ├── dir6
│   └── new\012line
├── file
└── new\012line

Note: In the example above, the only duplicate folders at the first level (depth 1) are:

maindir1/dir space/ & maindir2/dir space/
maindir1/dir1/ & maindir2/dir1/
maindir1/dir2/ & maindir2/dir2/

At the second level (depth 2), the only duplicate folders are:

maindir1/dir space/dir2/ & maindir2/dir space/dir2/
maindir1/dir2/new\012line/ & maindir2/dir2/new\012line/

Note that maindir1/dir3/dir5/ and maindir2/dir5/ are not duplicates, and neither are maindir1/dir4/dir6/ and maindir2/dir5/dir6/.

Answer 1

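I don't know of any utility that is specific to directories (though tools such as fslint or fdupes should list directories too), but scripting this is fairly simple: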

#!/usr/bin/env bash

## Declare $dirs and $count as associative arrays
declare -A dirs
declare -A count

find_dirs(){
    ## Make ** recurse into subdirectories
    shopt -s globstar
    ## Drop a trailing slash so the prefix removal below matches
    local top="${1%/}"
    for d in "$top"/**
    do
        ## Remove the top directory from the dir's path
        dd="${d#"$top"/}"
        ## If this is a directory, and is not the top directory
        if [[ -d "$d" && "$dd" != "" ]]
        then
            ## Count the number of times it's been seen
            count["$dd"]=$(( ${count["$dd"]:-0} + 1 ))
            ## Add it to the list of paths with that name.
            ## I am using the `&` to separate directory entries
            dirs["$dd"]="${dirs[$dd]} & $d"
        fi
    done
}

## Iterate over the list of paths given as arguments
for target in "$@"
do
    ## Run the find_dirs function on each of them
    find_dirs "$target"
done

## For each directory found by find_dirs
for d in "${!dirs[@]}"
do
    ## If this name has been seen more than once
    ## (-gt compares numerically; > inside [[ ]] compares strings)
    if [[ ${count["$d"]} -gt 1 ]]
    then
        ## Print the name with pretty colors
        printf '\033[01;31m+++ NAME: "%s" +++\033[00m\n' "$d"
        ## Print the paths with that name
        printf "%s\n" "${dirs[$d]}" | sed 's/^ & //'
    fi
done

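The script above can handle arbitrary directory names (including names that contain spaces or even newlines) and will recurse into any number of subdirectories. For example, in this directory structure: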

$ tree
.
├── maindir1
│   ├── dir1
│   ├── dir2
│   │   └── new\012line
│   ├── dir3
│   │   └── dir5
│   ├── dir4
│   │   └── dir6
│   └── dir space
│       ├── dir1
│       └── dir2
└── maindir2
    ├── dir1
    ├── dir2
    │   └── new\012line
    ├── dir5
    │   └── dir6
    ├── dir6
    ├── dir space
    │   └── dir2
    └── new\012line

It will return the following:

[Screenshot showing the script's output]
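
If only two main directories need comparing, the same idea (matching directories by their path relative to the top directory) can probably be expressed with standard tools instead of a loop. The following is a minimal sketch, not a tested replacement for the script above. It assumes GNU find, sort and comm (for `-printf` and `-z`), and `common_dirs` is a hypothetical helper name introduced only for this illustration:

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU find, sort and comm (for -printf and -z).
# common_dirs is a hypothetical helper name, not part of the script above.

# Print, NUL-separated, every directory path (relative to the top
# directory) that exists under both $1 and $2.
common_dirs() (
    # Byte-wise collation so that sort and comm agree on the ordering
    export LC_ALL=C
    list_a=$(mktemp) || exit 1
    list_b=$(mktemp) || { rm -f "$list_a"; exit 1; }
    # -mindepth 1 skips the top directory itself; %P is the path relative
    # to the starting point; \0 keeps names containing newlines intact
    find "$1" -mindepth 1 -type d -printf '%P\0' | sort -z > "$list_a"
    find "$2" -mindepth 1 -type d -printf '%P\0' | sort -z > "$list_b"
    # -12 suppresses entries unique to either list, leaving the common ones
    comm -z -12 "$list_a" "$list_b"
    rm -f "$list_a" "$list_b"
)
```

Something like `common_dirs maindir1 maindir2 | tr '\0' '\n'` would then print one shared relative path per line (a name with an embedded newline will span two lines). Note that find also descends into hidden directories, whereas `**` skips them unless `dotglob` is set, so the two approaches can differ on dot-directories.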
