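I know that rdfind can find duplicate files in two directories. But I need a similar utility that finds duplicate folders in two main directories, meaning folders that have the same name and the same path relative to their main directory. Is there a utility that does this simple task?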
**Example:**
$ tree
.
├── maindir1
│   ├── dir space
│   │   ├── dir1
│   │   └── dir2
│   ├── dir1
│   ├── dir2
│   │   └── new\012line
│   ├── dir3
│   │   └── dir5
│   └── dir4
│       └── dir6
├── maindir2
│   ├── dir space
│   │   └── dir2
│   ├── dir1
│   ├── dir2
│   │   └── new\012line
│   ├── dir5
│   │   └── dir6
│   ├── dir6
│   └── new\012line
├── file
└── new\012line
Note: In the example above, the only duplicate folders at the first level (depth 1) are:
maindir1/dir space/ & maindir2/dir space/
maindir1/dir1/ & maindir2/dir1/
maindir1/dir2/ & maindir2/dir2/
At the second level (depth 2), the only duplicate folders are:
maindir1/dir space/dir2/ & maindir2/dir space/dir2/
maindir1/dir2/new\012line/ & maindir2/dir2/new\012line/
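Please note that maindir1/dir3/dir5/ and maindir2/dir5/ are not duplicates, and that maindir1/dir4/dir6/ and maindir2/dir5/dir6/ are not duplicates either: their paths relative to the main directories differ.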
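For anyone who wants to try a candidate tool against this layout, here is a minimal sketch that recreates it. It assumes bash (the `$'new\nline'` ANSI-C quoting produces a name with an embedded newline, which `tree` prints as `new\012line`), that `mktemp -d` is available, and that `maindir2/new\012line` is a directory. The top-level `file` and `new\012line` entries are left out because they play no part in the comparison.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so nothing existing is touched
# (assumption: mktemp -d is available on this system).
cd "$(mktemp -d)" || exit 1

# First main directory
mkdir -p "maindir1/dir space/dir1" "maindir1/dir space/dir2" \
         maindir1/dir1 \
         maindir1/dir2/$'new\nline' \
         maindir1/dir3/dir5 \
         maindir1/dir4/dir6

# Second main directory (maindir2/new<newline>line is assumed to be a directory)
mkdir -p "maindir2/dir space/dir2" \
         maindir2/dir1 \
         maindir2/dir2/$'new\nline' \
         maindir2/dir5/dir6 \
         maindir2/dir6 \
         maindir2/$'new\nline'
```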
Answer 1
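I don't know of any utility that is specific to directories (though tools such as fslint or fdupes should list directories as well), but it is fairly simple to script: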
#!/usr/bin/env bash

## Declare $dirs and $count as associative arrays
declare -A dirs
declare -A count

## Make ** recurse into subdirectories
shopt -s globstar

find_dirs(){
    ## Drop a trailing slash so the prefix removal below matches
    local top="${1%/}"
    for d in "$top"/**
    do
        ## Remove the top directory from the dir's path.
        ## Stripping "$top"/ (rather than everything up to the first /)
        ## also works when the argument is ./maindir1 or an absolute path.
        dd="${d#"$top"/}"
        ## If this is a directory, and is not the top directory
        if [[ -d "$d" && "$dd" != "" ]]
        then
            ## Count the number of times it's been seen
            count[$dd]=$(( ${count[$dd]:-0} + 1 ))
            ## Add it to the list of paths with that name.
            ## I am using the `&` to separate directory entries
            dirs[$dd]="${dirs[$dd]} & $d"
        fi
    done
}

## Iterate over the list of paths given as arguments
for target in "$@"
do
    ## Run the find_dirs function on each of them
    find_dirs "$target"
done

## For each directory found by find_dirs
for d in "${!dirs[@]}"
do
    ## If this name has been seen more than once (numeric comparison)
    if [[ ${count[$d]} -gt 1 ]]
    then
        ## Print the name with pretty colors
        printf '\033[01;31m+++ NAME: "%s" +++\033[00m\n' "$d"
        ## Print the paths with that name, minus the leading separator
        printf '%s\n' "${dirs[$d]# & }"
    fi
done
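When only two main directories are involved, the same check can be sketched more directly: walk the first tree and test whether each relative path is also a directory in the second. This is an illustrative alternative, not part of the script above. `common_dirs` is a hypothetical helper name, and the sketch assumes bash 4 or later for `globstar`. Hidden directories are skipped unless `dotglob` is also set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every directory that exists at the same
# relative path under both $1 and $2, as "path1/ & path2/".
common_dirs() {
    local a=${1%/} b=${2%/} d rel
    # Use a subshell so the caller's shell options are left untouched
    (
        shopt -s globstar nullglob
        # A trailing slash after ** makes the glob match directories only
        for d in "$a"/**/
        do
            # Path relative to the first main directory ("" for the top itself)
            rel=${d#"$a"/}
            if [[ -n $rel && -d $b/$rel ]]
            then
                printf '%s & %s\n' "$a/$rel" "$b/$rel"
            fi
        done
    )
}
```

It would be called as `common_dirs maindir1 maindir2`. Names containing newlines are printed as they are, so the output is meant for reading rather than for parsing line by line.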
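The associative-array script above can deal with arbitrary directory names (including names that contain spaces or even newlines) and will recurse into any number of subdirectories. For example, given this directory structure: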
$ tree
.
├── maindir1
│   ├── dir1
│   ├── dir2
│   │   └── new\012line
│   ├── dir3
│   │   └── dir5
│   ├── dir4
│   │   └── dir6
│   └── dir space
│       ├── dir1
│       └── dir2
└── maindir2
    ├── dir1
    ├── dir2
    │   └── new\012line
    ├── dir5
    │   └── dir6
    ├── dir6
    ├── dir space
    │   └── dir2
    └── new\012line
It will return the following: