Can you compare all the files in a directory?

Answer 1

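If you don't need to see the differences themselves and only need to know whether the files differ, you can use a for loop to compare every file in the directory against any one file from that directory: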

for i in ./*; do diff -q "$i" known-file; done

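Here known-file is simply any given file in the directory. If there is no output, none of the files differ; otherwise you get a list of the files that differ from known-file.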
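
When only a yes/no answer is needed, the loop above can be wrapped in a small function. The following is a minimal sketch and not part of the original answer: the name all_same is hypothetical, and it swaps diff -q for cmp -s (a silent byte-for-byte comparison) so that nothing is printed and the exit status carries the result. It assumes you only care about regular files matched by the * glob, so subdirectories and hidden files are skipped.

```shell
# all_same DIR: succeed (exit 0) if every regular file in DIR has the
# same content as the first regular file found, fail (exit 1) otherwise.
# The function name is hypothetical; it is only an illustration.
all_same() {
    dir=$1
    ref=
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip directories and unmatched globs
        if [ -z "$ref" ]; then
            ref=$f                       # first file becomes the reference
            continue
        fi
        cmp -s "$ref" "$f" || return 1   # stop at the first differing file
    done
    return 0
}
```

A possible call would be: all_same . && echo "all files are identical"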

Answer 2

Using the standard cksum utility together with awk:

find . -type f -exec cksum {} + | awk '!ck[$1$2]++ { print $3 }'

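The cksum utility outputs three columns for each file in the current directory: the first is the checksum, the second is the file size, and the third is the filename.

The awk program builds an array, ck, keyed on the checksum and size. If a key has not been seen before, the filename is printed.

This means you get one filename for each distinct checksum+size combination in the current directory. If you get more than one filename, the directory contains files with different checksums and/or sizes, so not all files are identical.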
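
To see why the `!ck[...]++` pattern prints only the first file of each group, the idiom can be tried on fixed input instead of real cksum output. This is a minimal sketch: the function name first_of_each and the sample lines are made up for illustration, and the key is written as `ck[$1,$2]` (awk's SUBSEP-joined subscript) rather than `ck[$1$2]`, an optional change that keeps checksum and size from running together.

```shell
# first_of_each: read "checksum size name" lines on standard input and
# print the name from the first line seen for each checksum/size pair.
# ck[$1,$2]++ evaluates to 0 (false) the first time a key appears, so
# the negation is true only once per key.
first_of_each() {
    awk '!ck[$1,$2]++ { print $3 }'
}

# Made-up sample in cksum's three-column layout (values are illustrative).
printf '%s\n' \
    '111 0 ./a' \
    '111 0 ./b' \
    '222 6 ./c' \
    '222 6 ./d' | first_of_each
# prints ./a and ./c, one per line
```

As in the original command, $3 holds only the first whitespace-delimited word of the name, so filenames containing spaces are cut short.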

Testing:

$ ls -l
total 8
-rw-r--r--  1 kk  kk  0 Oct  3 16:32 file1
-rw-r--r--  1 kk  kk  0 Oct  3 16:32 file2
-rw-r--r--  1 kk  kk  6 Oct  3 16:32 file3
-rw-r--r--  1 kk  kk  0 Oct  3 16:32 file4
-rw-r--r--  1 kk  kk  6 Oct  3 16:34 file5

$ find . -type f -exec cksum {} + | awk '!ck[$1$2]++ { print $3 }'
./file1
./file3

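Files file1, file2 and file4 are all empty, whereas file3 and file5 have some content. The command shows that there are two sets of files: those identical to file1 and those identical to file3.

We can also see exactly which files are identical: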

$ find . -type f -exec cksum {} + | awk '{ ck[$1$2] = ck[$1$2] ? ck[$1$2] OFS $3 : $3 } END { for (i in ck) print ck[i] }'
./file3 ./file5
./file1 ./file2 ./file4

Answer 3

Given a set of files in directory d, here are the results from 4 programs that find duplicate files:

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.9 (jessie) 
bash GNU bash 4.3.30
fdupes 1.51
jdupes 1.5.1 (2016-11-01)
rdfind 1.3.4
duff 0.5.2

-----
 Files in directory d:
==> d/f1 <==
1

==> d/f11 <==
1

==> d/f2 <==
2

==> d/f20 <==
Now is the time
for all good men
to come to the aid
of their country.

==> d/f21 <==
Now is the time
for all good men
to come to the aid
of their country.

==> d/f22 <==
Now is the time
for all good men
to come to the aid
of their countryz

==> d/f3 <==
1


-----
 Results for fdupes:
d/f1                                    
d/f3
d/f11

d/f20
d/f21


-----
 Results for jdupes:
Examining 7 files, 1 dirs (in 1 specified)
d/f1                                                        
d/f3
d/f11

d/f20
d/f21

-----
 Results for rdfind:
Now scanning "d", found 7 files.
Now have 7 files in total.
Removed 0 files due to nonunique device and inode.
Now removing files with zero size from list...removed 0 files
Total size is 218 bytes or 218 b
Now sorting on size:removed 0 files due to unique sizes from list.7 files left.
Now eliminating candidates based on first bytes:removed 1 files from list.6 files left.
Now eliminating candidates based on last bytes:removed 1 files from list.5 files left.
Now eliminating candidates based on md5 checksum:removed 0 files from list.5 files left.
It seems like you have 5 files that are not unique
Totally, 74 b can be reduced.
Now making results file results.txt

-----
 Results for duff:
3 files in cluster 1 (2 bytes, digest e5fa44f2b31c1fb553b6021e7360d07d5d91ff5e)
d/f1
d/f3
d/f11
2 files in cluster 2 (70 bytes, digest 7de790fbe559d66cf890671ea2ef706281a1017f)
d/f20
d/f21

Best wishes ... cheers, drl

Answer 4

You can also try the GUI tool meld.

meld dir1 dir2

or

meld dir1 dir2 dir3

https://meldmerge.org/help/command-line.html
