Comparing two files in a bash script?

How can I find the data that matches between two files in a shell script, and store the duplicated data in another file?

#!/bin/bash

file1="/home/vekomy/santhosh/bigfiles.txt"
file2="/home/vekomy/santhosh/bigfile2.txt"

while read -r $file1; do
    while read  -r $file2 ;do
        if [$file1==$file2] ;  then
            echo "two files are same"
        else
            echo "two files content different"
        fi
    done
done

I wrote this code, but it did not work. How should I write it?

Answer 1

To test whether two files are identical, use cmp -s:

#!/bin/bash

file1="/home/vekomy/santhosh/bigfiles.txt"
file2="/home/vekomy/santhosh/bigfile2.txt"

if cmp -s "$file1" "$file2"; then
    printf 'The file "%s" is the same as "%s"\n' "$file1" "$file2"
else
    printf 'The file "%s" is different from "%s"\n' "$file1" "$file2"
fi

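The -s flag to cmp makes the utility "silent". The exit status of cmp is zero when the two files compared are identical. The code above uses this to print a message saying whether the two files are the same or not.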
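As a minimal sketch of the exit statuses involved: cmp exits with 0 when the files are identical, 1 when they differ, and a value greater than 1 when something goes wrong (for example, when a file is missing). The helper name compare_files and the temporary files below are made up for illustration and are not part of the original answer.

```shell
#!/bin/bash

# compare_files is a hypothetical helper name, not something cmp provides.
# It turns the exit status of "cmp -s" into a word on standard output.
compare_files() {
    local status=0
    cmp -s "$1" "$2" || status=$?
    case $status in
        0) echo "same" ;;
        1) echo "different" ;;
        *) echo "error" ;;
    esac
}

# Throwaway sample files, created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'hello\n' >"$tmpdir/a"
printf 'hello\n' >"$tmpdir/b"
printf 'world\n' >"$tmpdir/c"

compare_files "$tmpdir/a" "$tmpdir/b"        # identical contents
compare_files "$tmpdir/a" "$tmpdir/c"        # differing contents
compare_files "$tmpdir/a" "$tmpdir/missing"  # second file does not exist

rm -rf "$tmpdir"
```

Distinguishing status 1 from larger values keeps a missing or unreadable file from being reported as merely "different".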


If your two input files contain lists of pathnames of the files that you want to compare, then use a double loop like this:

#!/bin/bash

filelist1="/home/vekomy/santhosh/bigfiles.txt"
filelist2="/home/vekomy/santhosh/bigfile2.txt"

mapfile -t files1 <"$filelist1"

while IFS= read -r file2; do
    for file1 in "${files1[@]}"; do
        if cmp -s "$file1" "$file2"; then
            printf 'The file "%s" is the same as "%s"\n' "$file1" "$file2"
        fi
    done
done <"$filelist2" | tee file-comparison.out

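Here, the result is produced both on the terminal and in the file file-comparison.out.

It is assumed that no pathname in the two input files contains an embedded newline.

The code first reads all pathnames from one of the files into an array, files1, using mapfile. I do this to avoid reading that file more than once, since we have to go through all of those pathnames for each pathname in the other file. You will notice that rather than reading from $filelist1 in the inner loop, I simply iterate over the names in the files1 array.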

Answer 2

The simplest way is to use the diff command.

Example:

Let's assume the first file is file1.txt and that it contains:

I need to buy apples.
I need to run the laundry.
I need to wash the dog.
I need to get the car detailed.

and that the second file is file2.txt:

I need to buy apples.
I need to do the laundry.
I need to wash the car.
I need to get the dog detailed.

Then we can use diff to automatically show which lines differ between the two files, with this command:

diff file1.txt file2.txt

The output will be:

 2,4c2,4
 < I need to run the laundry.
 < I need to wash the dog.
 < I need to get the car detailed.
 ---
 > I need to do the laundry.
 > I need to wash the car.
 > I need to get the dog detailed.

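Let's look at what this output means. The important thing to remember is that when diff describes these differences, it does so prescriptively: it tells you how to change the first file to make it match the second file. The first line of the diff output contains:

  • line numbers corresponding to the first file,
  • a letter (a for add, c for change, d for delete),
  • line numbers corresponding to the second file.

In our output above, "2,4c2,4" means: "Lines 2 through 4 in the first file need to be changed to match lines 2 through 4 in the second file." It then tells us the contents of those lines in each file:

  • lines preceded by < are lines from the first file;
  • lines preceded by > are lines from the second file;
  • the three dashes ("---") merely separate the lines of file 1 from those of file 2.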
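The example above only produces a "c" (change) command. As a small sketch of the other two letters, the following creates two throwaway files (the names short.txt and long.txt are invented for this illustration) and runs diff in both directions:

```shell
#!/bin/bash

# Throwaway sample files, created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'one\ntwo\n'        >"$tmpdir/short.txt"
printf 'one\ntwo\nthree\n' >"$tmpdir/long.txt"

# diff exits with status 1 when the files differ, so "|| true" keeps
# the script going even if it is run with "set -e".

# From short to long: a line has to be added after line 2.
added=$(diff "$tmpdir/short.txt" "$tmpdir/long.txt" || true)
printf '%s\n' "$added"
# prints:
#   2a3
#   > three

# From long to short: line 3 has to be deleted.
deleted=$(diff "$tmpdir/long.txt" "$tmpdir/short.txt" || true)
printf '%s\n' "$deleted"
# prints:
#   3d2
#   < three

rm -rf "$tmpdir"
```

"2a3" reads as "after line 2 of the first file, add what is line 3 of the second file", and "3d2" as "delete line 3 of the first file, which would otherwise have followed line 2 of the second file".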

Source

Answer 3

Here is a pure bash shell script for comparing files:

#!/usr/bin/env bash

# @(#) s1       Demonstrate rudimentary diff using shell only.

# Infrastructure details, environment, debug commands for forum posts.
# Uncomment export command to run as external user: not context, pass-fail.
# export PATH="/usr/local/bin:/usr/bin:/bin"
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f "$C" ] && $C
set -o nounset

FILE1=${1-data1}
shift
FILE2=${1-data2}

# Display samples of data files.
pl " Data files:"
head "$FILE1" "$FILE2"

# Set file descriptors.
exec 3<"$FILE1"
exec 4<"$FILE2"

# Code based on:
# http://www.linuxjournal.com/content/reading-multiple-files-bash

# Section 2, solution.
pl " Results:"

eof1=0
eof2=0
count1=0
count2=0
while [[ $eof1 -eq 0 || $eof2 -eq 0 ]]
do
  if IFS= read -r a <&3; then
    let count1++
    # printf "%s, line %d: %s\n" $FILE1 $count1 "$a"
  else
    eof1=1
  fi
  if IFS= read -r b <&4; then
    let count2++
    # printf "%s, line %d: %s\n" $FILE2 $count2 "$b"
  else
    eof2=1
  fi
  if [ "$a" != "$b" ]
  then
    echo " File $FILE1 and $FILE2 differ at lines $count1, $count2:"
    pe "$a"
    pe "$b"
    # exit 1
  fi
done

exit 0

producing:

$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.9 (jessie) 
bash GNU bash 4.3.30

-----
 Data files:
==> data1 <==
I need to buy apples.
I need to run the laundry.
I need to wash the dog.
I need to get the car detailed.

==> data2 <==
I need to buy apples.
I need to do the laundry.
I need to wash the car.
I need to get the dog detailed.

-----
 Results:
 File data1 and data2 differ at lines 2, 2:
I need to run the laundry.
I need to do the laundry.
 File data1 and data2 differ at lines 3, 3:
I need to wash the dog.
I need to wash the car.
 File data1 and data2 differ at lines 4, 4:
I need to get the car detailed.
I need to get the dog detailed.

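Specific commands can be uncommented if you want to see every line as it is read, or to exit at the first difference seen.

See the page http://www.linuxjournal.com/content/reading-multiple-files-bash for details on file descriptors (e.g. "&3").

Best wishes ... cheers, drl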
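As a condensed sketch of the same file-descriptor technique (not drl's script itself): the sample files left and right are invented here, and unlike the script above this loop stops as soon as either file runs out of lines. It also closes the descriptors when done, which the script above leaves to the shell's exit.

```shell
#!/bin/bash

# Throwaway sample files, created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' >"$tmpdir/left"
printf 'a\nX\nc\n' >"$tmpdir/right"

exec 3<"$tmpdir/left"    # open the first file for reading on descriptor 3
exec 4<"$tmpdir/right"   # open the second file for reading on descriptor 4

n=0
differing=""
# Read one line from each descriptor per iteration; the loop ends when
# either read fails, i.e. when the shorter file is exhausted.
while IFS= read -r a <&3 && IFS= read -r b <&4; do
    n=$((n + 1))
    if [ "$a" != "$b" ]; then
        differing="$differing $n"
    fi
done

exec 3<&- 4<&-           # close both descriptors
rm -rf "$tmpdir"

echo "differing lines:$differing"
# prints: differing lines: 2
```

Because each read is bound to its own descriptor, the two files advance independently of standard input, which stays free for other use inside the loop.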
