Comparing two text files

Answer 1

Try this command:

 grep -v -f file2.csv file1.csv > file3.csv

According to the grep manual:

  -f FILE, --file=FILE
          Obtain  patterns  from  FILE,  one  per  line.   The  empty file
          contains zero patterns, and therefore matches nothing.   (-f  is
          specified by POSIX.)

  -v, --invert-match
          Invert the sense of matching, to select non-matching lines.  (-v
          is specified by POSIX.)

As Steeldriver said in his comment, it is better to also add -x and -F:

  -F, --fixed-strings
          Interpret PATTERN as a  list  of  fixed  strings,  separated  by
          newlines,  any  of  which is to be matched.  (-F is specified by
          POSIX.)
  -x, --line-regexp
          Select  only  those  matches  that exactly match the whole line.
          (-x is specified by POSIX.)

So a better command is:

 grep -xvFf file2.csv file1.csv > file3.csv

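This command uses the lines of file2.csv as patterns and prints the lines of file1.csv that do not match any of them (-v). With -F each pattern is treated as a literal string rather than a regular expression, and with -x a pattern must match the whole line rather than a part of it.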
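
To illustrate why -F and -x matter, here is a minimal Python sketch that mimics both commands on made-up rows. The sample rows and the function names are hypothetical, and Python's `re` module only approximates grep's default regular expressions, so treat it as an illustration rather than a reimplementation.

```python
import re

def grep_v_f(patterns, lines):
    """Roughly `grep -v -f`: drop a line if any pattern matches anywhere in it as a regex."""
    return [l for l in lines if not any(re.search(p, l) for p in patterns)]

def grep_xvFf(patterns, lines):
    """Roughly `grep -xvFf`: drop a line only if it is identical to a pattern line."""
    fixed = set(patterns)
    return [l for l in lines if l not in fixed]

# Hypothetical sample data, not taken from the original files.
file1 = ["1,4,5,6", "11,4,5,60", "1x4,5,6"]
file2 = ["1,4,5,6", "1.4,5,6"]

print(grep_v_f(file2, file1))   # substring and "." wildcard matches remove every row
print(grep_xvFf(file2, file1))  # only the exact duplicate "1,4,5,6" is removed
```

In the first call, "11,4,5,60" is dropped because it contains "1,4,5,6" as a substring, and "1x4,5,6" is dropped because the "." in "1.4,5,6" matches any character. The second call keeps both.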

Answer 2

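To be able to use comm, you have to sort the lines first. The -23 option suppresses column 2 (lines only in the second file) and column 3 (lines in both), leaving the lines that appear only in file1.csv.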

comm -23 <(sort file1.csv) <(sort file2.csv) > file3.csv
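
The reason comm needs sorted input is that it walks both files in a single merge pass. A rough Python sketch of that idea follows; the function name `comm_23` and the sample rows are hypothetical, and Python's `sorted` orders by code point, which may differ from the locale collation that sort and comm use.

```python
def comm_23(lines_a, lines_b):
    """Return lines found only in lines_a, using a merge walk over sorted copies."""
    a, b = sorted(lines_a), sorted(lines_b)
    i = j = 0
    only_a = []
    while i < len(a):
        if j >= len(b) or a[i] < b[j]:
            # a[i] cannot appear later in b, so it is unique to a.
            only_a.append(a[i])
            i += 1
        elif a[i] == b[j]:
            # Present in both: skip the pair.
            i += 1
            j += 1
        else:
            # b[j] is smaller: advance b to catch up.
            j += 1
    return only_a

# Hypothetical sample data.
file1 = ["1,2,3,4", "1,4,5,6", "1,11,13,17"]
file2 = ["1,2,3,4", "2,4,9,10"]
print(comm_23(file1, file2))
```

Note that the result comes out in sorted order, not in the original order of file1.csv, which is also true of the comm command above.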

Answer 3

A Python option:

#!/usr/bin/env python3

import sys

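def readfile(file):
    # Return the file's lines with surrounding whitespace stripped.
    with open(file) as src:
        return [line.strip() for line in src]

lines_1 = readfile(sys.argv[1])
lines_2 = set(readfile(sys.argv[2]))  # a set makes membership tests fast

# Print every line of the first file that does not occur in the second.
for line in lines_1:
    if line not in lines_2:
        print(line)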

Output:

1,4,5,6
1,11,13,17

Paste the script into an empty file, save it as extract.py, make it executable, and run it with:

<script> <file_1> <file_2>

Or write directly to file_3:

<script> <file_1> <file_2> >file_3

Answer 4

Use the diff command piped into grep; nothing needs to be stored in between.

To output the lines that exist in file 1 but not in file 2:

$ diff file{1,2}.csv | grep -Po "^< \K.*"
1,4,5,6
1,11,13,17

To output the lines that exist in file 2 but not in file 1, just change the left angle bracket (<) to a right angle bracket (>):

$ diff file{1,2}.csv | grep -Po "^> \K.*"
2,4,9,10
13,14,17,18
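
A comparable result can be sketched in Python with the standard `difflib` module. The function name `only_in` and the sample rows are hypothetical (the original input files are not shown, so the rows were chosen to reproduce the output above). As with diff, the comparison is order-sensitive: a line that appears in both files but at a different position may still be reported.

```python
from difflib import SequenceMatcher

def only_in(a, b):
    """Return (lines only in a, lines only in b) according to a sequence diff."""
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    left, right = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            left.extend(a[i1:i2])    # corresponds to diff's "< " lines
        if tag in ("insert", "replace"):
            right.extend(b[j1:j2])   # corresponds to diff's "> " lines
    return left, right

# Hypothetical sample data.
file1 = ["1,2,3,4", "1,4,5,6", "5,6,7,8", "1,11,13,17"]
file2 = ["1,2,3,4", "2,4,9,10", "5,6,7,8", "13,14,17,18"]

left, right = only_in(file1, file2)
print("\n".join(left))
print("\n".join(right))
```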
