Compare lines one-to-one in 2 different files using a shell script

Answer 1

Perhaps not very pretty, but something like this could be a start:

# 1. Read lines from file1 as string, and file2 as comma-separated array.
while read -r a && IFS=, read -ra b <&3; do
    # 2. If both empty lines, continue.
    if [[ "$a" == "" && ${#b[@]} == 0 ]]; then
        continue
    fi
    # 3. Start assuming diff.
    diff=1
    # 4. Loop fields in $b.
    for e in "${b[@]}"; do
        # Compare field in $b with $a, if match then abort.
        if [[ "$e" == "$a" ]]; then
            diff=0
            break
        fi
    done
    # 5. If no match found, print line from $b.
    if [[ $diff == 1 ]]; then
        # Join array with <space>comma.
        line=$(printf ", %s" "${b[@]}")
        # Print line, excluding leading <space>comma.
        printf "%s\n" "${line:2}"
    fi
# Input argument one as file 1 to stdin, and argument two as file 2 to
# file descriptor 3.
done < "$1" 3<"$2"

Typically used as:

$ ./myscript file1 file2

That said, this would probably be better done in Python, Perl, awk, or similar.
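For comparison, the same field-by-field logic can be sketched in awk. This is only an illustration, assuming the second file is comma-separated as in the script above; the wrapper name `fields_mismatch` and the awk variable `other` are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: fields_mismatch is a hypothetical helper, not part of the answer.
# $1 = file with one string per line, $2 = file with comma-separated lines.
fields_mismatch() {
    awk -v other="$2" '
    {
        a = $0
        # Stop when the second file runs out of lines, like the && in the loop.
        if ((getline b < other) <= 0) exit
        # Skip pairs where both lines are empty.
        if (a == "" && b == "") next
        n = split(b, f, ",")
        found = 0
        for (i = 1; i <= n; i++) {
            # Appending "" forces a string comparison even for numeric-looking text.
            if ((f[i] "") == (a "")) { found = 1; break }
        }
        if (!found) {
            # Re-join the fields with ", " as the shell version does.
            out = ""
            for (i = 1; i <= n; i++) out = out (i > 1 ? ", " : "") f[i]
            print out
        }
    }' "$1"
}
```

It could be called as `fields_mismatch file1 file2`. A field counts as a match only when it equals the whole line from the first file.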

Answer 2

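Perhaps this Stack Overflow answer will point you in the right direction:

Most likely you want to read each line of each file into a list or array, using the first suggestion. Then iterate over both at the same time and compare the strings using the second suggestion.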
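A minimal sketch of that idea in bash, assuming bash 4 or later for `mapfile`. The function name `compare_lines` and the output format are invented for illustration and do not come from the linked suggestions.

```shell
#!/usr/bin/env bash
# Sketch only: compare_lines is a hypothetical helper. Requires bash 4+.
compare_lines() {
    local -a left right
    local n i
    # Load each file into its own array, one element per line.
    mapfile -t left < "$1"
    mapfile -t right < "$2"
    # Iterate up to the length of the longer file.
    n=${#left[@]}
    if (( ${#right[@]} > n )); then
        n=${#right[@]}
    fi
    for (( i = 0; i < n; i++ )); do
        # "${arr[i]-}" expands to an empty string past the end of the shorter file.
        if [[ "${left[i]-}" != "${right[i]-}" ]]; then
            printf 'line %d differs: %s | %s\n' "$((i + 1))" "${left[i]-}" "${right[i]-}"
        fi
    done
}
```

Calling `compare_lines file1 file2` prints one line for each position where the two files differ. The test here is plain string equality; a substring or field test could be substituted inside the `if`.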

Answer 3

Try:

paste file1 file2 | grep -vP '^(.*)\t.*\1.*'

and adjust the regular expression to your situation as needed.
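To make the behaviour concrete, here is a small sketch of one such adjustment. It assumes GNU grep built with `-P` (PCRE) support and that lines in the first file contain no tab characters; the function name `show_mismatches` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: show_mismatches is a hypothetical helper. Needs GNU grep with -P.
show_mismatches() {
    # paste joins line N of both files with a tab. The group captures the
    # file1 part (everything before the first tab), and the backreference \1
    # searches for that same text anywhere after the tab. -v keeps only the
    # pairs where it is NOT found.
    paste "$1" "$2" | grep -vP '^([^\t]*)\t.*\1'
}
```

The output is the pasted pair (file1 line, a tab, file2 line) for every position where the second file's line does not contain the first file's line. An empty line in the first file matches anything, so such pairs are never reported.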

Answer 4

With GNU awk, you can do it in one line:

awk '{a=$0; getline < "File2"; if ($0 ~ a) print "OK"; else print a, $0}' File1
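Two details of the one-liner are worth noting: the file name given to `getline` has to be a quoted string (an unquoted `File2` is an empty awk variable), and `$0 ~ a` treats the line from the first file as a regular expression. A hedged variant that passes the file name in with `-v` and uses a literal substring test might look like this; `compare_with_awk` and `other` are names invented for the sketch.

```shell
#!/usr/bin/env bash
# Sketch only: compare_with_awk is a hypothetical wrapper around awk.
compare_with_awk() {
    awk -v other="$2" '
    {
        a = $0
        # getline returns 1 on success, 0 at end of file, -1 on error.
        if ((getline b < other) <= 0) b = ""
        # index() is a literal substring search, so regex metacharacters
        # in the first file are not interpreted.
        if (index(b, a)) print "OK"
        else print a, b
    }' "$1"
}
```

Used as `compare_with_awk File1 File2`, it prints `OK` for each pair where the second file's line contains the first file's line, and both lines otherwise.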
