I have two text files.
cat A.txt
10,1,1,"ABC"
10,1,2,"S1"
10,1,2,"ABC"
10,1,3,"baba"
10,2,1,"S2"
10,2,1,"asd"
10,2,2,"S3"
10,2,2,"dkkd"
10,2,3,"ABC"
cat B.txt
10,1,1,"ABC1"
10,1,2,"S1"
10,1,2,"ABC"
10,1,3,"baba"
10,2,1,"asd"
10,2,2,"S3"
10,2,2,"dkkd"
10,2,4,"bokaj"
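I want to read both text files, find the fields that are missing from each one, fill in those missing fields with " " in both files, and save the results to two new, modified files. How can I do this?
A1.txt is the modified version of A.txt
cat A1.txt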
10,1,1,"ABC"
10,1,2,"S1"
10,1,2,"ABC"
10,1,3,"baba"
10,2,1,"S2"
10,2,1,"asd"
10,2,2,"S3"
10,2,2,"dkkd"
10,2,3,"ABC"
10,2,4," "
B1.txt is the modified version of B.txt
cat B1.txt
10,1,1,"ABC1"
10,1,2,"S1"
10,1,2,"ABC"
10,1,3,"baba"
10,2,1," "
10,2,1,"asd"
10,2,2,"S3"
10,2,2,"dkkd"
10,2,3," "
10,2,4,"bokaj"
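The total number of lines in A1.txt must be the same as in B1.txt. I am new to bash, so your answers and explanations would help me learn a lot.
Here is the MWE I have tried so far: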
#!/bin/bash
cut -d ',' -f1,2,3 A.txt > A1.txt
cut -d ',' -f1,2,3 B.txt > B1.txt
## Command to print contents which are in B1.txt but not in A1.txt
A=`awk 'NR==FNR{a[$0];next} !($0 in a)' A1.txt B1.txt`
echo $A,'" "' >> A.txt
sort A.txt
## Command to print contents which are in A1.txt but not in B1.txt
B=`awk 'NR==FNR{a[$0];next} !($0 in a)' B1.txt A1.txt`
echo $B,'" "' >> B.txt
sort B.txt
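Answer 1
Perhaps diff and then sort can come in handy here.
With the A.txt and B.txt files, and their respective companion files A1.txt and B1.txt, already set up as in your script, run the following: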
diff --unchanged-line-format= --old-line-format= --new-line-format='%l," "'%c\'\\12\' A1.txt B1.txt | sort -st , -k 1,3 A.txt -
and:
diff --unchanged-line-format= --old-line-format= --new-line-format='%l," "'%c\'\\12\' B1.txt A1.txt | sort -st , -k 1,3 B.txt -
These should produce the output you describe.
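To save the results to files instead of printing them, the two pipelines above could be wrapped roughly as follows. This is only a sketch under a few assumptions: it needs bash and GNU diff (the --*-line-format options are GNU extensions), it recreates the sample data in a scratch directory, and the names A.keys, B.keys and fill_missing are hypothetical, chosen so that A1.txt and B1.txt stay free for the final output. LC_ALL=C on sort is also an addition, there only to make the ordering independent of the locale.

```shell
#!/bin/bash
# Sketch only: assumes GNU diff and bash; works in a scratch directory.
work=$(mktemp -d)

cat > "$work/A.txt" <<'EOF'
10,1,1,"ABC"
10,1,2,"S1"
10,1,2,"ABC"
10,1,3,"baba"
10,2,1,"S2"
10,2,1,"asd"
10,2,2,"S3"
10,2,2,"dkkd"
10,2,3,"ABC"
EOF

cat > "$work/B.txt" <<'EOF'
10,1,1,"ABC1"
10,1,2,"S1"
10,1,2,"ABC"
10,1,3,"baba"
10,2,1,"asd"
10,2,2,"S3"
10,2,2,"dkkd"
10,2,4,"bokaj"
EOF

# Key files hold only the first three comma-separated fields
# (A.keys / B.keys are hypothetical names).
cut -d, -f1-3 "$work/A.txt" > "$work/A.keys"
cut -d, -f1-3 "$work/B.txt" > "$work/B.keys"

# fill_missing OWN_KEYS OTHER_KEYS OWN_DATA   (hypothetical helper)
# diff prints only the keys it reports as new in OTHER_KEYS, each with ," "
# appended; sort then merges those placeholder rows into OWN_DATA by key.
fill_missing() {
    diff --unchanged-line-format= --old-line-format= \
         --new-line-format=$'%l," "\n' "$1" "$2" |
        LC_ALL=C sort -s -t, -k1,3 "$3" -
}

fill_missing "$work/A.keys" "$work/B.keys" "$work/A.txt" > "$work/A1.txt"
fill_missing "$work/B.keys" "$work/A.keys" "$work/B.txt" > "$work/B1.txt"

# Both files should now have the same number of lines.
wc -l "$work/A1.txt" "$work/B1.txt"
```

With the sample data, both output files end up with 10 lines. A placeholder row lands after any existing row with the same key, because sort -s keeps the input order for ties and the data file is listed before the piped input.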
Answer 2
grep -vFf B.txt A.txt | sed 's/".*"/" "/' | sort -st, -k1,3 - B.txt
Result (B1.txt):
10,1,1," "
10,1,1,"ABC1"
10,1,2,"S1"
10,1,2,"ABC"
10,1,3,"baba"
10,2,1," "
10,2,1,"asd"
10,2,2,"S3"
10,2,2,"dkkd"
10,2,3," "
10,2,4,"bokaj"
The first line differs from your example, but I think it should be there, because ABC is not the same as ABC1.
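The extra first line appears because grep -vFf compares whole lines, so a row whose text differs counts as missing even when its first three fields match. If only those three fields should decide what is missing, one possible alternative is to count keys with awk. The sketch below is not part of either answer: pad_by_key is a hypothetical helper name, the sample data is recreated in a scratch directory, and LC_ALL=C is there only to keep the sort order independent of the locale.

```shell
#!/bin/bash
# Sketch only: compares the first three fields, not whole lines.
work=$(mktemp -d)

cat > "$work/A.txt" <<'EOF'
10,1,1,"ABC"
10,1,2,"S1"
10,1,2,"ABC"
10,1,3,"baba"
10,2,1,"S2"
10,2,1,"asd"
10,2,2,"S3"
10,2,2,"dkkd"
10,2,3,"ABC"
EOF

cat > "$work/B.txt" <<'EOF'
10,1,1,"ABC1"
10,1,2,"S1"
10,1,2,"ABC"
10,1,3,"baba"
10,2,1,"asd"
10,2,2,"S3"
10,2,2,"dkkd"
10,2,4,"bokaj"
EOF

# pad_by_key OWN OTHER   (hypothetical helper)
# Prints every row of OWN, plus one ," " row for each time a key
# (fields 1-3) occurs more often in OTHER than in OWN, sorted by key.
pad_by_key() {
    awk -F, '
        { key = $1 FS $2 FS $3 }
        FILENAME == ARGV[1] { own[key]++; print; next }  # OWN: count and pass through
        { other[key]++ }                                 # OTHER: count only
        END {
            for (key in other)
                for (n = own[key] + 0; n < other[key]; n++)
                    print key ",\" \""
        }
    ' "$1" "$2" | LC_ALL=C sort -s -t, -k1,3
}

pad_by_key "$work/A.txt" "$work/B.txt" > "$work/A1.txt"
pad_by_key "$work/B.txt" "$work/A.txt" > "$work/B1.txt"

# Both files should now have the same number of lines.
wc -l "$work/A1.txt" "$work/B1.txt"
```

On the sample data this gives 10 lines in each file and no 10,1,1," " row in B1.txt, since the key 10,1,1 occurs once in both inputs.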