I want to remove a duplicate entry from file1 if that entry also exists in file2.
Input file1:
x1 y1
x2 y2
x3 y3
x4 y4
y1 x1
x5 y5
y3 x3
x6 y6
x5 y5
Input file2:
y1 x1
y2 x2
y3 x3
y4 x4
x1 y1
y5 x5
x3 y3
y6 x6
x5 y5
Expected output:
x1 y1
x2 y2
x3 y3
x4 y4
x5 y5
x6 y6
I used the following shell script:
awk 'FNR==NR {
lines[NR,"col1"] = $1
lines[NR,"col2"] = $2
lines[NR,"line"] = $0
next
}
(lines[FNR,"col1"] != $1) {($1 in lines)
print lines[FNR,"line"]
next
}' file1.txt file2.txt
But it gives the following output:
x1 y1
x2 y2
x3 y3
x4 y4
y1 x1
x5 y5
y3 x3
x6 y6
Answer 1
First: the output you want should be:
y2 x2
y4 x4
y5 x5
y6 x6
because the other lines, such as "x3 y3" and "x1 y1", exist in both files.
To get the lines of file2 that do not appear in file1, you can simply run:
grep -v -f file1 file2
From the man page:
-v
--invert-match
Invert the sense of matching, to select non-matching lines. (-v is specified by POSIX.)
-f file
--file=file
Obtain patterns from file, one per line. The empty file contains zero patterns, and therefore matches nothing. (-f is specified by POSIX.)
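One caveat with the command above: by default grep treats every line of file1 as a regular expression and matches it anywhere in a line. A slightly stricter variant, sketched here as an assumption rather than as part of the original answer, adds -F (fixed strings) and -x (whole-line match), both of which are POSIX options. The scratch directory and the inline sample data are only there to make the sketch self-contained; mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Hypothetical scratch directory so the sketch does not touch real files.
tmp=$(mktemp -d)
printf '%s\n' 'x1 y1' 'x2 y2' 'x3 y3' 'x4 y4' 'y1 x1' 'x5 y5' 'y3 x3' 'x6 y6' 'x5 y5' > "$tmp/file1"
printf '%s\n' 'y1 x1' 'y2 x2' 'y3 x3' 'y4 x4' 'x1 y1' 'y5 x5' 'x3 y3' 'y6 x6' 'x5 y5' > "$tmp/file2"

# -F: treat each pattern as a fixed string, not a regular expression
# -x: a pattern only matches if it equals the whole line
# -v -f: print the lines of file2 that match none of the lines in file1
result=$(grep -v -x -F -f "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"

rm -rf "$tmp"
```

With the sample data from the question this prints the same four lines as listed above (y2 x2, y4 x4, y5 x5, y6 x6). The difference only shows up when the entries contain regex metacharacters or when one entry is a substring of another.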
Answer 2
Try this:
awk '{if($1>$2) print $2 " " $1; else print $0;}' file1.txt file2.txt | sort -u > out.txt
This will output:
x1 y1
x2 y2
x3 y3
x4 y4
x5 y5
x6 y6
awk simply puts the two columns of each line into alphabetical order, and sort -u (unique) then removes the duplicate lines.
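If the original line order of file1 should be kept instead of sorted output, the same idea can be done in awk alone. The following is a minimal sketch, not the answer's own method: it reads only file1, and the names key and seen are illustrative. It assumes two whitespace-separated, non-numeric fields per line, as in the sample data.

```shell
#!/bin/sh
# Hypothetical scratch directory holding a copy of the question's file1.
tmp=$(mktemp -d)
printf '%s\n' 'x1 y1' 'x2 y2' 'x3 y3' 'x4 y4' 'y1 x1' 'x5 y5' 'y3 x3' 'x6 y6' 'x5 y5' > "$tmp/file1.txt"

# Build a key with the two fields in alphabetical order, then print that
# key only the first time it is seen, so "y1 x1" counts as a repeat of "x1 y1".
result=$(awk '{ key = ($1 > $2) ? $2 " " $1 : $1 " " $2 }
              !seen[key]++ { print key }' "$tmp/file1.txt")
printf '%s\n' "$result"

rm -rf "$tmp"
```

For the sample file1 this yields x1 y1 through x6 y6, one pair per line, in order of first appearance. Because no sort step is involved, the input does not need to be sorted and the output order follows the input.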