答案1
这将打印预期的结果
cut -f 1-3,5 file.tsv | sort -u | cut -f 4 | sort | uniq -c | awk '{ print $2, $1; }'
解释:
cut -f 1-3,5 file.tsv
提取相关列 1、2、3、5
sort -u
获取唯一组合
cut -f 4
仅提取原始第 5 列值,该值现在位于第 4 列中 对
sort | uniq -c
唯一值进行排序和计数
awk '{ print $2 "\t" $1; }'
交换列并格式化输出
答案2
看起来你想要类似的东西
awk '{test[$5" "$1" "$2" "$3]++}END{for (t in test) print t}' file1 | cut -d' ' -f1 | sort | uniq -c
走过
test[$5" "$1" "$2" "$3]++ #populates an array with unique combinations of these fields
for (t in test) print t #print each unique array index (field combination) once to STDOUT
cut -d' ' -f1 #extract what was the original 5th field
sort #yes, yes OK @Bodo
uniq -c #count the number of times it appears
输出
2 abc
1 def
编辑
虽然承认败在@Bodo手中,但寻找可行awk
解决方案的决心仍然存在,所以我提供了这个丑陋的野兽......
awk '!test[$5" "$1" "$2" "$3]{out[$5]++;test[$5" "$1" "$2" "$3]++}
END
{for (o in out) print o, out[o]}' file1