I have two files, A.tsv and B.tsv:
A.tsv (field separator = \t):
Sample ID Internal Control Result Consensus
4686427 Pass Not Detected Not Available
4666275 Pass Detected Not Available
4666295 Pass Detected Available
4644444 Pass Detected Available
B.tsv (field separator = \t):
seqName clade substitutions deletions
4666295 A8A yes no
4666275 18A no yes
4686427 161 no yes
I want to merge these two files into a new file that looks like this:
Sample ID Internal Control Result Consensus clade substitutions deletions
4686427 Pass Not Detected Not Available 161 no yes
4666275 Pass Detected Not Available 18A no yes
4666295 Pass Detected Available A8A yes no
4644444 Pass Detected Available
I have written the following, but it prints neither the header nor the first line of the second file:
awk -F '\t' -v OFS="\t" 'NR==FNR{a[$1]=$0;next}{print $0,a[$1]}' B.tsv A.tsv > C.tsv
How can I do this correctly? Thanks.
PS: These are subsampled files; the real files have more rows and more columns.
Answer 1
Try:
awk 'BEGIN { FS=OFS="\t" }
NR==FNR { seq=$1; sub(/[^\t]*\t/,""); if(NR==1)hdr=$0; hold[seq]=$0; next }
FNR==1 { print $0, hdr; next }
($1 in hold) { print $0, hold[$1]; next }
{ print }' fileB fileA >fileC
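To check this command against the sample data from the question, the sketch below recreates both files with `printf` and runs the same awk program. It is only a test harness under a few assumptions: it writes A.tsv, B.tsv and C.tsv into the current directory (so run it in a scratch directory), and it uses the question's file names instead of fileA, fileB and fileC.

```shell
#!/bin/sh
# Recreate the question's sample files with real tab separators.
# printf reuses its format string for every group of four arguments.
printf '%s\t%s\t%s\t%s\n' \
    'Sample ID' 'Internal Control' 'Result' 'Consensus' \
    4686427 Pass 'Not Detected' 'Not Available' \
    4666275 Pass Detected 'Not Available' \
    4666295 Pass Detected Available \
    4644444 Pass Detected Available > A.tsv
printf '%s\t%s\t%s\t%s\n' \
    seqName clade substitutions deletions \
    4666295 A8A yes no \
    4666275 18A no yes \
    4686427 161 no yes > B.tsv

awk 'BEGIN { FS=OFS="\t" }
# First file (B.tsv): strip the ID column and remember the rest, keyed by ID.
NR==FNR { seq=$1; sub(/[^\t]*\t/,""); if(NR==1)hdr=$0; hold[seq]=$0; next }
# Header of A.tsv: append the saved header of B.tsv.
FNR==1 { print $0, hdr; next }
# IDs present in both files: append the saved B.tsv columns.
($1 in hold) { print $0, hold[$1]; next }
# IDs missing from B.tsv: print the A.tsv row unchanged.
{ print }' B.tsv A.tsv > C.tsv

cat C.tsv
```

The first rule removes the ID column from each B.tsv line before storing it, which is why the ID is not duplicated in the merged output and why the header of B.tsv can be appended without its `seqName` column.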
Answer 2
$ cat tst.awk
BEGIN { FS=OFS="\t" }
{ key = (FNR>1 ? $1 : RS) }
NR == FNR {
    $1 = ""
    map[key] = $0
    next
}
{ print $0 map[key] }
$ awk -f tst.awk B.tsv A.tsv
Sample ID Internal Control Result Consensus clade substitutions deletions
4686427 Pass Not Detected Not Available 161 no yes
4666275 Pass Detected Not Available 18A no yes
4666295 Pass Detected Available A8A yes no
4644444 Pass Detected Available
$ awk -f tst.awk B.tsv A.tsv | column -s$'\t' -t
Sample ID  Internal Control  Result        Consensus      clade  substitutions  deletions
4686427    Pass              Not Detected  Not Available  161    no             yes
4666275    Pass              Detected      Not Available  18A    no             yes
4666295    Pass              Detected      Available      A8A    yes            no
4644444    Pass              Detected      Available
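In the output above, the row for 4644444 has only four fields because that ID does not appear in B.tsv. If every row should carry the same number of fields (an assumption about the real data, not something the question asks for), one possible variation pads unmatched rows with empty fields. The sketch below reuses the header-key idea from tst.awk; the `pad` variable and the output name C_padded.tsv are hypothetical, and the script writes its files into the current directory.

```shell
#!/bin/sh
# Recreate the question's sample files with real tab separators.
printf '%s\t%s\t%s\t%s\n' \
    'Sample ID' 'Internal Control' 'Result' 'Consensus' \
    4686427 Pass 'Not Detected' 'Not Available' \
    4666275 Pass Detected 'Not Available' \
    4666295 Pass Detected Available \
    4644444 Pass Detected Available > A.tsv
printf '%s\t%s\t%s\t%s\n' \
    seqName clade substitutions deletions \
    4666295 A8A yes no \
    4666275 18A no yes \
    4686427 161 no yes > B.tsv

awk 'BEGIN { FS=OFS="\t" }
# Header lines share the key RS (a newline), which cannot occur inside a field.
{ key = (FNR>1 ? $1 : RS) }
NR == FNR {
    # From the header of B.tsv, build one empty field per non-ID column.
    if (FNR == 1) { pad = ""; for (i = 2; i <= NF; i++) pad = pad OFS }
    $1 = ""            # the rebuilt $0 now starts with a tab
    map[key] = $0
    next
}
# Append the stored columns, or the empty padding when the ID is unknown.
{ print $0 ((key in map) ? map[key] : pad) }' B.tsv A.tsv > C_padded.tsv

cat C_padded.tsv
```

With the sample data, the last row ends in three trailing tabs, so each line splits into seven tab-separated fields.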