How to merge data from two files with headers in awk


I have two files, A.tsv and B.tsv.

A.tsv (field separator = \t):

Sample ID   Internal Control    Result  Consensus   
4686427 Pass    Not Detected    Not Available
4666275 Pass    Detected    Not Available
4666295 Pass    Detected    Available
4644444 Pass    Detected    Available

B.tsv (field separator = \t):

seqName clade   substitutions   deletions
4666295 A8A yes no
4666275 18A no  yes
4686427 161 no  yes

I want to merge these two files into a new file that looks like this:

Sample ID   Internal Control    Result  Consensus   clade   substitutions   deletions   
4686427 Pass    Not Detected    Not Available   161 no  yes
4666275 Pass    Detected    Not Available   18A no  yes
4666295 Pass    Detected    Available   A8A yes no
4644444 Pass    Detected    Available

I have written this, but it does not print the header, nor the first line of the second file:

awk -F '\t' -v OFS="\t" 'NR==FNR{a[$1]=$0;next}{print $0,a[$1]}' B.tsv A.tsv > C.tsv

So how do I do this correctly? Thanks.

PS: I subsampled the files; the real files are larger in both rows and columns.

Answer 1

Try:

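awk 'BEGIN        { FS=OFS="\t" }
     # First file (fileB): strip the key column and store the rest, keyed by sample ID
     NR==FNR      { seq=$1; sub(/[^\t]*\t/,""); if(NR==1)hdr=$0; hold[seq]=$0; next }
     # Header of fileA: append the fileB header (minus its key column)
     FNR==1       { print $0, hdr; next }
     # Matching rows get the fileB columns appended; the rest print unchanged
     ($1 in hold) { print $0, hold[$1]; next }
                  { print }' fileB fileA >fileC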
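
For anyone who wants to try this against the sample data from the question, here is a minimal sketch. It assumes a POSIX shell whose printf reuses its format string for extra arguments; the file names sample_A.tsv, sample_B.tsv and merged.tsv are placeholders, not names from the question.

```shell
# Recreate the sample data from the question with real tab separators.
# sample_A.tsv, sample_B.tsv and merged.tsv are placeholder names.
printf '%s\t%s\t%s\t%s\n' \
    'Sample ID' 'Internal Control' 'Result' 'Consensus' \
    4686427 Pass 'Not Detected' 'Not Available' \
    4666275 Pass 'Detected' 'Not Available' \
    4666295 Pass 'Detected' 'Available' \
    4644444 Pass 'Detected' 'Available' > sample_A.tsv

printf '%s\t%s\t%s\t%s\n' \
    seqName clade substitutions deletions \
    4666295 A8A yes no \
    4666275 18A no yes \
    4686427 161 no yes > sample_B.tsv

# Same program as above; only the file names differ.
awk 'BEGIN        { FS=OFS="\t" }
     NR==FNR      { seq=$1; sub(/[^\t]*\t/,""); if(NR==1)hdr=$0; hold[seq]=$0; next }
     FNR==1       { print $0, hdr; next }
     ($1 in hold) { print $0, hold[$1]; next }
                  { print }' sample_B.tsv sample_A.tsv > merged.tsv

cat merged.tsv
```

The unmatched sample 4644444 comes through with only its four original columns, which matches the desired output in the question.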

Answer 2

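$ cat tst.awk
BEGIN { FS=OFS="\t" }
# Key each line by its first field; header lines share the key RS,
# which cannot occur inside a field
{ key = (FNR>1 ? $1 : RS) }
NR == FNR {
    # First file: blank the key column and save the remaining columns
    $1 = ""
    map[key] = $0
    next
}
# Second file: append the saved columns (empty when there is no match)
{ print $0 map[key] }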

$ awk -f tst.awk B.tsv A.tsv
Sample  ID      Internal Control        Result Consensus        clade   substitutions   deletions
4686427 Pass    Not Detected    Not Available   161     no      yes
4666275 Pass    Detected        Not Available   18A     no      yes
4666295 Pass    Detected        Available       A8A     yes     no
4644444 Pass    Detected        Available

$ awk -f tst.awk B.tsv A.tsv | column -s$'\t' -t
Sample   ID    Internal Control  Result Consensus  clade  substitutions  deletions
4686427  Pass  Not Detected      Not Available     161    no             yes
4666275  Pass  Detected          Not Available     18A    no             yes
4666295  Pass  Detected          Available         A8A    yes            no
4644444  Pass  Detected          Available
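
One detail that may not be obvious is why the last rule prints `$0 map[key]` with no comma between the two. The sketch below is only an illustration on a made-up three-column line (`k1`, `x`, `y` are hypothetical values); it relies on the standard awk behaviour that assigning to a field rebuilds `$0` using `OFS`.

```shell
# Assigning to a field makes awk rebuild $0 with OFS, so blanking $1
# leaves a leading tab in front of the remaining columns.
rest=$(printf 'k1\tx\ty\n' | awk 'BEGIN { FS=OFS="\t" } { $1 = ""; print }')
printf '[%s]\n' "$rest"

# Plain concatenation therefore puts exactly one tab at the join:
# two existing columns plus the two saved ones gives four fields.
joined=$(printf 'k1\tx\ty\n' | awk 'BEGIN { FS=OFS="\t" } { $1 = ""; print "a\tb" $0 }')
printf '%s\n' "$joined" | awk -F '\t' '{ print NF }'
```

The header line works the same way: both headers are stored and looked up under the key RS, so the B.tsv header, minus seqName, is appended to the A.tsv header.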
