Using awk to print data from a file into a template

I have a file with some information (A.txt; sep="\t"; each line starts with "\t", so the first column is empty):

    Well    Fluor   Target  Content Sample  Cq  SQ
    A01 Cy5 EC  Unkn-01 205920777.1 25.714557922167 NaN
    A01 FAM Covid   Unkn-01 205920777.1 21.6541150578409    NaN
    A02 Cy5 EC  Unkn-09 neg5    25.5068289526473    NaN
    A02 FAM Covid   Unkn-09 neg5    NaN NaN 
    A07 Cy5 EC  Unkn-49     NaN NaN
    A07 FAM Covid   Unkn-49     NaN NaN

I have a template (B.txt; sep=","):

kit
Software Version =
Date And Time of Export =
Experiment Name =
Instrument Software Version =
Instrument Type = CFX
Instrument Serial Number =
Run Start Date =
Run End Date =
Run Operator =
Batch Status = VALID
Method = Novaprime
Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis
,,,,,,,,,,
*reporting.

I want to use the template B.txt to put the information from A.txt into C.txt. C.txt should look like this:

kit
Software Version =
Date And Time of Export =
Experiment Name =
Instrument Software Version =
Instrument Type = CFX
Instrument Serial Number =
Run Start Date =
Run End Date =
Run Operator =
Batch Status = VALID
Method = Novaprime
Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis
,,205920777.1,A01,Unkn-01
,,neg5,A02,Unkn-09
,,,,,,,,,,
*reporting.

The trick is to print only the rows of A.txt whose 5th column is not empty. I tried the following:

awk 'NR==FNR{a[$5]=$1;next}{print $1,$2,a[$1]} ' A.txt B.txt > C.txt

But it doesn't work, because B.txt has no matching key. The different separators are also a problem. Does anyone have an idea?

Thanks

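Answer 1

Assuming every line of your file starts with an empty first column, as you say, all field numbers shift by one: what you call the 5th field is actually the 6th. In any case, the simplest approach I can think of is to first convert your A.txt into a format you can use: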

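$ awk -F'\t' -v OFS="," '(NR>1 && $6!="" && $6!="NaN"){print ",",$6,$2,$5}' A.txt | sort | uniq
,,205920777.1,A01,Unkn-01
,,neg5,A02,Unkn-09

That should give you the lines you want to insert into C.txt (the $6!="" test also skips rows whose sample field is truly empty rather than NaN). To add them, you can do something inelegant like this:

( head -n 13 B.txt
  awk -F'\t' -v OFS="," '(NR>1 && $6!="" && $6!="NaN"){print ",",$6,$2,$5}' A.txt | sort | uniq
  tail -n +14 B.txt ) > C.txt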

This produces:

$ cat C.txt
kit
Software Version =
Date And Time of Export =
Experiment Name =
Instrument Software Version =
Instrument Type = CFX
Instrument Serial Number =
Run Start Date =
Run End Date =
Run Operator =
Batch Status = VALID
Method = Novaprime
Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis
,,205920777.1,A01,Unkn-01
,,neg5,A02,Unkn-09
,,,,,,,,,,
*reporting.
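
As an alternative sketch, not part of the original answer, the head/awk/tail steps can be folded into a single awk call that reads A.txt first and then copies B.txt, emitting the collected rows right after the column-header line. It assumes, as above, that every line of A.txt starts with a tab (so the sample is field 6) and that the header line begins with "Date And Time of Export,". The scratch directory and the shortened B.txt are made up for the demonstration. Unlike sort | uniq, it keeps the rows in input order.

```shell
# Recreate small versions of the question's files in a scratch directory
# (the directory and the shortened template are for illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A.txt: tab-separated, every line starts with an empty first field.
{
  printf '\tWell\tFluor\tTarget\tContent\tSample\tCq\tSQ\n'
  printf '\tA01\tCy5\tEC\tUnkn-01\t205920777.1\t25.714557922167\tNaN\n'
  printf '\tA01\tFAM\tCovid\tUnkn-01\t205920777.1\t21.6541150578409\tNaN\n'
  printf '\tA02\tCy5\tEC\tUnkn-09\tneg5\t25.5068289526473\tNaN\n'
  printf '\tA02\tFAM\tCovid\tUnkn-09\tneg5\tNaN\tNaN\n'
  printf '\tA07\tCy5\tEC\tUnkn-49\t\tNaN\tNaN\n'
  printf '\tA07\tFAM\tCovid\tUnkn-49\t\tNaN\tNaN\n'
} > A.txt

# B.txt: shortened template (header block, column header, filler, footer).
cat > B.txt <<'EOF'
kit
Method = Novaprime
Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis
,,,,,,,,,,
*reporting.
EOF

awk -F'\t' '
    # First file (A.txt): build one output row per distinct sample/well pair,
    # skipping the header line and rows whose sample is empty or NaN.
    NR == FNR {
        if (FNR > 1 && $6 != "" && $6 != "NaN") {
            row = ",," $6 "," $2 "," $5
            if (!(row in seen)) { seen[row] = 1; rows[++n] = row }
        }
        next
    }
    # Second file (B.txt): copy every line unchanged ...
    { print }
    # ... and emit the collected rows right after the column-header line.
    /^Date And Time of Export,/ {
        for (i = 1; i <= n; i++) print rows[i]
    }
' A.txt B.txt > C.txt

cat C.txt
```

Matching on the header text instead of counting lines with head and tail means the command keeps working if lines are added to or removed from the top of the template, as long as the column-header line itself is unchanged.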
    
