How to parse and then paste onto the same line in awk

I have a file A.txt like this (field separator = ,):

Kit Batch Export
Software Version = NO_v1
Date And Time of Export =
Experiment Name =
Instrument Software Version =
Instrument Type = Cji
Instrument Serial Number =
Run Start Date =
Run End Date =
Run Operator =
Batch Status = VALID
Method = Nov
Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis,EC,CH
,novaprime-ct044032-TB_2034,2061571293,A01,Unkn-01,VALID,,,
,novaprime-ct044032-TB_2034,2061584371,A02,Unkn-09,VALID,,,

And a file B.txt (field separator = \t; the first column is empty):

    Well    Fluor   Target  Content Sample  Cq  SQ
    A01 Cy5 EC  Unkn-01 2060563935  26  NaN
    A02 Cy5 CH  Unkn-09 2060565055  37  NaN
    A01 Cy5 CH  Unkn-01 2060565888  54  NaN
    A02 Cy5 EC  Unkn-09 2060565465  NaN NaN

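For each row of B.txt, I want to add the value of its Cq column to A.txt, in the row and column that match its Well/Target pair (in this example: A01/EC; A01/CH; A02/EC; A02/CH), so that A.txt looks like this: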

Kit Batch Export
Software Version = NO_v1
Date And Time of Export =
Experiment Name =
Instrument Software Version =
Instrument Type = Cji
Instrument Serial Number =
Run Start Date =
Run End Date =
Run Operator =
Batch Status = VALID
Method = Nov
Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis,EC,CH
,novaprime-ct044032-TB_2034,2061571293,A01,Unkn-01,VALID,,,,26,54
,novaprime-ct044032-TB_2034,2061584371,A02,Unkn-09,VALID,,,,NaN,37

To do this, I tried:

awk -F"\t" 'FNR==NR{if (a[$2]) {a[$2]=a[$2] "," $7} else {a[$2]=$7}} NR>FNR{split($0,f,","); if (a[f[4]]) $0=$0 "," a[f[4]]; print}' B.txt A.txt > C.txt

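It sort of works, but it pastes each value in the order it is first encountered rather than according to whether it is an EC or a CH value. Do you have a different approach that does this correctly? Thanks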

Answer 1

As long as no commas can appear in the "header" lines, the following will work:

awk -F'\t' 'FNR==NR{if ($4=="EC") ec[$2]=$7; else if ($4=="CH") ch[$2]=$7; next}
            NR>FNR&&NF>1 {if (!f) f=1; else {$10=ec[$4]; $11=ch[$4];}}1' B.txt FS=',' OFS=',' A.txt

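This first parses B.txt and builds an "EC-to-Well" map and a "CH-to-Well" map (each keyed by the well and holding the Cq value), then uses them while parsing A.txt. We set the field separator to , for A.txt and make sure to process only lines that have more than one field (i.e. that contain at least one ,), while skipping the first such line, which holds the column headers.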
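For readability, the same logic can be laid out as a commented script. This is only a sketch under the assumptions visible in the question: B.txt is tab-separated with an empty first column (so Well is $2, Target is $4 and Cq is $7), and A.txt is comma-separated with EC and CH as columns 10 and 11. The sample_* file names and the variable name seen_header are hypothetical, and the input is a shortened copy of the data from the question.

```shell
# Shortened copies of the question's files, under hypothetical sample_* names.
printf '%s\n' \
  'Kit Batch Export' \
  'Batch Status = VALID' \
  'Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis,EC,CH' \
  ',novaprime-ct044032-TB_2034,2061571293,A01,Unkn-01,VALID,,,' \
  ',novaprime-ct044032-TB_2034,2061584371,A02,Unkn-09,VALID,,,' > sample_A.txt

# The format string is reused for every group of seven arguments;
# the leading \t reproduces the empty first column.
printf '\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  Well Fluor Target Content Sample Cq SQ \
  A01 Cy5 EC Unkn-01 2060563935 26 NaN \
  A02 Cy5 CH Unkn-09 2060565055 37 NaN \
  A01 Cy5 CH Unkn-01 2060565888 54 NaN \
  A02 Cy5 EC Unkn-09 2060565465 NaN NaN > sample_B.txt

awk -F'\t' '
    # First file (B): remember the Cq value ($7) per well ($2), one map per target ($4).
    FNR == NR {
        if ($4 == "EC") ec[$2] = $7
        else if ($4 == "CH") ch[$2] = $7
        next
    }
    # Second file (A): only the comma-separated lines have more than one field.
    NF > 1 {
        if (!seen_header) seen_header = 1      # first such line is the column header
        else { $10 = ec[$4]; $11 = ch[$4] }    # data row: fill the EC and CH columns
    }
    { print }
' sample_B.txt FS=',' OFS=',' sample_A.txt > sample_C.txt

cat sample_C.txt
```

The FS=',' and OFS=',' arguments sit between the two file names, so awk applies them only once it has finished reading the first file and is about to open the second.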

Update

Since you noted in a comment that B.txt may sometimes contain empty fields, and that you want those replaced with NaN, we need an additional check:

awk -F'\t' 'FNR==NR{if ($4=="EC") ec[$2]=$7; else if ($4=="CH") ch[$2]=$7; next}
            NR>FNR&&NF>1 {if (!f) f=1; else {$10=ec[$4]?ec[$4]:"NaN"; $11=ch[$4]?ch[$4]:"NaN";}}1' B.txt FS=',' OFS=',' A.txt

This is rather "golfed", but basically

$10=ec[$4] ? ec[$4] : "NaN"

means

if (ec[$4]) $10=ec[$4]; else $10="NaN"
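One caveat with a plain truth test: awk treats both the empty string and a numeric zero as false, so a Cq of 0 would also be turned into NaN. Whether a zero Cq can occur in your data is an assumption here; the question does not say. If it can, comparing against the empty string is a possible alternative. The sketch below runs both tests on three illustrative values (26, 0 and an empty line); the names v, truthy, nonempty and demo are hypothetical.

```shell
# Feed three sample values (26, 0, empty) and compare the two tests.
demo=$(printf '%s\n' 26 0 '' | awk '
    { v[NR] = $1 }
    END {
        for (i = 1; i <= NR; i++) {
            truthy   = v[i] ? v[i] : "NaN"           # test used in the answer
            nonempty = (v[i] != "") ? v[i] : "NaN"   # only an empty value becomes NaN
            print truthy, nonempty
        }
    }')
printf '%s\n' "$demo"
# 26 26
# NaN 0
# NaN NaN
```

For the data shown in the question the two tests give the same result, since no Cq value there is zero.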
