How to print data from multiple files into a template with awk


I have a main file A.txt (field separator = \t):

Sample ID   Internal Control    Result  Consensus Sequence  Lane    Index Set   Index ID
2154686427  Pass    Detected    Not Available   1,2,3,4 1   UDP0001
2154666275  Pass    Detected    Not Available   1,2,3,4 1   UDP0002

Each sample has its own file containing the same metrics, for example 2154686427.mapping_metrics.csv and 2154666275.mapping_metrics.csv here (field separator = ,).

2154686427.mapping_metrics.csv:

MAPPING/ALIGNING SUMMARY,,Total input reads,5654101,100.00
MAPPING/ALIGNING SUMMARY,,Number of duplicate marked reads,5577937,98.65

2154666275.mapping_metrics.csv:

MAPPING/ALIGNING SUMMARY,,Total input reads,5651111,100.00
MAPPING/ALIGNING SUMMARY,,Number of duplicate marked reads,5511111,97.2

For each sample in A.txt, I would like to append the header ($3) and the corresponding value ($4) from its metrics file, like this:

Sample ID   Internal Control    Result  Consensus Sequence  Lane    Index Set   Index ID    Total input reads   Number of duplicate marked reads
2154686427  Pass    Detected    Not Available   1,2,3,4 1   UDP0001 5654101 5577937
2154666275  Pass    Detected    Not Available   1,2,3,4 1   UDP0002 5651111 5511111

Do you have an idea how to do this?

I searched for similar questions about matching files by name, but did not find any. Thanks.

Answer 1

awk -v OFS="\t" -F, '
  FS==","{
     hdr[FNR]=$3                         # save header in array
     id=FILENAME                         # work on a copy so FILENAME itself stays untouched
     sub(/.*\//, "", id)                 # remove parent path first (it may contain dots, e.g. ./)
     sub(/\..*/, "", id)                 # remove `.mapping_metrics.csv`, leaving the sample ID
     val[id]=val[id] OFS $4              # append value to array using tab as separator
     next
  }
  FNR==1{
    print $0 OFS hdr[1] OFS hdr[2]       # print header and new header fields
    next
  }
  { print $0 val[$1] }                   # print record with new values
' *.mapping_metrics.csv FS="\t" A.txt
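
The command above takes the new header names from row position (hdr[FNR]), so it relies on every metrics file listing its rows in the same order. Below is a minimal sketch of a variation that keys the values by metric name instead, run against a copy of the question's sample data in a scratch directory. The names workdir, merged, names, seen and val are illustrative, and the row order of the second sample file is swapped on purpose to show the difference; none of this comes from the original answer.

```shell
#!/bin/sh
# Recreate the question's sample data in a scratch directory (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'Sample ID\tInternal Control\tResult\tConsensus Sequence\tLane\tIndex Set\tIndex ID\n' > A.txt
printf '2154686427\tPass\tDetected\tNot Available\t1,2,3,4\t1\tUDP0001\n' >> A.txt
printf '2154666275\tPass\tDetected\tNot Available\t1,2,3,4\t1\tUDP0002\n' >> A.txt

printf 'MAPPING/ALIGNING SUMMARY,,Total input reads,5651111,100.00\n' > 2154666275.mapping_metrics.csv
printf 'MAPPING/ALIGNING SUMMARY,,Number of duplicate marked reads,5511111,97.2\n' >> 2154666275.mapping_metrics.csv
# Assumption for the demo: this file lists its rows in the opposite order.
printf 'MAPPING/ALIGNING SUMMARY,,Number of duplicate marked reads,5577937,98.65\n' > 2154686427.mapping_metrics.csv
printf 'MAPPING/ALIGNING SUMMARY,,Total input reads,5654101,100.00\n' >> 2154686427.mapping_metrics.csv

merged=$(awk -v OFS='\t' -F, '
  FS == "," {                      # true while the comma-separated metrics files are read
    id = FILENAME
    sub(/.*\//, "", id)            # strip any directory part
    sub(/\..*/, "", id)            # strip ".mapping_metrics.csv"
    if (!($3 in seen)) {           # remember each metric name once, in first-seen order
      seen[$3] = 1
      names[++n] = $3
    }
    val[id, $3] = $4               # value keyed by sample ID and metric name
    next
  }
  FNR == 1 {                       # header line of A.txt: append the metric names
    line = $0
    for (i = 1; i <= n; i++) line = line OFS names[i]
    print line
    next
  }
  {                                # data line of A.txt: append this sample values
    line = $0
    for (i = 1; i <= n; i++) line = line OFS val[$1, names[i]]
    print line
  }
' *.mapping_metrics.csv FS='\t' A.txt)

printf '%s\n' "$merged"
```

The column order follows the first metrics file that the shell glob hands to awk, and a sample without a metrics file simply gets empty cells.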

Answer 2

awk -F '\t' '
  BEGIN { OFS = FS; ORS = "" }
  NR==1 {
    h1 = "Total input reads"
    h2 = "Number of duplicate marked reads"
    print $0, h1, h2 RS
    next
  }
  {
    print
    FS = ","
    f = $1 ".mapping_metrics.csv"
    while ((getline < f) > 0)            # parenthesized: an unparenthesized getline < f > 0 can be parsed differently by different awks
      if ((h1==$3)||(h2==$3))
        print "", $4
    print RS
    close(f)
    FS = OFS
  }
' ./A.txt

This assumes that the last two header fields can be hard-coded.
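
If hard-coding the names inside the script is not desirable, one possible variation is to pass them in from the command line. The following is only a sketch under that assumption: the variable metrics (a comma-separated list) and the names workdir, out, want, got, row and c are made up for illustration. It also reads each metrics line into a variable with getline row < f, so $0 and FS of the main input are never touched.

```shell
#!/bin/sh
# Scratch copy of the question's sample data (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'Sample ID\tInternal Control\tResult\tConsensus Sequence\tLane\tIndex Set\tIndex ID\n' > A.txt
printf '2154686427\tPass\tDetected\tNot Available\t1,2,3,4\t1\tUDP0001\n' >> A.txt
printf '2154666275\tPass\tDetected\tNot Available\t1,2,3,4\t1\tUDP0002\n' >> A.txt
printf 'MAPPING/ALIGNING SUMMARY,,Total input reads,5654101,100.00\n' > 2154686427.mapping_metrics.csv
printf 'MAPPING/ALIGNING SUMMARY,,Number of duplicate marked reads,5577937,98.65\n' >> 2154686427.mapping_metrics.csv
printf 'MAPPING/ALIGNING SUMMARY,,Total input reads,5651111,100.00\n' > 2154666275.mapping_metrics.csv
printf 'MAPPING/ALIGNING SUMMARY,,Number of duplicate marked reads,5511111,97.2\n' >> 2154666275.mapping_metrics.csv

out=$(awk -F '\t' -v metrics='Total input reads,Number of duplicate marked reads' '
  BEGIN { OFS = FS; n = split(metrics, want, ",") }   # want[1..n] holds the requested names
  NR == 1 {                                           # header line: append the requested names
    line = $0
    for (i = 1; i <= n; i++) line = line OFS want[i]
    print line
    next
  }
  {
    line = $0
    f = $1 ".mapping_metrics.csv"                     # metrics file named after the sample ID
    split("", got)                                    # portable way to empty the array
    while ((getline row < f) > 0) {                   # read into row; $0 stays intact
      split(row, c, ",")
      got[c[3]] = c[4]                                # metric name -> value
    }
    close(f)
    for (i = 1; i <= n; i++) line = line OFS got[want[i]]
    print line
  }
' A.txt)

printf '%s\n' "$out"
```

Because the values are looked up by name, the output columns follow the order given in metrics rather than the row order inside each file.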
