Arithmetic between 2 files to generate a series of new files (part 2)

This is a more specific follow-up to my previous question (Arithmetic between 2 files to generate a series of new files).

I have a tab-delimited model input file that I want to modify for an ensemble analysis. Its format is similar to this:

cat input.txt

/* Preciptation in mm */
10 30 40 50 23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa 2 0.5 1
abies_grandis 2.5 0.4 1
larix_occidentalis 1.5 0.3 1

I have a second file of multipliers, drawn at random from a distribution, one per line, like this:

cat multipliers.txt

0.1
0.5
0.25

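I want to generate a series of new input files in which one field (wsg) is multiplied by a single multiplier from the second file. In this example there would be 3 new files, one for each of the 3 multipliers (the real analysis will involve 1000 multipliers). The output files would look like this: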

(wsg * 0.1)
cat file1.txt

/* Preciptation in mm */
10 30 40 50 23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa 2 0.05 1
abies_grandis 2.5 0.04 1
larix_occidentalis 1.5 0.03 1

(wsg * 0.5)
cat file2.txt

/* Preciptation in mm */
10 30 40 50 23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa 2 0.25 1
abies_grandis 2.5 0.2 1
larix_occidentalis 1.5 0.15 1

(wsg * 0.25)
cat file3.txt

/* Preciptation in mm */
10 30 40 50 23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa 2 0.125 1
abies_grandis 2.5 0.1 1
larix_occidentalis 1.5 0.075 1

In response to my previous question, @EdMorton suggested the following:

$ cat tst.awk
NR==FNR {
    if ( pastHdr ) {
        ++numLines
        wsg[numLines] = $NF
        sub(/[[:space:]][^[:space:]]+$/,"")
        rest[numLines] = $0
    }
    else {
        hdr = hdr $0 ORS
        if ( $1 == "***" ) {
            pastHdr = 1
        }
    }
    next
}
{
    out = "file" FNR ".txt"
    printf "%s", hdr > out
    for (lineNr=1; lineNr<=numLines; lineNr++) {
        print rest[lineNr], wsg[lineNr] * $0 > out
    }
    close(out)
}
$ awk -f tst.awk input.txt multipliers.txt

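That was a great solution to my previous question, but it performs the arithmetic on the last field of each input line. I would like to modify it to operate on the nth field of each line, in this case the third (wsg).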

Answer 1

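Having just re-read your question, I'd actually suggest you do the following, which doesn't rely on you telling it which field is the wsg one but instead reads that information from the line in the input file that starts with ***: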

$ cat tst.awk
BEGIN { FS=OFS="\t" }
NR==FNR {
    if ( tgtFldNr ) {
        lines[++numLines] = $0
    }
    else {
        hdr = hdr $0 ORS
        if ( /^\*\*\*/ ) {      # in case this line is not tab-separated
            split($0,f," ")
            for (i in f) {
                if ( f[i] == "wsg" ) {
                    tgtFldNr = i-1      # the header has a leading "***" token that the data rows lack
                    break
                }
            }
        }
    }
    next
}
{
    mult = $1
    out = "file" FNR ".txt"
    printf "%s", hdr > out
    for (lineNr=1; lineNr<=numLines; lineNr++) {
        $0 = lines[lineNr]
        $tgtFldNr *= mult
        print > out
    }
    close(out)
}

$ awk -f tst.awk input.txt multipliers.txt

$ head file*
==> file1.txt <==
/* Preciptation in mm */
10      30      40      50      23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa        2       0.05    1
abies_grandis   2.5     0.04    1
larix_occidentalis      1.5     0.03    1

==> file2.txt <==
/* Preciptation in mm */
10      30      40      50      23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa        2       0.25    1
abies_grandis   2.5     0.2     1
larix_occidentalis      1.5     0.15    1

==> file3.txt <==
/* Preciptation in mm */
10      30      40      50      23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa        2       0.125   1
abies_grandis   2.5     0.1     1
larix_occidentalis      1.5     0.075   1
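
The header lookup is the only non-obvious step in that script, so here is a minimal sketch of it in isolation. The helper name find_field is made up for illustration and is not part of the script above. It assumes, as in your sample, that the header line starts with *** and that this extra leading token has no counterpart in the data rows, which is why the position is reduced by 1.

```shell
# find_field NAME: hypothetical helper, reads input containing a "***" header
# line on stdin and prints the data-row field number of the column NAME.
find_field() {
    awk -v name="$1" '
    /^\*\*\*/ {
        n = split($0, f, " ")          # " " splits on any run of blanks/tabs
        for (i = 1; i <= n; i++) {
            if ( f[i] == name ) {
                print i - 1            # skip the leading "***" token
                exit
            }
        }
    }'
}

printf '%s\n' '*** sp_name LMA wsg a_h' | find_field wsg   # prints 3
```

This sketch uses a counted loop rather than for (i in f) so the fields are visited in order, but either works here since the column name appears only once.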

Original answer, kept for posterity:

$ cat tst.awk
NR==FNR {
    if ( pastHdr ) {
        lines[++numLines] = $0
    }
    else {
        hdr = hdr $0 ORS
        if ( $1 == "***" ) {
            pastHdr = 1
        }
    }
    next
}
{
    tgtFldNr = (n ? n : NF)
    mult = $1
    out = "file" FNR ".txt"
    printf "%s", hdr > out
    for (lineNr=1; lineNr<=numLines; lineNr++) {
        $0 = lines[lineNr]
        $tgtFldNr *= mult
        print > out
    }
    close(out)
}

$ awk -v n=3 -f tst.awk input.txt multipliers.txt

$ head file*
==> file1.txt <==
/* Preciptation in mm */
10 30 40 50 23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa 2 0.05 1
abies_grandis 2.5 0.04 1
larix_occidentalis 1.5 0.03 1

==> file2.txt <==
/* Preciptation in mm */
10 30 40 50 23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa 2 0.25 1
abies_grandis 2.5 0.2 1
larix_occidentalis 1.5 0.15 1

==> file3.txt <==
/* Preciptation in mm */
10 30 40 50 23

### Species description
*** sp_name LMA wsg a_h
abies_lasiocarpa 2 0.125 1
abies_grandis 2.5 0.1 1
larix_occidentalis 1.5 0.075 1

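If you don't set -v n=<number> to the number of the field to be multiplied, then it defaults to multiplying the last field on each line, as you wanted in your previous question.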
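
To see that default in action without generating any files, here is a small sketch applied to a single data row. The wrapper name scale_field is hypothetical; the awk body is the same (n ? n : NF) test used in the script above.

```shell
# scale_field MULT [N]: hypothetical helper, multiplies field N of each stdin
# line by MULT; when N is omitted it falls back to the last field (NF).
scale_field() {
    awk -v mult="$1" -v n="${2:-}" '{
        tgtFldNr = (n ? n : NF)    # an empty n is false, so NF is used
        $tgtFldNr *= mult
        print
    }'
}

echo 'abies_lasiocarpa 2 0.5 1' | scale_field 0.1 3   # abies_lasiocarpa 2 0.05 1
echo 'abies_lasiocarpa 2 0.5 1' | scale_field 0.1     # abies_lasiocarpa 2 0.5 0.1
```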

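You said in the text of your question that your input is tab-separated, but it doesn't look that way in the sample you provided. If it really is tab-separated, then just add BEGIN { FS=OFS="\t" } to the top of the script, immediately before the NR==FNR { line.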
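
As a rough illustration of why both FS and OFS matter, the sketch below scales one field of a tab-separated row. The helper name scale_tsv is made up for this example. Assigning to a field makes awk rebuild the record using OFS, so setting only FS would turn your tabs into single spaces in the output.

```shell
# scale_tsv MULT N: hypothetical helper, multiplies field N of each
# tab-separated stdin line by MULT and keeps the tabs in the output.
scale_tsv() {
    awk -v mult="$1" -v n="$2" '
    BEGIN { FS = OFS = "\t" }      # split on tabs and rejoin with tabs
    { $n *= mult; print }'
}

printf 'abies_grandis\t2.5\t0.4\t1\n' | scale_tsv 0.5 3
```

With the tab separator in place, a value containing a space (for example a species name written as two words) stays in a single field instead of shifting the field numbers.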
