在 awk 代码中使用 bash 变量

Question

在 awk 中，将视为$用于获取给定字段编号的值的运算符。Awk 变量与 C 变量一样，不需要使用取消引用$。

awk -F', *'\
    -v OFS=', ' \
    -v the_highest_POP="the_highest_POP" \
    -v the_lowest_dG="$the_lowest_dG" \
'
    # ...
    dG = sqrt((($3 - the_lowest_dG) / the_lowest_dG)^2 + (($2 - the_highest_POP) / the_highest_POP)^2)

我是否应该使用 bash（使用 while 一次处理所有 CSV）计算 ${the_highest_POP} 和 ${the_lowest_dG} ，将它们存储在外部变量中并提供给 awk，[...] 或者是否可以直接在 AWK 代码中执行所有步骤

这完全取决于some_method_to_cumpute_highest_POP_for_all_csvs_in_d

啊，我明白了。我没有仔细阅读问题。您将必须处理文件两次，一次用于查找最小/最大值，另一次用于进行 dG 计算。为了便于阅读，我将使用 2 个单独的 find/awk 调用来执行此操作：

# get the files you want to operate on.
# I'm assuming there are not hundreds of them.
mapfile -t cvsFiles < <(
    find . -maxdepth 1 -type d -name '*_*_*' |
    awk -F '[_/]' '!seen[$2]++ {print $2}')
)

# get the min/max values
read highestPOP lowestDG < <(
    awk -F ', ' '
        FNR == 1 {next}
        NR == 2 || $2 > pop {pop = $2}
        NR == 2 || $3 < dg  {dg  = $3}
        END {print pop, dg}
    ' "${csvFiles[@]}"
)

# do the calculations over all the files
awk -F', *'\
    -v OFS=', ' \
    -v the_highest_POP="$highestPOP" \
    -v the_lowest_dG="$lowestDG" \
'...' "${files[@]}"

Answer 1

在 awk 中，将视为$用于获取给定字段编号的值的运算符。Awk 变量与 C 变量一样，不需要使用取消引用$。

awk -F', *'\
    -v OFS=', ' \
    -v the_highest_POP="the_highest_POP" \
    -v the_lowest_dG="$the_lowest_dG" \
'
    # ...
    dG = sqrt((($3 - the_lowest_dG) / the_lowest_dG)^2 + (($2 - the_highest_POP) / the_highest_POP)^2)

我是否应该使用 bash（使用 while 一次处理所有 CSV）计算 ${the_highest_POP} 和 ${the_lowest_dG} ，将它们存储在外部变量中并提供给 awk，[...] 或者是否可以直接在 AWK 代码中执行所有步骤

这完全取决于some_method_to_cumpute_highest_POP_for_all_csvs_in_d

啊，我明白了。我没有仔细阅读问题。您将必须处理文件两次，一次用于查找最小/最大值，另一次用于进行 dG 计算。为了便于阅读，我将使用 2 个单独的 find/awk 调用来执行此操作：

# get the files you want to operate on.
# I'm assuming there are not hundreds of them.
mapfile -t cvsFiles < <(
    find . -maxdepth 1 -type d -name '*_*_*' |
    awk -F '[_/]' '!seen[$2]++ {print $2}')
)

# get the min/max values
read highestPOP lowestDG < <(
    awk -F ', ' '
        FNR == 1 {next}
        NR == 2 || $2 > pop {pop = $2}
        NR == 2 || $3 < dg  {dg  = $3}
        END {print pop, dg}
    ' "${csvFiles[@]}"
)

# do the calculations over all the files
awk -F', *'\
    -v OFS=', ' \
    -v the_highest_POP="$highestPOP" \
    -v the_lowest_dG="$lowestDG" \
'...' "${files[@]}"

在 awk 代码中使用 bash 变量

答案1

相关内容