How do I pass a string specifying the columns to print to awk?

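I have a file with a large number of space-separated columns. I want to print specific columns dynamically, chosen by some numeric criteria. For example: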

$ cols=$(for i in `seq 1 3`; do echo -n "\$$[$[i-1]*6+1],\$$[$[i-1]*6+2],\$$[$[i-1]*6+3],\$$[$[i-1]*6+4+66],\$$[$[i-1]*6+5+66],\$$[$[i-1]*6+6+66],"; done)

This gives me the columns I want to print:

$ echo ${cols%?}
$1,$2,$3,$70,$71,$72,$7,$8,$9,$76,$77,$78,$13,$14,$15,$82,$83,$84

When I pass this to awk as a string, I don't get what I want:

$ awk -v cols=${cols%?} '{print cols}' file-testawk | head -2
$1,$2,$3,$70,$71,$72,$7,$8,$9,$76,$77,$78,$13,$14,$15,$82,$83,$84
$1,$2,$3,$70,$71,$72,$7,$8,$9,$76,$77,$78,$13,$14,$15,$82,$83,$84

awk treats it as a string rather than as column identifiers.

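How can I pass a string of columns to awk so that they are recognized and printed correctly? I am looking for a simple, more or less one-line solution, for example something like this: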

cols=$(for i in `seq 1 3`; do echo -n "\$$[$[i-1]*6+1],\$$[$[i-1]*6+2],\$$[$[i-1]*6+3],\$$[$[i-1]*6+4+66],\$$[$[i-1]*6+5+66],\$$[$[i-1]*6+6+66],"; done); awk -v cols=${cols%?} '{print cols}' file-testawk > file.out

Answer 1

awk has no eval-like feature, but you can use a trick that combines awk's -f option (read the script from a file) with bash process substitution:

$ a="\$1,\$4"
$ echo "$a"
$1,$4
$ a="{print $a}"
$ echo "$a"
{print $1,$4}
$ awk -f <(echo "$a") <<<"one two three four five"
one four
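
Applied to the question's column list, the same idea of generating the awk program text in the shell might look like the sketch below. It is not part of the original answer: the helper name build_cols and the 84-field sample line are assumptions made for illustration, and the $(( )) arithmetic stands in for the obsolete $[ ] form used in the question. The sketch passes the generated text directly as the program argument instead of going through -f and process substitution, so it also runs in a plain POSIX shell.

```shell
#!/bin/sh
# build_cols N: hypothetical helper, prints "$1,$2,$3,$70,..." for N groups of six
build_cols() {
    list=""
    i=0
    while [ "$i" -lt "$1" ]; do
        for k in 1 2 3 70 71 72; do
            list="$list\$$((i * 6 + k)),"
        done
        i=$((i + 1))
    done
    printf '%s\n' "${list%,}"   # drop the trailing comma
}

cols=$(build_cols 3)
printf '%s\n' "$cols"

# Sample line with 84 numbered fields so that every requested column exists
line=$(awk 'BEGIN { for (i = 1; i <= 84; i++) printf "%d ", i }')

# The shell expands $cols inside the double quotes, so awk receives the
# program text {print $1,$2,...} and treats each $n as a field reference.
printf '%s\n' "$line" | awk "{print $cols}"
```

Because the column list becomes part of the program, this is only appropriate when the list is generated by your own script and not taken from untrusted input.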

Answer 2

Usage: ./pass_numbers_to_awk.sh. Explanations are in the comments.

#!/bin/bash

# Build a string of numbers that simulates the column numbers to select
for i in {1..2}; do
    for j in {1..3}; do
        num=$(( (i-1) * 6 + j ))
        #numbers separated by vertical bar symbol 
        string_of_numbers+="${num}|"
    done
done

# Pass awk a string such as "1|2|3|7|8|9|" with the trailing
# vertical bar "|" removed.
##
# The awk split function does the parsing - for details see
# 'man mawk | grep -A 3 split\(s,A,r\)'
##
# Then loop over the array and print the specified columns.

awk -v string_from_bash="${string_of_numbers%?}" '
BEGIN {
    num_of_cols = split(string_from_bash, array_of_columns, "|");
}
{
    for (i = 1; i <= num_of_cols; i++) {

        # Emit the separator only before the second and later fields,
        # so the line has no leading or trailing space
        sep = (i > 1) ? " " : ""

        printf "%s%s", sep, $array_of_columns[i];
    }
    printf "\n";
}' < input.txt

Create the input.txt file for testing: ./create_table.sh > input.txt

#!/bin/bash

for i in {A..O}; do
    for j in {1..10}; do
        echo -n "column_${j} "
    done
    echo
done
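
Since the question asks for something close to a one-liner, the split approach can be condensed as in the following sketch. The comma-separated list in cols and the single sample row are illustrative assumptions, not taken from the answer; the row has the same shape as a line of input.txt.

```shell
#!/bin/sh
# Field numbers only (no "$" signs), comma-separated; the values are an example
cols="1,2,3,7,8,9"

# One sample row in the same shape as input.txt: column_1 ... column_10
row=$(awk 'BEGIN { for (j = 1; j <= 10; j++) printf "column_%d ", j }')

# split() turns the list into an array once; each record then prints the
# listed fields, with a separator before every field except the first.
result=$(printf '%s\n' "$row" |
    awk -v cols="$cols" '
        BEGIN { n = split(cols, c, ",") }
        {
            for (i = 1; i <= n; i++)
                printf "%s%s", (i > 1 ? " " : ""), $(c[i])
            print ""
        }')
printf '%s\n' "$result"
```

Passing plain numbers with -v keeps the list as data, so awk never has to interpret shell-generated program text.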

Answer 3

awk is good at this kind of index arithmetic, so:

awk -v N=3 '
   {
   for ( i=1; i<= N; ++i )
      print $((i-1)*6+1), $((i-1)*6+2), $((i-1)*6+3), $((i-1)*6+4+66), $((i-1)*6+5+66), $((i-1)*6+6+66)
   }
' data.file

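The basic idea is that if you give awk a number stored in a variable i, awk can fetch the field corresponding to that number with $(i). Here i can also be an expression, as is the case above.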
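
Note that the loop above calls print once per group, so every input line produces N output lines. If all selected fields should stay on a single line, as in the column list from the question, one possible variant is sketched below. The 84-field sample line and the variable names sep and base are assumptions for illustration.

```shell
#!/bin/sh
# Sample line with 84 numbered fields (illustrative data, not from the question)
line=$(awk 'BEGIN { for (i = 1; i <= 84; i++) printf "%d ", i }')

result=$(printf '%s\n' "$line" |
    awk -v N=3 '
        {
            sep = ""
            for (i = 0; i < N; i++) {
                base = i * 6
                # offsets 1..3 pick the first block, 70..72 the block 66 columns later
                for (k = 1; k <= 3; k++)   { printf "%s%s", sep, $(base + k); sep = OFS }
                for (k = 70; k <= 72; k++) { printf "%s%s", sep, $(base + k); sep = OFS }
            }
            print ""
        }')
printf '%s\n' "$result"
```

Using printf with an explicit separator avoids the newline that print appends after each group.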
