Counting the number of elements in a line with awk

Question 1

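awk 'NR>1 {                      # skip the header line
    count[$1,$2]++;              # occurrences of each (row, column) pair
    rows[$1]++;                  # distinct row labels
    cols[$2]++;                  # distinct column labels
}
END {
    printf("%3s", "");           # blank corner cell above the row labels
    for (col in cols) {
        printf("%4s", col);
    }
    printf("\n");
    for (row in rows) {
        printf("%3d", row);
        for (col in cols) {
            printf(" %3d", count[row,col]);   # a pair never seen prints as 0
        }
        printf("\n");
    }
}' data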

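Not necessarily efficient or elegant, but it should be fairly easy to read and it gets the job done. Note that the rows and columns are not necessarily printed in sorted order, because awk does not define the order in which `for (key in array)` visits the keys. The key idea is to use count[row,col] to simulate a multidimensional array, which awk does not support directly: the subscripts are joined with the SUBSEP character into a single string key. Searching Google for "awk multidimensional array" turns up several articles on the technique.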
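To make the simulated multidimensional array concrete, here is a minimal sketch that prints each stored key of count after splitting it back into its two subscripts with SUBSEP. The sample input and the names sample, pairs and part are made up for illustration; they do not come from the original data file.

```shell
# Hypothetical sample input: a header line, then "row column" pairs.
sample='id val
1 a
1 b
2 a
1 a'

# Each count[$1,$2] subscript is stored as the single string $1 SUBSEP $2.
pairs=$(printf '%s\n' "$sample" | awk 'NR>1 { count[$1,$2]++ }
END {
    for (key in count) {
        split(key, part, SUBSEP);          # recover the two subscripts
        printf("%s %s %d\n", part[1], part[2], count[key]);
    }
}' | sort)

printf '%s\n' "$pairs"
```

Piping through sort is only there to make the listing order predictable, since the `for (key in count)` loop itself gives no ordering guarantee.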

Question 2

Here is a Perl solution:

  perl -e '
    my (%col1, %col2);
    while (<>) {
        chomp;
        my @a = split(/\s+/);      ## split line on whitespace
        $col2{$a[1]}++;            ## collect unique values from the 2nd column
        $col1{$a[0]}{$a[1]}++;     ## count each (1st column, 2nd column) pair
    }
    my @l = sort keys %col2;
    $" = "\t"; ## list separator for interpolated arrays; tabs cope with variable-width values
    print "\t@l\n";
    foreach my $c1 (sort keys(%col1)) {   ## for each 1st-column value
        print "$c1\t";
        my $str;
        for (my $i = 0; $i <= $#l; $i++) {
            ## use the count for each position, or 0 if there is none
            $col1{$c1}{$l[$i]} = "0" unless defined($col1{$c1}{$l[$i]});
            $str .= "$col1{$c1}{$l[$i]}\t";
        }
        chop($str); ## remove the trailing \t
        print "$str\n";
    }' data   >ll
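
For comparison, the same sorted, tab-separated table can be sketched in plain awk. This is only an illustration under stated assumptions: the sample input is made up, sorted_keys is a hypothetical helper (a simple insertion sort, since POSIX awk has no built-in sort), and like the Perl version it treats every input line as data and sorts the labels as strings.

```shell
# Hypothetical sample input: "row column" pairs, no header line.
sample='1 a
1 b
2 a
1 a
3 c'

table=$(printf '%s\n' "$sample" | awk '
# Hypothetical helper: insertion-sort the indices of set[] into out[1..n]; returns n.
function sorted_keys(set, out,    k, n, i, tmp) {
    n = 0;
    for (k in set) {
        out[++n] = k;
        # Move the new key left while it compares (as a string) below its neighbour.
        for (i = n; i > 1 && ((out[i-1] "") > (out[i] "")); i--) {
            tmp = out[i]; out[i] = out[i-1]; out[i-1] = tmp;
        }
    }
    return n;
}
{ count[$1,$2]++; rows[$1]; cols[$2] }     # referencing rows[$1] creates the key
END {
    nrows = sorted_keys(rows, r);
    ncols = sorted_keys(cols, c);
    for (j = 1; j <= ncols; j++) printf("\t%s", c[j]);
    printf("\n");
    for (i = 1; i <= nrows; i++) {
        printf("%s", r[i]);
        for (j = 1; j <= ncols; j++) printf("\t%d", count[r[i],c[j]]);
        printf("\n");
    }
}')

printf '%s\n' "$table"
```

With the sample above, the header row lists a, b and c, and each following row starts with its label followed by one count per column, with 0 where a pair never occurred.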
