How do I select a specific range of lines and count the occurrences of the first character for each unique second character?

Question

This Perl script will do what you want in a single pass:

#!/usr/bin/env perl
use strict;
use Getopt::Std;

## This hash will hold the options
my %opts;

## Read the options
getopts('t:s:e:',\%opts) || do { print "Invalid option\n"; exit(1); };

## Keep the temp file if the script is run 
## with -t
my $keep_temp_file=$opts{t}||undef;

## The temp file's file handle
my $tmp;
## The temp file
my $temp_file=`mktemp`;
chomp($temp_file);
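## Read the time range; both -s and -e are required.
## (Testing with defined() also lets -s 0 work.)
my $start=$opts{s};
my $end=$opts{e};
die("Usage: $0 -s START -e END [-t FILE] INPUT\n")
    unless defined($start) && defined($end);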


## Open the input file
open($tmp,'<',"$ARGV[0]")|| 
    die("Need an input file as the 1st argument: $!\n");

my ($time,$want);
my (%data,%letters);
## Read the input file
line:while (<$tmp>) {
    ## skip blank lines
    next if /^\s*$/;

    ## remove trailing newlines
    chomp;
    ## Is this line one of the start times?
    if (/^#(\d+)/) {
        if ($1>=$start && $1<=$end) {
            $time=$1;
            $want=1;
        } elsif ($1>=$end) {
            $want=0;
            last line;
        }
    }
    ## If we want this line, save it in
    ## the %data hash.
    if ($want==1) {
        ## Skip if this line is the one that has the time
        ## definition.
        next if /^#/;
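        ## Split the line into its first character and the rest.
        ## Skip lines too short to match, so that stale captures
        ## from an earlier match are never reused.
        next unless /^(.)(.+)/;
        $data{$time}{$2}=$1;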
        ## Save each letter seen
        $letters{$2}++;
    }  
}
## Once the file has been processed, create
## the temp file.
open($tmp,'>',$temp_file)|| 
    die("Could not open temp file $temp_file for writing: $!\n");

my @times=sort {$a <=> $b } keys(%data);
print $tmp " ";
printf $tmp "%6s", "$_" for @times;
print $tmp "\n";
foreach my $letter (sort keys(%letters)) {
    print $tmp "$letter " ;
    foreach my $time (@times) {
        defined $data{$time}{$letter} ? 
            printf $tmp "%6s","$data{$time}{$letter} " : printf $tmp "%6s","- ";
    }
    print $tmp "\n";
}
close($tmp);
## Process the tmp file to get your desired output
open(my $fh,'<',"$temp_file")|| 
    die("Could not open temp file $temp_file for reading: $!\n");
## Print the header
printf "%-7s%6s%10s\n",'name', 'count', 'x';
while (<$fh>) {
    ## Skip first line
    next if $.==1;

    ## Collect the columns
    my @foo=split(/\s+/);
    ## get the letter
    my $let=shift(@foo);
    my $c=0;
    ## Check if the first one is an x
    $c++ if $foo[0] eq 'x';
    ## Check the rest
    for (my $i=1;$i<=$#foo;$i++) {
        ## Get the previous position. This is complicated
        ## since you want to ignore the non [01x] characters
        my $prev="";
        for (my $k=$i-1; $k>-1; $k--) {
            if ($foo[$k]=~/^[01x]$/) {
                $prev=$foo[$k];
                last;
            }
        }
        ## If this is an x, increment c if the previous
        ## [01x] character was 0 or 1, or if there was none
        if ($foo[$i] eq 'x' && ($prev=~/^[01]$/ || $prev=~/^$/)) {
            $c++;
        }
    } 
    printf "%-7s%6s%10s\n", $let,$c,"x";
}
## If we want to keep the temp file, copy
## it to the file name given.
if ($keep_temp_file) {
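    ## List form: no shell is involved, so file names with
    ## spaces or shell metacharacters are passed through safely
    system('cp', $temp_file, $keep_temp_file);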
}
## else, delete it
else {
    unlink($temp_file);
}

If you save it as foo.pl, you can run it like this:

foo.pl -s 2 -e 30 -t 2-30.temp file

Use -s to set the start time and -e to set the end time. If you want to keep the temp file, give it a name with -t. Without -t, the temp file is deleted.
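To make the -s/-e selection easier to follow, here is a minimal, self-contained sketch of just that step. The sample lines and the helper name select_range are made up for illustration, and the sketch assumes, as the script does, that the #N time markers appear in ascending order.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

## Hypothetical sample input: "#N" lines start a time block,
## every other line is one value character followed by a name.
my @lines = ('#1', '0a', '#2', 'xa', '1b', '#30', 'xb', '#31', '1a');

## select_range is an illustrative helper, not part of foo.pl.
## It returns { time => { name => value } } for every block
## whose time satisfies $start <= time <= $end.
sub select_range {
    my ($start, $end, @input) = @_;
    my (%data, $time);
    for my $line (@input) {
        if ($line =~ /^#(\d+)/) {
            my $t = $1;
            ## Times are assumed to be ascending, so stop early
            last if $t > $end;
            ## Remember the time only if the block is in range
            $time = ($t >= $start) ? $t : undef;
            next;
        }
        ## Ignore lines that belong to a block outside the range
        next unless defined $time;
        ## First character is the value, the rest is the name
        next unless $line =~ /^(.)(.+)/;
        $data{$time}{$2} = $1;
    }
    return \%data;
}

my $data = select_range(2, 30, @lines);
for my $time (sort { $a <=> $b } keys %$data) {
    for my $name (sort keys %{ $data->{$time} }) {
        print "$time $name $data->{$time}{$name}\n";
    }
}
```

With the sample data this prints the entries from blocks #2 and #30 only; block #1 is skipped and reading stops at #31.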

On your example, it produces:

$ perl foo.pl -s 2 -e 30 -t aa file2
name    count         x
:()         1         x
:cg         1         x
b:cg*b      1         x
c           2         x
e           4         x
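The count column comes from the rule in the script's last loop: an x is counted when the nearest earlier 0, 1 or x in the same row is a 0 or a 1, or when there is no earlier one at all. Anything else, such as the - placeholder, is ignored. The following is a rough sketch of that rule on its own. The helper name count_x and the sample rows are invented for illustration, and it keeps a running "previous value" rather than scanning backwards as the script does, which should give the same result.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

## count_x is an illustrative helper, not part of foo.pl.
## Given the cells of one row, it counts every x that starts
## a new run of x's, ignoring cells that are not 0, 1 or x.
sub count_x {
    my @cells = @_;
    my ($count, $prev) = (0, '');
    for my $cell (@cells) {
        ## Skip placeholders such as '-'
        next unless $cell =~ /^[01x]$/;
        ## Count an x only if the last valid cell was not an x
        $count++ if $cell eq 'x' && $prev ne 'x';
        $prev = $cell;
    }
    return $count;
}

## Made-up rows: the second x in the first row continues a run
## across the '-' and is therefore not counted again.
print count_x(qw(x - x 0 x 1 - x)), "\n";
print count_x(qw(- - x)), "\n";
print count_x(qw(0 1 0)), "\n";
```

The three calls print 3, 1 and 0 respectively.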

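I answered this because it was an interesting problem and you are new here. Please note, however, that we are not a script-writing service. Questions asking for solutions this complex are off topic. We are happy to help you with specific problems, but we won't (usually) write entire scripts for you.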

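Next time, start writing something yourself and break down the problems you run into. Ask one specific question about each of them, and you can build up your script that way.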

Answer 1