Cross-line comparison in Perl

Answer 1

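The tool you need for this is a hash, which is how Perl stores key-value pairs. Specifically, we need to preprocess your data into hashes so that we can "look up" the lowest value for a type, or whether an XXX occurs for it.

Fortunately, your third condition looks like a subset of the second: if you only print the lowest value, then when a type has just one row, that row is the lowest.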
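The point about the third condition can be checked in isolation. This is a minimal sketch, assuming made-up rows and a hypothetical helper named `lowest_per_type` (neither comes from the question): a type with a single row ends up with that row's value as its lowest, so the "only one entry" case needs no branch of its own.

```perl
use strict;
use warnings;

# Hypothetical helper: map each type to its lowest v2.
sub lowest_per_type {
    my @rows = @_;
    my %lowest;
    for my $row (@rows) {
        my ( $type, $v1, $v2 ) = @$row;
        $lowest{$type} = $v2
            if !defined $lowest{$type} or $v2 < $lowest{$type};
    }
    return %lowest;
}

# Made-up sample rows: Type4 has two entries, Type5 only one.
my %lowest = lowest_per_type(
    [ 'Type4', 'ABC', 55 ],
    [ 'Type4', 'DEF', 87 ],
    [ 'Type5', 'ABC', 99 ],
);

print "$_ => $lowest{$_}\n" for sort keys %lowest;
```

With these rows it prints `Type4 => 55` and `Type5 => 99`.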

So I would probably do it like this:

#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

#read header line, because we don't want to process it; 
#note - diamond operators are 'magic' file handles. 
#they read either piped input on STDIN, or 
#open/read files specified on command line. 
#this is almost exactly like how sed/grep work. 
my $header_line = <>;
#turn the rest of our input into an array of arrays, split on whitespace/linefeeds. 
my @lines = map { [split] } <>;

#print for diag
print Dumper \@lines;

#this hash tracks if we've 'seen' an XXX
my %skip_type;
#this hash tracks the lowest V2 value. 
my %lowest_v2_for;
foreach my $record (@lines) {
    #we could work with $record->[0], etc., but unpacking into
    #named variables is more readable. 
    my ( $type, $v1, $v2 ) = @$record;

    #find all the lines with "XXX" - store in a hash.
    if ( $v1 eq "XXX" ) {
        $skip_type{$type}++;
    }

    #check if this v2 is the lowest for this particular type. 
    #make a note if it is. 
    if ( not defined $lowest_v2_for{$type}
        or $lowest_v2_for{$type} > $v2 )
    {
        $lowest_v2_for{$type} = $v2;
    }
}

#print for diag - things we are skipping. 
print Dumper \%skip_type;


print $header_line;

#run through our list again, testing the various conditions:
foreach my $record (@lines) {
    my ( $type, $v1, $v2 ) = @$record;

    #skip if it's got an XXX. 
    next if $skip_type{$type};
    #skip if it isn't the lowest value
    next if $lowest_v2_for{$type} < $v2;
    #print otherwise.
    print join( " ", @$record ), "\n";
}

This gives (less some diagnostic output; feel free to drop the Dumper calls if you don't need them):

Name v1 v2 
Type4 ABC 55
Type5 ABC 99
Type6 DEF 00
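To try the approach without an input file, the same two passes can be wrapped in a function and fed in-memory rows. This is only a sketch: the sub name `filter_rows` and the sample input are assumptions for illustration, not data from the question.

```perl
use strict;
use warnings;

# Two-pass filter over already-split rows; returns the rows to keep.
sub filter_rows {
    my @rows = @_;
    my ( %skip_type, %lowest_v2_for );

    # Pass 1: note types that have an XXX row and the lowest v2 per type.
    for my $row (@rows) {
        my ( $type, $v1, $v2 ) = @$row;
        $skip_type{$type} = 1 if $v1 eq 'XXX';
        $lowest_v2_for{$type} = $v2
            if !defined $lowest_v2_for{$type} or $v2 < $lowest_v2_for{$type};
    }

    # Pass 2: keep rows of non-skipped types that carry the lowest v2.
    return grep {
        my ( $type, undef, $v2 ) = @$_;
        !$skip_type{$type} and $v2 <= $lowest_v2_for{$type};
    } @rows;
}

# Assumed sample input (not from the question).
my @input = map { [split] } split /\n/, <<'END';
Type1 ABC 32
Type1 XXX 11
Type4 ABC 55
Type4 DEF 87
Type5 ABC 99
END

print join( ' ', @$_ ), "\n" for filter_rows(@input);
```

With this input it prints `Type4 ABC 55` and `Type5 ABC 99`. Note that rows tying on the lowest v2 for a type are all kept, which matches the behaviour of the script above.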

Answer 2

My take:

perl -wE ' 
    # read the data 
    chomp( my $header = <> ); 
    my %data; 
    while (<>) { 
        chomp; 
        my @F = split; 
        $data{$F[0]}{$F[1]} = $F[2]; 
    } 

    # requirement 1 
    delete $data{Type1} if exists $data{Type1}{XXX}; 

    # requirement 2 
    if (exists $data{Type4}{ABC} and exists $data{Type4}{DEF}) { 
        if ($data{Type4}{ABC} <= $data{Type4}{DEF}) { 
            delete $data{Type4}{DEF}; 
        } 
        else { 
            delete $data{Type4}{ABC}; 
        } 
    } 

    # requirement 3 
    for my $name (qw/Type5 Type6/) { 
        delete $data{$name} unless ( 
            scalar keys %{$data{$name}} == 1 
            and (exists $data{$name}{ABC} or exists $data{$name}{DEF}) 
        ); 
    } 

    $, = " "; 
    say $header; 
    for my $name (sort keys %data) { 
        for my $v1 (sort keys %{$data{$name}}) { 
            say $name, $v1, $data{$name}{$v1}; 
        } 
    } 
' file

Output

Name v1 v2 
Type2 ABC 78
Type2 XXX 23
Type3 DEF 22
Type3 XXX 12
Type4 ABC 55
Type5 ABC 99
Type6 DEF 00

No requirement applies to Type2 and Type3, so their rows are printed unchanged.
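The one-liner stores each row as `$data{$name}{$v1} = $v2`, a hash of hashes. One thing to be aware of with that layout, sketched here with made-up keys: testing `exists` on an inner key autovivifies the outer key, so a type that never appeared in the input gains an empty entry. That does not change the output above, because an empty inner hash prints no rows.

```perl
use strict;
use warnings;

# Made-up data: only Type2 is present.
my %data = ( Type2 => { ABC => 78, XXX => 23 } );

# Checking an inner key of a missing type creates the outer key.
my $found    = exists $data{Type1}{XXX} ? 1 : 0;
my $vivified = exists $data{Type1}      ? 1 : 0;

# Guarding on the outer key first leaves the hash untouched.
my $guarded = ( exists $data{Type9} && exists $data{Type9}{XXX} ) ? 1 : 0;
my $created = exists $data{Type9} ? 1 : 0;

print "found=$found vivified=$vivified\n";
print "guarded=$guarded created=$created\n";
```

This prints `found=0 vivified=1` and then `guarded=0 created=0`.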

Answer 3

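There are three separate tasks. Each of them can be done with awk:

Stop printing Type1 lines once XXX is seen

$1 == "Type1" {if($2 == "XXX")f=1;if(! f)print}

Minimum value for Type4

$1 == "Type4" {if(!n++ || $3+0 < min+0)min = $3} END{print min}

Print the selected lines

$1$2 ~ "^(Type5|Type6)(ABC|DEF)$"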
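Since the question asks about Perl, here is a rough Perl counterpart of the three awk fragments. It is a sketch over made-up rows, and the variable names are mine rather than anything from the question.

```perl
use strict;
use warnings;

# Made-up rows for illustration (type, v1, v2).
my @rows = map { [split] } split /\n/, <<'END';
Type1 ABC 32
Type1 XXX 11
Type1 DEF 40
Type4 ABC 55
Type4 DEF 87
Type5 ABC 99
Type6 XXX 10
END

my ( $seen_xxx, $min, @type1, @selected );
for my $row (@rows) {
    my ( $type, $v1, $v2 ) = @$row;

    # Task 1: keep Type1 rows until the first XXX appears.
    if ( $type eq 'Type1' ) {
        $seen_xxx = 1 if $v1 eq 'XXX';
        push @type1, "@$row" unless $seen_xxx;
    }

    # Task 2: track the minimum v2 among Type4 rows.
    if ( $type eq 'Type4' ) {
        $min = $v2 if !defined $min or $v2 < $min;
    }

    # Task 3: select Type5/Type6 rows whose v1 is ABC or DEF.
    push @selected, "@$row"
        if "$type$v1" =~ /^(?:Type5|Type6)(?:ABC|DEF)$/;
}

print "Type1 kept: @type1\n";
print "Type4 min: $min\n";
print "Selected: @selected\n";
```

Using `defined` for the first-row check, rather than testing the value for truth, keeps a legitimate minimum of 0 from being overwritten.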
