Selectively merging file contents

Answer 1

You can use perl to build a hash from the lines of the second file:

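#!/usr/bin/perl -w

use strict;

# Read newline-terminated records and append a newline to every print (like perl -l)
BEGIN{ $/ = $\ = "\n"; }

# The first argument is the file listing the strings to remove (file2 below)
my $stringsfile = shift @ARGV;
open(my $fh, '<:encoding(UTF-8)', $stringsfile)
  or die "Could not open file '$stringsfile' $!";

# Each line of that file becomes a key of %h
my %h;

while (defined($_ = <$fh>)) {
    chomp $_;
    $h{$_} = 1;
}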

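Then split the lines of the first (and any subsequent) file into underscore-separated fields, use grep to pick out the fields that are not in the hash, join the first field and the surviving fields back together, and print the result only if grep returned anything: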

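while (defined($_ = <ARGV>)) {
    chomp $_;
    # $x is the first field, which is always kept; @F holds the remaining fields
    my ($x, @F) = split(/_/, $_, 0);
    # Keep only the fields that are not keys of %h
    my @y = grep({not $h{$_};} @F);
    # Print the line only if at least one field survived
    print join('_', $x, @y) if @y;
}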

Usage:

$ ./foo.pl file2 file1
 1A00.pdb_HEM
 1A01.pdb_HEM
 1A05.pdb_IPM
 1A0F.pdb_GTS
 1A0G.pdb_PMP
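
To see the split, filter, and rejoin logic on input where a listed string sits in the middle of a line, here is a minimal sketch. It restates the Perl loop in awk purely so that the demo fits in one shell snippet; it is not the script above. The sample lines and the names `drop`, `out`, and `kept` are made up for illustration and do not come from the question.

```shell
#!/bin/sh
# Hypothetical sample data (not the data from the question).
tmp=$(mktemp -d)
printf '%s\n' HOH SO4 > "$tmp/file2"
printf '%s\n' 'a.pdb_HOH_HEM' 'b.pdb_HOH_SO4' 'c.pdb_HEM_HOH_IPM' > "$tmp/file1"

# Same idea as the Perl loop: keep field 1, drop every later field that
# appears as a line of file2, and print only if some field survived.
filtered=$(awk '
  BEGIN { FS = "_" }
  NR == FNR { drop[$0]; next }          # first file: remember each line
  {
    out = $1; kept = 0
    for (i = 2; i <= NF; i++)
      if (!($i in drop)) { out = out "_" $i; kept++ }
    if (kept) print out
  }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$filtered"
rm -rf "$tmp"
```

With this input the snippet prints `a.pdb_HEM` and `c.pdb_HEM_IPM`. The line `b.pdb_HOH_SO4` is skipped because every field after the first is listed in file2.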

Note: if the potential matches are always at the end of the line, a simpler approach with awk is enough:

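awk '
  BEGIN{OFS=FS="_"}          # split and rejoin on underscores
  NR==FNR {a[$0]++; next}    # first file (file2): remember each line
  {while ($NF in a) NF--}    # drop trailing fields that are listed in file2
  NF>1 {print}               # print only if more than the first field is left
' file2 file1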

For the sample data in the question, both approaches produce the same output.
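
The restriction to trailing matches matters. The sketch below runs the same awk command on made-up data (not the data from the question) in which one line has a listed string before an unlisted one. It assumes an awk that rebuilds the record when NF is decremented, as gawk and mawk do.

```shell
#!/bin/sh
# Hypothetical sample data (not the data from the question).
tmp=$(mktemp -d)
printf '%s\n' HOH SO4 > "$tmp/file2"
printf '%s\n' 'a.pdb_HEM_HOH' 'b.pdb_HOH_HEM' 'c.pdb_HOH_SO4' > "$tmp/file1"

# Trailing-only removal: listed fields are stripped from the end of each
# line, and the loop stops at the first field that is not listed.
trimmed=$(awk '
  BEGIN{OFS=FS="_"}
  NR==FNR {a[$0]++; next}
  {while ($NF in a) NF--}
  NF>1 {print}
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$trimmed"
rm -rf "$tmp"
```

This prints `a.pdb_HEM` and `b.pdb_HOH_HEM`. The `HOH` in the second line survives because it is not at the end, which is the case where the split, grep, and join approach is needed instead.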
