I have a table showing which ASVs (columns) are clustered into each OTU (rows). An ASV that belongs to an OTU is marked with the value 1.
#OTUID ASV_1 ASV_2 ASV_3 ASV_4 ASV_5 ASV_6 ASV_7 ASV_8 ASV_9 ASV_10
OTU1 1 0 0 1 0 0 0 0 0 1
OTU2 0 1 0 0 1 0 0 0 0 0
OTU3 0 0 0 0 0 1 0 1 1 0
I would like to summarize the table as follows:
#OTUID ASVs
OTU1 ASV_1, ASV_4, ASV_10
OTU2 ASV_2, ASV_5
OTU3 ASV_6, ASV_8, ASV_9
Any help would be appreciated.
Answer 1
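The following script assumes that, for each input line after the first (header) line, you want to print the names of all columns that have the value 1.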
#!/usr/bin/perl
use strict;
my @titles=();
while(<>) {
  if ($. == 1) {
    @titles = split;           # get column titles
    print "#OTUID\tASVs\n";    # print the new output header
    next;
  };
  chomp;
  my @F=split;   # split the input line into fields, store in array @F
  my @ASVs=();   # @ASVs array holds the titles for each matching field.
  foreach my $asv (1..$#F) {
    push @ASVs, $titles[$asv] if ($F[$asv] == 1);
  };
  print "$F[0]\t", join(",", @ASVs), "\n";
}
Save it as, e.g., alex.pl, make it executable with chmod +x alex.pl, and run it like this:
$ ./alex.pl input.txt
#OTUID ASVs
OTU1 ASV_1,ASV_4,ASV_10
OTU2 ASV_2,ASV_5
OTU3 ASV_6,ASV_8,ASV_9
Answer 2
$ perl -lane '$,="\t";
$. == 1 and do{ $h{$_} = $F[$_] for 1..$#F; print $F[0], "ASVs"; next; };
print $F[0], join ", ", map { $h{$_} } grep { $F[$_] == 1 } 1..$#F;
' file
Result:
#OTUID ASVs
OTU1 ASV_1, ASV_4, ASV_10
OTU2 ASV_2, ASV_5
OTU3 ASV_6, ASV_8, ASV_9