How to merge or join more than two files?

I have four files:

File 1:

gene_id FPKM
TRINITY_DN56960_c4_g1 36.91
TRINITY_DN42427_c0_g1 12.83
TRINITY_DN48728_c0_g1 6.22
TRINITY_DN50706_c0_g2 6604.75
TRINITY_DN51449_c1_g1 8.19
TRINITY_DN48996_c1_g1 28.21

File 2:

gene_id FPKM
TRINITY_DN48728_c0_g1 6.05
TRINITY_DN50706_c0_g2 5176.34
TRINITY_DN51449_c1_g1 7.58
TRINITY_DN48996_c1_g1 16.47
TRINITY_DN42427_c0_g1 14.03
TRINITY_DN56960_c4_g1 80.91 

File 3:

gene_id FPKM
TRINITY_DN56960_c4_g1 90.91
TRINITY_DN42427_c0_g1 24.03
TRINITY_DN51449_c1_g1 6.58
TRINITY_DN48996_c1_g1 26.47
TRINITY_DN48728_c0_g1 7.05
TRINITY_DN50706_c0_g2 4176.34

File 4:

gene_id FPKM
TRINITY_DN50706_c0_g2 9176.34
TRINITY_DN56960_c4_g1 120.91
TRINITY_DN42427_c0_g1 34.03
TRINITY_DN48728_c0_g1 7.05
TRINITY_DN51449_c1_g1 7.58
TRINITY_DN48996_c1_g1 36.5

I want an output file like this:

gene_id               FPKM1 FPKM2 FPKM3 FPKM4
TRINITY_DN42427_c0_g1 12.83 14.03 24.03 34.03
TRINITY_DN48728_c0_g1 6.22 6.05 7.05 7.05
TRINITY_DN48996_c1_g1 28.21 16.47 26.47 36.5
TRINITY_DN50706_c0_g2 6604.75 5176.34 4176.34 9176.34
TRINITY_DN51449_c1_g1 8.19 7.58 6.58 7.58
TRINITY_DN56960_c4_g1 36.91 80.91 90.91 120.91

How can I do this with a script?

Answer 1

This uses sponge from Debian's moreutils package. No bashisms.
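sponge reads all of its input before it opens the output file, which is why a pipeline ending in `| sponge file` can safely rewrite a file that the same pipeline reads, whereas a plain `> file` redirect would truncate the file before it is read. Where moreutils is not installed, a temporary file gives roughly the same effect. The sketch below is only an illustration of that idea: `in_place` is a hypothetical helper name, not part of moreutils, and it assumes `mktemp` is available.

```shell
#!/bin/sh
# in_place FILE CMD [ARGS...]   (hypothetical helper, not part of moreutils)
# Runs CMD with FILE on stdin and writes the result back to FILE only after
# CMD has finished, roughly what `CMD < FILE | sponge FILE` does.
in_place() {
    file=$1
    shift
    tmp=$(mktemp) || return 1
    # Write to the temporary file first; copy back only if CMD succeeded.
    "$@" < "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, `in_place file1 sort` would sort file1 in place. Copying the contents back with cat, rather than moving the temporary file over the original, keeps the original file's permissions.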

Back up the four input files if needed (this changes them in place), then:

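# number the FPKM header of each file (file1 -> FPKM1, ...), then sort each file in place
# (assumes a locale such as en_US.UTF-8 in which the "gene_id" header sorts first;
# in the C locale it would sort after the TRINITY_ lines)
for f in file* ; do
    sed '1s/.*/&'"$f"'/;s/file//' "$f" | sort | sponge "$f"
done
# join the sorted files pairwise into 'fileN':
# file3+file4, then file2+fileN, then file1+fileN
for f in 34 N2 N1 ; do join --header file[$f] | sponge fileN ; done
# pad the header so its columns line up with the data, in place
sed -i '1s/d/d              /' fileN
cat fileN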

Output:

gene_id               FPKM1 FPKM2 FPKM3 FPKM4
TRINITY_DN42427_c0_g1 12.83 14.03 24.03 34.03
TRINITY_DN48728_c0_g1 6.22 6.05 7.05 7.05
TRINITY_DN48996_c1_g1 28.21 16.47 26.47 36.5
TRINITY_DN50706_c0_g2 6604.75 5176.34 4176.34 9176.34
TRINITY_DN51449_c1_g1 8.19 7.58 6.58 7.58
TRINITY_DN56960_c4_g1 36.91 80.91 90.91 120.91
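If the input files should stay untouched, a similar table can be built by sorting copies of the data and joining them one file at a time. The following is only a sketch under stated assumptions, not the method used above: `merge_fpkm` is a hypothetical helper name, the files are assumed to be whitespace-separated with exactly one header line and unique gene IDs, the `gene_id` header text is hard-coded, and `mktemp -d` is assumed to be available. It forces the C locale so that sort and join agree on ordering.

```shell
#!/bin/sh
# merge_fpkm FILE...   (hypothetical helper, not from the answer above)
# Prints one table keyed on column 1 with one FPKM column per input file.
# The input files are only read, never rewritten.
merge_fpkm() (
    export LC_ALL=C                  # sort and join must use the same collation
    tmp=$(mktemp -d) || exit 1       # scratch directory for sorted copies

    # Start from the sorted gene IDs of the first file (header line skipped).
    awk 'NR > 1 { print $1 }' "$1" | sort > "$tmp/merged"
    header='gene_id'                 # assumed name of the key column
    n=1
    for f in "$@"; do
        # Sort this file's data lines on column 1, then join them on that column.
        tail -n +2 "$f" | sort -k1,1 > "$tmp/sorted"
        join "$tmp/merged" "$tmp/sorted" > "$tmp/next"
        mv "$tmp/next" "$tmp/merged"
        header="$header FPKM$n"      # FPKM1, FPKM2, ... in argument order
        n=$((n + 1))
    done

    printf '%s\n' "$header"
    cat "$tmp/merged"
    rm -rf "$tmp"
)
```

Calling `merge_fpkm file1 file2 file3 file4 > merged.txt` would write the merged table with single spaces between columns. Because join by default keeps only keys present in both of its inputs, a gene missing from any one file is dropped from the result.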
