I have four files:
File 1:
gene_id FPKM
TRINITY_DN56960_c4_g1 36.91
TRINITY_DN42427_c0_g1 12.83
TRINITY_DN48728_c0_g1 6.22
TRINITY_DN50706_c0_g2 6604.75
TRINITY_DN51449_c1_g1 8.19
TRINITY_DN48996_c1_g1 28.21
File 2:
gene_id FPKM
TRINITY_DN48728_c0_g1 6.05
TRINITY_DN50706_c0_g2 5176.34
TRINITY_DN51449_c1_g1 7.58
TRINITY_DN48996_c1_g1 16.47
TRINITY_DN42427_c0_g1 14.03
TRINITY_DN56960_c4_g1 80.91
File 3:
gene_id FPKM
TRINITY_DN56960_c4_g1 90.91
TRINITY_DN42427_c0_g1 24.03
TRINITY_DN51449_c1_g1 6.58
TRINITY_DN48996_c1_g1 26.47
TRINITY_DN48728_c0_g1 7.05
TRINITY_DN50706_c0_g2 4176.34
File 4:
gene_id FPKM
TRINITY_DN50706_c0_g2 9176.34
TRINITY_DN56960_c4_g1 120.91
TRINITY_DN42427_c0_g1 34.03
TRINITY_DN48728_c0_g1 7.05
TRINITY_DN51449_c1_g1 7.58
TRINITY_DN48996_c1_g1 36.5
I want an output file like this:
gene_id FPKM1 FPKM2 FPKM3 FPKM4
TRINITY_DN42427_c0_g1 12.83 14.03 24.03 34.03
TRINITY_DN48728_c0_g1 6.22 6.05 7.05 7.05
TRINITY_DN48996_c1_g1 28.21 16.47 26.47 36.5
TRINITY_DN50706_c0_g2 6604.75 5176.34 4176.34 9176.34
TRINITY_DN51449_c1_g1 8.19 7.58 6.58 7.58
TRINITY_DN56960_c4_g1 36.91 80.91 90.91 120.91
How can I do this with a shell script?
Answer 1
This uses sponge from Debian's moreutils package. No bashisms.
Back up the four input files if needed (this rewrites them in place), then:
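# number the header of each file (FPKM -> FPKM1, ...), then sort the data rows, in place;
# the header is kept out of sort so the result does not depend on the locale
for f in file[1-4] ; do
    { sed -n '1s/$/'"${f#file}"'/p' "$f" ; sed 1d "$f" | sort ; } | sponge "$f"
done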
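# join sorted files pairwise, accumulating the result in 'fileN' (--header is a GNU join option);
# the globs expand to: file3 file4, then file2 fileN, then file1 fileN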
for f in 34 N2 N1 ; do join --header file[$f] | sponge fileN ; done
# reformat header, in place
sed -i '1s/d/d /' fileN
cat fileN
Output:
gene_id FPKM1 FPKM2 FPKM3 FPKM4
TRINITY_DN42427_c0_g1 12.83 14.03 24.03 34.03
TRINITY_DN48728_c0_g1 6.22 6.05 7.05 7.05
TRINITY_DN48996_c1_g1 28.21 16.47 26.47 36.5
TRINITY_DN50706_c0_g2 6604.75 5176.34 4176.34 9176.34
TRINITY_DN51449_c1_g1 8.19 7.58 6.58 7.58
TRINITY_DN56960_c4_g1 36.91 80.91 90.91 120.91
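If rewriting the input files is not acceptable, the same sort-then-join idea can be sketched without touching them. The following is only a sketch under some assumptions: merge_fpkm is a hypothetical helper name (not part of the answer above), the key column is assumed to be called gene_id as in the question, and mktemp -d is assumed to be available (it is not required by POSIX). Because join performs an inner join here, a gene missing from any one file is dropped from the result.

```shell
#!/bin/sh
# merge_fpkm: hypothetical helper, not part of the original answer.
# Usage: merge_fpkm file1 file2 ... > merged
# Prints one row per gene with the FPKM value from each file, in argument order.
# Note: POSIX sh has no 'local', so tmp, n, f and header are global variables.
merge_fpkm() {
    [ "$#" -gt 0 ] || return 1
    tmp=$(mktemp -d) || return 1      # assumption: mktemp -d exists on this system
    n=0
    for f in "$@"; do
        n=$((n + 1))
        # drop the header, sort the data rows in the C locale so sort and join agree
        sed 1d "$f" | LC_ALL=C sort > "$tmp/cur"
        if [ "$n" -eq 1 ]; then
            mv "$tmp/cur" "$tmp/acc"
            header="gene_id FPKM1"    # assumption: key column is named gene_id
        else
            # join on the first field, appending this file's FPKM column
            LC_ALL=C join "$tmp/acc" "$tmp/cur" > "$tmp/next"
            mv "$tmp/next" "$tmp/acc"
            header="$header FPKM$n"
        fi
    done
    printf '%s\n' "$header"
    cat "$tmp/acc"
    rm -rf "$tmp"
}
```

With the four files from the question, running merge_fpkm file1 file2 file3 file4 > fileN should produce the same table as above while leaving file1 to file4 unchanged.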