I am trying to run a program, iRep, which is normally invoked like this:
iRep -f Bins/10000A-01-01_bin.* -s sam/10000A-01-01.sam.sorted.sam --sort -o 10000A-01-01_iRep_output
The sam folder contains:
10000A-01-01.sam.sorted.sam
10000A-01-02.sam.sorted.sam
10000A-01-03.sam.sorted.sam
The Bins folder contains:
10000A-01-01_bin.1.fa
10000A-01-01_bin.2.fa
10000A-01-01_bin.3.fa
10000A-01-02_bin.1.fa
10000A-01-02_bin.2.fa
10000A-01-02_bin.3.fa
10000A-01-03_bin.1.fa
10000A-01-03_bin.3.fa
10000A-01-03_bin.5.fa
10000A-01-03_bin.7.fa
I would like a loop that handles every sample in one go, instead of my running a separate command for each sample by hand, e.g.
iRep -f Bins/10000A-01-01_bin.* -s sam/10000A-01-01.sam.sorted.sam --sort -o 10000A-01-01_iRep_output
iRep -f Bins/10000A-01-02_bin.* -s sam/10000A-01-02.sam.sorted.sam --sort -o 10000A-01-02_iRep_output
iRep -f Bins/10000A-01-03_bin.* -s sam/10000A-01-03.sam.sorted.sam --sort -o 10000A-01-03_iRep_output
Any idea how I can do this?
Answer 1
#!/bin/sh
# Loop over the SAM files
for sam in sam/*.sam.sorted.sam; do
    # Extract the sample name by taking the basename of the SAM file
    # and removing the known filename suffix.
    sample=$(basename "$sam" .sam.sorted.sam)
    # Call iRep (as described in the question)
    iRep -f Bins/"$sample"_bin.* -s "$sam" --sort -o "$sample"_iRep_output
done
Given the files in the question, this would end up running
iRep -f Bins/10000A-01-01_bin.1.fa Bins/10000A-01-01_bin.2.fa Bins/10000A-01-01_bin.3.fa -s sam/10000A-01-01.sam.sorted.sam --sort -o 10000A-01-01_iRep_output
iRep -f Bins/10000A-01-02_bin.1.fa Bins/10000A-01-02_bin.2.fa Bins/10000A-01-02_bin.3.fa -s sam/10000A-01-02.sam.sorted.sam --sort -o 10000A-01-02_iRep_output
iRep -f Bins/10000A-01-03_bin.1.fa Bins/10000A-01-03_bin.3.fa Bins/10000A-01-03_bin.5.fa Bins/10000A-01-03_bin.7.fa -s sam/10000A-01-03.sam.sorted.sam --sort -o 10000A-01-03_iRep_output
Answer 2
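Before running the loop for real, it may help to print the commands first. The following is a minimal dry-run sketch of the same loop, assuming the sam/ and Bins/ layout from the question. The function name print_irep_commands is made up for this example and is not part of iRep. It also skips a sample when no matching bin files exist, because an unmatched glob would otherwise be passed to iRep as a literal pattern.

```shell
#!/bin/sh
# Dry run: print the iRep command for each sample instead of executing it.
# print_irep_commands is a hypothetical helper name, not part of iRep.
print_irep_commands() {
    for sam in sam/*.sam.sorted.sam; do
        # If the glob matched nothing, the pattern stays literal; skip it.
        [ -e "$sam" ] || continue
        sample=$(basename "$sam" .sam.sorted.sam)
        # Store the matching bin files in the positional parameters.
        set -- Bins/"$sample"_bin.*
        if [ ! -e "$1" ]; then
            echo "no bins found for $sample, skipping" >&2
            continue
        fi
        echo iRep -f "$@" -s "$sam" --sort -o "${sample}_iRep_output"
    done
}

# Run from the directory that contains sam/ and Bins/.
print_irep_commands
```

Once the printed commands look right, removing the echo in front of iRep turns the dry run into the real thing.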
With GNU Parallel it would look like this:
parallel --plus iRep -f Bins/{/...}_bin.* -s {} --sort -o {/...}_iRep_output ::: sam/*.sam.sorted.sam
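With --plus, the replacement string {/...} stands for the basename of the input with three extensions removed, which is how sam/10000A-01-01.sam.sorted.sam becomes the sample name. As a rough illustration only, the same string manipulation can be sketched in plain sh. The helper name strip_three_ext is made up for this example and is not something GNU Parallel provides.

```shell
#!/bin/sh
# strip_three_ext is a hypothetical helper, used only to illustrate what
# the {/...} replacement string computes for each input path.
strip_three_ext() {
    name=${1##*/}    # drop the directory part
    name=${name%.*}  # drop the last extension
    name=${name%.*}  # drop the second extension
    name=${name%.*}  # drop the third extension
    printf '%s\n' "$name"
}

strip_three_ext sam/10000A-01-01.sam.sorted.sam
```

This prints 10000A-01-01. To check what GNU Parallel itself would run, its --dry-run option prints the generated commands without executing them.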