How to run a bash script using threads / in parallel

Answer 1

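I think a good approach to running commands in parallel might be ... GNU parallel. Define your block as a function, as glenn jackman suggested, and then run the blocks in parallel, where -j sets the maximum number of jobs running at once. The advantage over Glenn's approach is that as soon as any one block finishes, a new one is started, so you always have 10 blocks running at the same time (as long as there are IDs left). Note that you have to export the function so that parallel can see it.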

#!/bin/bash
g++ -std=c++11 shrink_files.cpp -o _shrink

block() {
  prefetch "$1"
  fastq-dump --fasta 0 "$1"
  ...
}

export -f block

parallel -j 10 block ::: id1 id2 id3 id4 ....

Replace id1, id2, ... with your IDs (SRR837459 and so on). Or, if you have the IDs in a file like this

id1
id2
id3

then use this line instead (GNU Parallel reads from standard input if no other input source is given):

parallel -j 10 block

and run the script with

bash script < idlist.txt
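
The note about exporting the function can be checked without GNU parallel at all. The sketch below assumes bash and uses a hypothetical stand-in for block() that only echoes its argument; it shows that a child bash process (which is roughly what parallel spawns for each job) cannot call the function until export -f has been run.

```shell
#!/bin/bash
# Hypothetical stand-in for the real block(); it only echoes its argument.
block() {
  echo "processing $1"
}

# Before export: a child bash process does not know the function,
# so the call fails with "command not found".
if bash -c 'block SRR837459' 2>/dev/null; then
  before="visible"
else
  before="not visible"
fi

# Export the function so child bash processes inherit it.
export -f block

# After export: the same call succeeds in the child process.
after=$(bash -c 'block SRR837459')

echo "before export: $before"
echo "after export:  $after"
```

This only works when the child shell is bash too, since exported functions are a bash feature.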

Answer 2

You can group the commands into a function and run the calls in the background.

Untested:

#!/bin/bash
g++ -std=c++11 shrink_files.cpp -o _shrink

block() {
    local id=$1
    prefetch "$id"
    fastq-dump --fasta 0 "$id"
    rm ../../sra_sequences/sra/"$id".sra
    ./_shrink "$id"
    rm "$id".fasta
    echo "$id" >> list_of_done.txt
}

# run first 10 blocks in the background
block SRR837459 &
block SRR805782 &
...
block 10th_id &

# and wait for them to complete
wait

# now start up on the next 10 ...
block 11th_id &
...

Or, more programmatically:

# all the ids to fetch
ids=( SRR837459 SRR805782 ... )

while (( ${#ids[@]} > 0 )); do
    # copy the first 10 as positional parameters
    set -- "${ids[@]:0:10}"
    # launch the blocks
    for id; do
        block "$id" &
    done
    # wait until done
    wait
    # remove the first 10 from the list
    ids=( "${ids[@]:10}" )
done
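
The loop above waits for a whole batch of 10 to finish before starting the next one. If GNU parallel is not available, a rolling pool can be sketched in plain bash with wait -n, which returns as soon as any one background job exits. This is a different technique from the batch loop, not part of the original answer, and it assumes bash 4.3 or newer. The block() body, the ids and max_jobs=3 are placeholders chosen so the sketch runs on its own.

```shell
#!/bin/bash
# Hypothetical stand-in for the real block(): sleeps briefly, then records the id.
done_file=$(mktemp)
block() {
    local id=$1
    sleep 0.1
    echo "$id" >> "$done_file"
}

ids=( id1 id2 id3 id4 id5 id6 id7 )   # placeholder ids
max_jobs=3                             # the original example uses 10
running=0

for id in "${ids[@]}"; do
    # pool is full: block until any one job finishes (needs bash >= 4.3)
    if (( running >= max_jobs )); then
        wait -n
        running=$(( running - 1 ))
    fi
    block "$id" &
    running=$(( running + 1 ))
done

# wait for the jobs that are still running
wait

finished=$(wc -l < "$done_file")
echo "finished $finished of ${#ids[@]} blocks"
```

Compared with the batch version, a slow download no longer holds up the other nine slots, which is the same advantage the GNU parallel answer describes.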
