GNU parallel runs all the commands inside my function at the same time

OK, so I have a bash function that I apply to several folders:

function task(){
do_thing1
do_thing2
do_thing3
...
}

I want to run the function in parallel. So far I have been using a small forking trick:

N=4 #core number
for temp_subj in ${raw_dir}/MRST*
do
  ((i=i%N)); ((i++==0)) && wait
  task "$temp_subj" &
done

And it works well. But I decided to use something "cleaner" and switch to GNU parallel:

ls -d ${raw_dir}/MRST* | parallel task {}

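The problem is that it runs everything in parallel, including the do_thing commands inside my task function. It inevitably crashes, because those have to be executed serially. I have tried modifying the call to parallel in several ways, but nothing seems to work. Any ideas?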

Answer 1

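I think your problem lies in the do_thingX commands themselves. In the following test, the steps inside task run one after another: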

do_thing() { echo Doing "$@"; sleep 1; echo Did "$@"; }
export -f do_thing
do_thing1() { do_thing 1 "$@"; }
do_thing2() { do_thing 2 "$@"; }
do_thing3() { do_thing 3 "$@"; }
# Yes you can name functions ... - it is a bit unconventional, but it works
...() { do_thing ... "$@"; }
export -f do_thing1
export -f do_thing2
export -f do_thing3
export -f ...

function task(){
  do_thing1
  do_thing2
  do_thing3
  ...
}
export -f task
# This should take 4 seconds for a single input
# -d lists the matching folders themselves rather than their contents
ls -d "${raw_dir}"/MRST* | time parallel task {}
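
To see the ordering directly rather than infer it from the timing, a minimal sketch along the following lines may help. It assumes bash and GNU Parallel; the names step, run_demo, dirA and dirB are hypothetical stand-ins, not part of the original setup. Each step logs when it starts and ends, so if the steps of one task overlapped, a "start" line would appear before the previous step's "end" line.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real do_thingX commands: logs start and end.
step() { echo "start $1 $2"; sleep 0.2; echo "end $1 $2"; }

# The steps run one after another inside a single task invocation.
task() {
  step 1 "$1"
  step 2 "$1"
  step 3 "$1"
}

# Functions must be exported so the shells started by parallel can see them.
export -f step task

run_demo() {
  # -j 4 runs at most four tasks at once; -k prints output in input order.
  # dirA and dirB are placeholder inputs, not real folders.
  printf '%s\n' dirA dirB | parallel -j 4 -k task {}
}

# Only run the demo if the parallel on PATH identifies itself as GNU Parallel.
if parallel --version 2>/dev/null | grep -q '^GNU parallel'; then
  run_demo
else
  echo "GNU Parallel not found; skipping demo" >&2
fi
```

GNU Parallel groups the output of each job by default, so the lines for one input stay together. If they come out as start/end pairs in step order, the steps inside task are running serially and only the tasks themselves run in parallel.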

Or the parallel you are using is not GNU Parallel. Check that it is GNU Parallel:

$ parallel --version
GNU parallel 20201122
Copyright (C) 2007-2020 Ole Tange, http://ole.tange.dk and Free Software
Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
GNU parallel comes with no warranty.

Web site: https://www.gnu.org/software/parallel

When using programs that use GNU Parallel to process data for publication
please cite as described in 'parallel --citation'.
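
The same check can be scripted. This is only a sketch, assuming bash; is_gnu_parallel_output is a hypothetical helper name. It relies on the version text starting with "GNU parallel", as in the output above. Other programs installed under the name parallel (for example the one shipped with moreutils) will not print that line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given text starts with "GNU parallel ".
is_gnu_parallel_output() {
  case "$1" in
    "GNU parallel "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Capture the version text; an absent or non-GNU parallel yields other text
# or nothing at all.
version_text=$(parallel --version 2>/dev/null || true)

if is_gnu_parallel_output "$version_text"; then
  # Print only the first line of the version text.
  echo "GNU Parallel detected: ${version_text%%$'\n'*}"
else
  echo "The parallel on PATH does not look like GNU Parallel" >&2
fi
```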
