Variable values not recognized after using GNU Parallel?

I have the shell script below, in which I am trying to copy five files in parallel. I run the script on machineA, and it tries to copy the files from machineB and machineC.

If a file is not present on machineB, then it should definitely be present on machineC.

I am using GNU Parallel here to download the five files in parallel.

#!/bin/bash

readonly PRIMARY=/tech01/primary
readonly FILERS_LOCATION=(machineB machineC)
readonly MEMORY_MAPPED_LOCATION=/techbat/data/be_t1_snapshot
PRIMARY_PARTITION=(550 274 2 546 278 6 558 282 10 554 286 14) # this will have more file numbers

dir1=/techbat/data/be_t1_snapshot/20140501

find "$PRIMARY" -mindepth 1 -delete

do_copy() {
  el=$1
  scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@${FILERS_LOCATION[0]}:$dir1/t1_weekly_1680_"$el"_200003_5.data $PRIMARY/. || scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@${FILERS_LOCATION[1]}:$dir1/t1_weekly_1680_"$el"_200003_5.data $PRIMARY/.
}
export -f do_copy
parallel -j 5 do_copy ::: "${PRIMARY_PARTITION[@]}"

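Problem statement:

The problem with the script above is that it does not recognize ${FILERS_LOCATION[0]}, ${FILERS_LOCATION[1]}, $dir1, and $PRIMARY inside the do_copy function, and I don't know why.

If I try to print them inside the do_copy function like this, nothing is printed: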

  echo ${FILERS_LOCATION[0]}    
  echo ${FILERS_LOCATION[1]}

But if I print the same values above the do_copy function, it works fine.

Am I missing something here?

Update:

Below is the code I am using now:

#!/bin/bash

export PRIMARY=/tech01/primary
export FILERS_LOCATION=(machineB machineC)
export MEMORY_MAPPED_LOCATION=/techbat/data/be_t1_snapshot
PRIMARY_PARTITION=(0 548 272 4 544 276 8 556 280)

export dir1=/techbat/data/be_t1_snapshot/20140501

find "$PRIMARY" -mindepth 1 -delete

do_copy() {
  el=$1
  scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@${FILERS_LOCATION[0]}:$dir1/t1_weekly_1680_"$el"_200003_5.data $PRIMARY/. || scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@${FILERS_LOCATION[1]}:$dir1/t1_weekly_1680_"$el"_200003_5.data $PRIMARY/.
}
export -f do_copy
parallel -j 8 do_copy ::: "${PRIMARY_PARTITION[@]}"

Another update:

This is what I get after running the following script:

#!/bin/bash

export PRIMARY=/tech01/primary
export FILERS_LOCATION=(slc4b03c-407d.stratus.slc.ebay.com chd1b02c-0db8.stratus.phx.ebay.com)
export MEMORY_MAPPED_LOCATION=/techbat/data/be_t1_snapshot
PRIMARY_PARTITION=(0 548 272 4 544)

export dir1=/techbat/data/be_t1_snapshot/20140501

find "$PRIMARY" -mindepth 1 -delete

 echo ${FILERS_LOCATION[0]}    
 echo ${FILERS_LOCATION[1]}

do_copy() {
  el=$1
  echo "scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 bullseye@${FILERS_LOCATION[0]}:$dir1/t1_weekly_1680_"$el"_200003_5.data $PRIMARY/. || scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 bullseye@${FILERS_LOCATION[1]}:$dir1/t1_weekly_1680_"$el"_200003_5.data $PRIMARY/."
}
export -f do_copy
parallel -j 3 do_copy ::: "${PRIMARY_PARTITION[@]}"

The output I get:

david@tvxdbx1143:/home/david$ ./scp_files5.sh
machineB
machineC
When using programs that use GNU Parallel to process data for publication please cite:

  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.

This helps funding further development; and it won't cost you a cent.

To silence this citation notice run 'parallel --bibtex' once or use '--no-notice'.

scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_0_200003_5.data /tech01/primary/. || scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_0_200003_5.data /tech01/primary/.
scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_548_200003_5.data /tech01/primary/. || scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_548_200003_5.data /tech01/primary/.
scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_272_200003_5.data /tech01/primary/. || scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_272_200003_5.data /tech01/primary/.
scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_4_200003_5.data /tech01/primary/. || scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_4_200003_5.data /tech01/primary/.
scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_544_200003_5.data /tech01/primary/. || scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@:/techbat/data/be_t1_snapshot/20140501/t1_weekly_1680_544_200003_5.data /tech01/primary/.

Answer 1

Try exporting the variables and dropping the array, because bash cannot export arrays:

export PRIMARY=/data01/primary
export FILERS_LOCATION_1=machineB
export FILERS_LOCATION_2=machineC
export MEMORY_MAPPED_LOCATION=/bexbat/data/be_t1_snapshot

export dir1=/bexbat/data/be_t1_snapshot/20140501
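
The difference between exported scalars and exported arrays can be checked without GNU Parallel. The following is a minimal sketch, assuming bash is the shell in use; the variable names SCALAR_HOST and ARRAY_HOSTS are made up for illustration and do not appear in the original script. A plain `bash -c` child stands in for the shell that parallel would start.

```shell
#!/bin/bash
# Hypothetical names, for illustration only.
export SCALAR_HOST=machineB
# bash accepts this line, but array values are not placed in the environment.
export ARRAY_HOSTS=(machineB machineC)

# Ask a child bash process what it can see.
scalar_seen=$(bash -c 'echo "${SCALAR_HOST:-<unset>}"')
array_seen=$(bash -c 'echo "${ARRAY_HOSTS[0]:-<unset>}"')

echo "scalar in child: $scalar_seen"
echo "array in child:  $array_seen"
```

The scalar reaches the child while the array element does not, which matches the empty host in the `david@:` lines of the output shown in the question.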

Alternatively, simply put all the constant variables inside the function:

#!/bin/bash

PRIMARY_PARTITION=(0 548 272 4 544 276 8 556 280)

PRIMARY=/data01/primary
find "$PRIMARY" -mindepth 1 -delete

do_copy() {
  el=$1

  PRIMARY=/data01/primary
  FILERS_LOCATION=(machineB machineC)
  MEMORY_MAPPED_LOCATION=/bexbat/data/be_t1_snapshot

  dir1=/bexbat/data/be_t1_snapshot/20140501

  scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@${FILERS_LOCATION[0]}:$dir1/t1_weekly_1680_"$el"_200003_5.data $PRIMARY/. || scp -o ControlMaster=auto -o 'ControlPath=~/.ssh/control-%r@%h:%p' -o ControlPersist=900 david@${FILERS_LOCATION[1]}:$dir1/t1_weekly_1680_"$el"_200003_5.data $PRIMARY/.
}
export -f do_copy
parallel -j 8 do_copy ::: "${PRIMARY_PARTITION[@]}"

Depending on the type of files you are copying, you should look at rsync -z instead of scp. Also consider running parallel --bibtex once (as parallel suggests).

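Answer 2

You have exported the function, but not the variables that you try to use directly inside it.

parallel starts a new shell for each run of do_copy, and the variables are expanded in that shell, where they do not exist.

If you use the -S SERVER option, the --env VAR option copies VAR from your initial shell to the remote environment where parallel runs the command: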
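
The effect described above can be reproduced with a plain child shell. This is a minimal sketch, assuming bash; show_target and TARGET_DIR are hypothetical names chosen for the illustration, and `bash -c` stands in for the shell that parallel starts.

```shell
#!/bin/bash
# Hypothetical function and variable, for illustration only.
show_target() { echo "target=${TARGET_DIR:-<unset>}"; }
export -f show_target          # the function is visible to child bash shells

TARGET_DIR=/tech01/primary     # set, but not exported
before=$(bash -c 'show_target')

export TARGET_DIR              # now the variable is in the environment too
after=$(bash -c 'show_target')

echo "before export: $before"
echo "after export:  $after"
```

The exported function runs in both cases, but it only sees the variable once the variable itself has been exported.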

parallel -j 5 -S localhost --env do_copy --env PRIMARY --env FILERS_LOCATION do_copy ::: "${PRIMARY_PARTITION[@]}"

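You might be able to get away with the localhost hack above, since I can't see a simple way to implement your multi-server logic with parallel's -S option (unless you are guaranteed that one server won't have the file?).

A better approach is to export the variables as Ole suggested, or to pass all the needed values to the do_copy function as arguments.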
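
Passing the values as arguments could look like the sketch below. It is not the original poster's final script: the argument order and the DRY_RUN switch are assumptions made for illustration, and the ControlMaster/ControlPath/ControlPersist options from the question are left out to keep the example short. With DRY_RUN=1 the function prints the scp command instead of running it.

```shell
#!/bin/bash
# Sketch: do_copy receives everything it needs as positional arguments,
# so it does not depend on variables from the calling shell.
# Assumed order: element, local directory, remote directory, first host, second host.
do_copy() {
  local el=$1 primary=$2 dir=$3 host1=$4 host2=$5
  local file="t1_weekly_1680_${el}_200003_5.data"
  # DRY_RUN is a hypothetical switch: when set to 1, prefix the command with echo.
  local run=()
  [ "${DRY_RUN:-0}" = 1 ] && run=(echo)
  # Try the first host, and fall back to the second if the copy fails.
  "${run[@]}" scp "david@${host1}:${dir}/${file}" "${primary}/." ||
    "${run[@]}" scp "david@${host2}:${dir}/${file}" "${primary}/."
}
export -f do_copy

# Possible invocation (not executed here; assumes the values contain no spaces):
# parallel -j 5 do_copy {} "$PRIMARY" "$dir1" "${FILERS_LOCATION[@]}" ::: "${PRIMARY_PARTITION[@]}"
```

Because the calling shell expands the array and the scalars before parallel builds each command line, nothing has to be exported except the function itself.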
