Bash: passing input

Answer 1

Assuming your file contains one filename per line, you can do this ugly thing:

tool="cgatools join --beta --match <specification> --overlap <overlap_spec> --select <output_fields> --always-dump --output-mode compact --input"

{
    read -r filename
    cmd="cat \"$filename\""
    while read -r filename; do
        cmd+=" | $tool \"$filename\""
    done
} < file_of_filenames

cmd+=" > output_file"

echo "$cmd"
eval "$cmd"

The documentation says that if only one input file is given, the other one is read from stdin, and if the --output option is not given, stdout is used.
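
Before handing the string to eval, it can help to look at what the loop actually builds. The following is only a dry-run sketch: the `build_pipeline` function, the `TOOL --input` stand-in, and the names a.tsv, b.tsv and c.tsv are made up for illustration and are not part of cgatools.

```bash
#!/usr/bin/env bash
# Dry run: build the pipeline string from a list of filenames and return it
# instead of eval'ing it. "TOOL --input" is a stand-in for the real
# cgatools command line; the .tsv names below are hypothetical.
build_pipeline() {
    local tool="$1" list="$2" filename cmd
    {
        # the first file only feeds the pipeline
        read -r filename
        cmd="cat \"$filename\""
        # every further file adds one join stage that reads stdin
        while read -r filename; do
            cmd="$cmd | $tool \"$filename\""
        done
    } < "$list"
    printf '%s\n' "$cmd"
}

list=$(mktemp)
printf '%s\n' a.tsv b.tsv c.tsv > "$list"
pipeline=$(build_pipeline "TOOL --input" "$list")
rm -f "$list"
echo "$pipeline"
```

With the three names above this prints `cat "a.tsv" | TOOL --input "b.tsv" | TOOL --input "c.tsv"`, i.e. one join stage per file after the first, each reading the previous result from stdin.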

Untested, but this may also work (bash):

# declare the cgatools command with options
# stored in a shell array.
cga_join=( 
    cgatools join --beta 
                  --match "specification"
                  --overlap "overlap_spec" 
                  --select "output_fields"
                  --always-dump 
                  --output-mode compact 
)

# the entry point to the join process
# shift the first argument off the list of arguments, and
# pipe its contents into the recursive call
call_join() {
    local first=$1
    shift
    cat "$first" | call_join_recursively "$@"
}

# recursively call "cgatools join"
# input will be read from stdin; output goes to stdout
# if this is the last filename to join, pipe the output through "cat"
# otherwise pipe it into another call to this function, passing the 
# remaining filenames to join.
call_join_recursively() {
    local file=$1
    shift
    local next_command=(cat)
    if [[ $# -gt 0 ]]; then
        next_command=( "$FUNCNAME" "$@" )
    fi
    "${cga_join[@]}" --input "$file" | "${next_command[@]}"
}

# read the list of filenames to join.
# stored in the "filenames" array 
mapfile -t filenames < file_of_filenames

# launch the joining, passing the filenames as individual arguments.
# store the output into a file.
call_join "${filenames[@]}" > output_file
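
Since the above is untested, here is a small sketch of the same recursion pattern that can be run without cgatools. `fake_join` is a hypothetical stand-in that copies stdin to stdout and appends a marker line for its `--input` argument; `join_cmd` and `chain` are illustrative names playing the roles of `cga_join` and `call_join_recursively`.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for "cgatools join": pass stdin through, then
# append a line naming the file given after --input ($1 is "--input").
fake_join() {
    cat
    printf '%s\n' "joined:$2"
}

join_cmd=( fake_join )   # same role as the cga_join array

# Join stdin with the first filename, then either finish with "cat"
# or pipe into another call of this function for the remaining names.
chain() {
    local file=$1
    shift
    local next_command=(cat)
    if [[ $# -gt 0 ]]; then
        next_command=( "${FUNCNAME[0]}" "$@" )
    fi
    "${join_cmd[@]}" --input "$file" | "${next_command[@]}"
}

result=$(printf 'first\n' | chain b c d)
printf '%s\n' "$result"
```

The output is the line `first` followed by `joined:b`, `joined:c` and `joined:d`, which shows that each stage receives the previous stage's output on stdin and that the stages run in the order the filenames were given.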

Answer 2

I think you are looking for a simple iterative solution like this:

#!/bin/sh
# tmpfile is set outside the subshell so that the final echo can still see it
tmpfile=/tmp/result
( read -r firstfilename
  cat "$firstfilename" >"$tmpfile.in"
  while read -r filename
  do cgatools join \
          --beta \
          --input "$tmpfile.in" "$filename" \
          --match <specification> \
          --overlap <overlap_spec> \
          --select <output_fields> \
          --always-dump \
          --output-mode compact  >"$tmpfile.out"
     mv "$tmpfile.out" "$tmpfile.in"
  done
) < file_of_filenames
echo "result is in $tmpfile.in"

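This reads your file file_of_filenames one line (i.e. one filename) at a time and runs cgatools with that filename and the previous output, producing a new output file $tmpfile.out. That output file is then renamed to the input file $tmpfile.in and the loop continues.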

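To handle the startup, the first filename line is read separately (into the variable firstfilename), and that file is copied to the input file so that we have 2 files to join. Because all the commands are inside "()" and share the same redirected stdin, the read inside the while loop continues where the first read left off.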
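
The point about the shared stdin can be checked in isolation. This is a minimal sketch with a made-up three-line list (one, two, three) in a temporary file; nothing here is specific to cgatools.

```sh
#!/bin/sh
# Hypothetical input: a temporary list with three made-up lines.
list=$(mktemp)
printf '%s\n' one two three > "$list"

seen=$(
    (
        # the first read consumes "one"
        read -r first
        printf 'first=%s\n' "$first"
        # the loop resumes at "two" because it shares the same stdin
        while read -r name; do
            printf 'next=%s\n' "$name"
        done
    ) < "$list"
)
rm -f "$list"
printf '%s\n' "$seen"
```

This prints `first=one`, `next=two` and `next=three` on separate lines. If the loop had its own `< "$list"` redirection instead, it would start again from the first line.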
