I have n files with one word per line:
file1  file2  file3  ...
1_a    2_a    3_a
1_b    2_b    3_b
1_c           3_c
I want to write a bash script that takes all of these files and generates every possible combination of n words (one from each file).
For my example, I want this result:
1_a 2_a 3_a
1_a 2_a 3_b
1_a 2_a 3_c
1_a 2_b 3_a
1_a 2_b 3_b
1_a 2_b 3_c
1_b 2_a 3_a
1_b 2_a 3_b
1_b 2_a 3_c
1_b 2_b 3_a
1_b 2_b 3_b
1_b 2_b 3_c
1_c 2_a 3_a
1_c 2_a 3_b
1_c 2_a 3_c
1_c 2_b 3_a
1_c 2_b 3_b
1_c 2_b 3_c
I tried to do this with paste and awk but failed. How can I do it?
Answer 1
You can use a recursive function that calls itself as long as there are files left to process:
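#!/bin/bash
# Print every combination of one line from each file given as an argument.
process () {
    local prefix=$1
    local file=$2
    local line
    shift 2
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r line ; do
        if (($#)) ; then  # There are still unprocessed files.
            # Add the separator only after a non-empty prefix,
            # so the output has no leading space.
            process "${prefix:+$prefix }$line" "$@"
        else              # Reading the last file.
            printf '%s\n' "${prefix:+$prefix }$line"
        fi
    done < "$file"
}
process '' "$@"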
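The recursive function reopens and rereads each later file once for every prefix built so far. If that matters, a non-recursive variant is one option. What follows is only a sketch and not part of the original answer: combine_iter is a made-up name, and it assumes bash 4 or later (for mapfile) and inputs small enough that every partial combination fits in memory.

```shell
#!/usr/bin/env bash
# combine_iter (hypothetical name): print every combination of one line
# from each file argument, reading each file exactly once.
# Assumes bash >= 4 for mapfile.
combine_iter() {
    (($#)) || return 0              # no files, nothing to print
    local -a combos=("")            # start with a single empty prefix
    local -a next lines
    local file prefix line
    for file in "$@"; do
        mapfile -t lines < "$file"  # load the whole file into an array
        next=()
        for prefix in "${combos[@]}"; do
            for line in "${lines[@]}"; do
                # The separator is added only after a non-empty prefix.
                next+=("${prefix:+$prefix }$line")
            done
        done
        combos=("${next[@]}")       # an empty file leaves no combinations
    done
    if ((${#combos[@]})); then
        printf '%s\n' "${combos[@]}"
    fi
}

combine_iter "$@"
```

Saved under a name such as combo_iter.sh (also hypothetical) and run as bash combo_iter.sh file1 file2 file3, this should print the combinations in the same order as the recursive version. The trade-off is memory: the array holds every partial combination at once, whereas the recursion holds only one prefix per level.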
Answer 2
parallel --line-buffer --keep-order echo :::: file1 :::: file2 :::: file3
https://www.gnu.org/software/parallel/parallel_tutorial.html#multiple-input-sources
Answer 3
I know you said bash, but this is a great fit for something like Python 3.3+:
import sys
from contextlib import ExitStack
from itertools import product

with ExitStack() as stack:
    files = [stack.enter_context(open(f)) for f in sys.argv[1:]]
    for x in product(*files):
        x = [y.rstrip('\n') for y in x]
        print(*x)
Put the code above in a file named combo.py and run it as python combo.py file_1 file_2 file_3, which produces:
1_a 2_a 3_a
1_a 2_a 3_b
1_a 2_a 3_c
1_a 2_b 3_a
1_a 2_b 3_b
1_a 2_b 3_c
1_b 2_a 3_a
1_b 2_a 3_b
1_b 2_a 3_c
1_b 2_b 3_a
1_b 2_b 3_b
1_b 2_b 3_c
1_c 2_a 3_a
1_c 2_a 3_b
1_c 2_a 3_c
1_c 2_b 3_a
1_c 2_b 3_b
1_c 2_b 3_c
Answer 4
Brace expansion in bash is the right tool for this job. Consider a simple case such as:
$ echo {1..3}{a..c}
1a 1b 1c 2a 2b 2c 3a 3b 3c
For your example you would have something like this:
$ echo {1_a,1_b,1_c}{2_a,2_b}{3_a,3_b,3_c}
1_a2_a3_a 1_a2_a3_b 1_a2_a3_c 1_a2_b3_a 1_a2_b3_b 1_a2_b3_c 1_b2_a3_a 1_b2_a3_b 1_b2_a3_c 1_b2_b3_a 1_b2_b3_b 1_b2_b3_c 1_c2_a3_a 1_c2_a3_b 1_c2_a3_c 1_c2_b3_a 1_c2_b3_b 1_c2_b3_c
This is correct but hard to read. For a clearer presentation, you can store the generated output in an array and then print the array:
$ combos=({1_a,1_b,1_c}{2_a,2_b}{3_a,3_b,3_c})
$ for i in "${combos[@]}"; do echo "$i"; done
1_a2_a3_a
1_a2_a3_b
1_a2_a3_c
1_a2_b3_a
1_a2_b3_b
1_a2_b3_c
1_b2_a3_a
1_b2_a3_b
1_b2_a3_c
1_b2_b3_a
1_b2_b3_b
1_b2_b3_c
1_c2_a3_a
1_c2_a3_b
1_c2_a3_c
1_c2_b3_a
1_c2_b3_b
1_c2_b3_c
There are many ways to add a gap between the elements of each combination so that they look like:
1_a 2_a 3_a
..
..
But that is a separate question, which you can ask on its own.
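For what it's worth, one of those ways stays within brace expansion itself: put a quoted space between the brace groups. This is a minimal sketch, not the answerer's own solution, and it hard-codes the sample words from the question instead of reading them from the files.

```shell
#!/usr/bin/env bash
# Each quoted space is literal text between two brace groups, so every
# generated word already contains its separators.
combos=({1_a,1_b,1_c}' '{2_a,2_b}' '{3_a,3_b,3_c})

# Print one combination per line.
printf '%s\n' "${combos[@]}"
```

Note that bash performs brace expansion before variable expansion, so the word lists have to appear literally in the command. Words taken from files cannot be dropped in through a variable without something like eval, which is risky with untrusted input.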