I have several files, say file1, file2, and so on. Each file has one word per line, for example:
file1   file2   file3
one     four    six
two     five
three
What I want is to combine them in pairs, in every possible permutation (without repetition), into a new file, file4. Like:
onetwo
onethree
onefour
onefive
...
twothree
...
onefour
...
fourone
...
How can I do this with Linux commands?
Answer 1
ruby is a nice, concise language for this kind of thing:
ruby -e '
words = ARGV.collect {|fname| File.readlines(fname)}.flatten.map(&:chomp)
words.combination(2).each {|pair| puts pair.join("")}
' file[123] > file4
onetwo
onethree
onefour
onefive
onesix
twothree
twofour
twofive
twosix
threefour
threefive
threesix
fourfive
foursix
fivesix
You're quite right, combination gives "onetwo" but misses "twoone". Good thing there's permutation:
ruby -e '
words = ARGV.collect {|fname| File.readlines(fname)}.flatten.map(&:chomp)
words.permutation(2).each {|pair| puts pair.join("")}
' file{1,2,3}
onetwo
onethree
onefour
onefive
onesix
twoone
twothree
twofour
twofive
twosix
threeone
threetwo
threefour
threefive
threesix
fourone
fourtwo
fourthree
fourfive
foursix
fiveone
fivetwo
fivethree
fivefour
fivesix
sixone
sixtwo
sixthree
sixfour
sixfive
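For anyone without ruby at hand, the difference between combination and permutation shown above can be sketched with Python's itertools. This is only an illustration: the word list is typed in by hand from the question's sample files rather than read from disk.

```python
from itertools import combinations, permutations

# Sample words from the question (file1, file2, file3 concatenated by hand).
words = ["one", "two", "three", "four", "five", "six"]

# combinations: each unordered pair appears once, in input order.
combos = ["".join(pair) for pair in combinations(words, 2)]

# permutations: every ordered pair taken from two distinct positions.
perms = ["".join(pair) for pair in permutations(words, 2)]

print(len(combos), len(perms))                # 15 30
print("twoone" in combos, "twoone" in perms)  # False True
```

With n words there are n*(n-1)/2 combinations but n*(n-1) permutations, which is why the second listing is twice as long as the first.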
Answer 2
Assuming the total size of the input files is less than getconf ARG_MAX (i.e., the maximum command-line length), this should work:
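set -- $( cat file[123] )   # deliberately unquoted: split the words into positional parameters
for f in "$@" ; do
    for g in "$@" ; do
        # skip pairing a word with itself
        [ "$f" != "$g" ] && echo "$f$g"
    done
done > file4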
cat file4
Output:
onetwo
onethree
onefour
onefive
onesix
twoone
twothree
twofour
twofive
twosix
threeone
threetwo
threefour
threefive
threesix
fourone
fourtwo
fourthree
fourfive
foursix
fiveone
fivetwo
fivethree
fivefour
fivesix
sixone
sixtwo
sixthree
sixfour
sixfive
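(Per the OP's clarification, the above gives permutations without repetition. See the earlier draft of this answer for combinations without repetition.)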
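One thing worth knowing about the nested loops above: they compare words by value, while a permutation function pairs positions. With the sample data every word is unique, so both give the same list. The sketch below, in Python, shows where they part ways. The function names and the `repeated` input are made up for illustration and are not from the question.

```python
from itertools import permutations

def pairs_by_value(words):
    """Mimic the nested shell loops: skip a pair when the two words are equal."""
    return [f + g for f in words for g in words if f != g]

def pairs_by_position(words):
    """Pair every two distinct positions, even if the words there are equal."""
    return ["".join(pair) for pair in permutations(words, 2)]

unique = ["one", "two", "three", "four", "five", "six"]
# Hypothetical input in which "one" shows up twice (say, in two different files).
repeated = ["one", "two", "one"]

print(pairs_by_value(unique) == pairs_by_position(unique))  # True
print(pairs_by_value(repeated))
# ['onetwo', 'twoone', 'twoone', 'onetwo']
print(pairs_by_position(repeated))
# ['onetwo', 'oneone', 'twoone', 'twoone', 'oneone', 'onetwo']
```

So if a word can occur in more than one file, the loop version never emits "oneone", and both versions can emit the same pair more than once. Which behaviour is wanted depends on what "without repetition" should mean for the data.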
Answer 3
A python approach:
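import fileinput
from itertools import permutations
from contextlib import closing

# Read all three files as one stream of lines, then pair them up.
with closing(fileinput.input(['file1', 'file2', 'file3'])) as f:
    for x, y in permutations(f, 2):
        # Strip the trailing newline from each word before joining.
        print('{}{}'.format(x.rstrip('\n'), y.rstrip('\n')))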
onetwo
onethree
onefour
onefive
onesix
twoone
twothree
twofour
twofive
twosix
threeone
threetwo
threefour
threefive
threesix
fourone
fourtwo
fourthree
fourfive
foursix
fiveone
fivetwo
fivethree
fivefour
fivesix
sixone
sixtwo
sixthree
sixfour
sixfive
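The question asks for the result in file4 rather than on the terminal. Redirecting the script's output would do, but here is a minimal sketch that writes the file itself and takes the file names as arguments. The function name `write_pairs` and the script name in the usage comment are assumptions for illustration only.

```python
import sys
from itertools import permutations

def write_pairs(in_paths, out_path):
    """Write every ordered pair of lines taken from two distinct positions
    across in_paths to out_path, one pair per line. Returns the pair count."""
    words = []
    for path in in_paths:
        with open(path) as handle:
            # Drop only the trailing newline of each line.
            words.extend(line.rstrip("\n") for line in handle)
    count = 0
    with open(out_path, "w") as out:
        for first, second in permutations(words, 2):
            out.write(first + second + "\n")
            count += 1
    return count

if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python pairs.py file1 file2 file3 file4
    # The last argument is the output file, the rest are inputs.
    *inputs, output = sys.argv[1:]
    write_pairs(inputs, output)
```

Like the shell answer, this holds all the words in memory, but nothing passes through the command line, so ARG_MAX is not a concern.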
Answer 4
TXR Lisp:
Warm-up: first get the data structure:
$ txr -p '(comb (get-lines (open-files *args*)) 2)' file1 file2 file3
(("one" "two") ("one" "three") ("one" "four") ("one" "five") ("one" "six")
("two" "three") ("two" "four") ("two" "five") ("two" "six") ("three" "four")
("three" "five") ("three" "six") ("four" "five") ("four" "six")
("five" "six"))
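Now it's just a matter of getting the right output format. If we catenate the pairs together and then output them with tprint (implicitly, via the -t option), we're there.
First, the catenation, by mapping through cat-str: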
$ txr -p '[mapcar cat-str (comb (get-lines (open-files *args*)) 2)]' file1 file2 file3
("onetwo" "onethree" "onefour" "onefive" "onesix" "twothree" "twofour"
"twofive" "twosix" "threefour" "threefive" "threesix" "fourfive"
"foursix" "fivesix")
OK, we have the right data. Now just use the tprint function (-t) instead of prinl (-p):
$ txr -t '[mapcar cat-str (comb (get-lines (open-files *args*)) 2)]' file1 file2 file3
onetwo
onethree
onefour
onefive
onesix
twothree
twofour
twofive
twosix
threefour
threefive
threesix
fourfive
foursix
fivesix
Finally, we read the question again and, as required, do permutations rather than combinations, with perm instead of comb:
$ txr -t '[mapcar cat-str (perm (get-lines (open-files *args*)) 2)]' file1 file2 file3
onetwo
onethree
onefour
onefive
onesix
twoone
twothree
twofour
twofive
twosix
threeone
threetwo
threefour
threefive
threesix
fourone
fourtwo
fourthree
fourfive
foursix
fiveone
fivetwo
fivethree
fivefour
fivesix
sixone
sixtwo
sixthree
sixfour
sixfive