Can't understand the JOIN command

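I have two files that I want to merge on a common field: the first column of the first file should be matched against the fourth column of the second file. This is driving me crazy!

This is what I am trying: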

join -j4 <(sort -k1 FirstFile.txt) <(sort -k4 SecondFile.txt)

FirstFile.txt:

24.136.152.171 US
24.136.152.171 US
24.136.152.171 US 

SecondFile.txt:

2014-08-03 00:00:00 User 24.136.152.171
2014-08-03 00:00:00 User 24.136.152.171
2014-08-03 00:00:00 User 24.136.152.171

Desired output:

2014-08-03 00:00:00 User 24.136.152.171 US
2014-08-03 00:00:00 User 24.136.152.171 US
2014-08-03 00:00:00 User 24.136.152.171 US

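Answer 1

By default, join prints the join field first, then the remaining fields from FILE1, then the remaining fields from FILE2, unless you specify a format with -o. Also, the option -j4 means that the join field is the 4th field in both FILE1 and FILE2. So you need to split -j4 into -1 1 -2 4 (join on field 1 of FILE1 and field 4 of FILE2).

Try this: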

join -o '2.1 2.2 2.3 2.4 1.2' -2 4 -1 1 <(sort -k1 FirstFile.txt) <(sort -k4 SecondFile.txt)
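
To make the pairing rule concrete, here is a minimal Python sketch of what the command above does, under the assumption that fields are whitespace-separated. The function name emulate_join is hypothetical and is not part of join itself. It pairs every FILE1 line with every FILE2 line that has the same key, as join does, so the sample data (three identical lines in each file) yields nine lines rather than three. If FirstFile.txt has one line per IP, you get one output line per SecondFile.txt line.

```python
def emulate_join(first_lines, second_lines):
    """Rough emulation of: join -1 1 -2 4 -o '2.1 2.2 2.3 2.4 1.2'.

    Illustrative only: the output order may differ from join's,
    and the inputs do not need to be sorted here.
    """
    first = [line.split() for line in first_lines if line.strip()]
    second = [line.split() for line in second_lines if line.strip()]
    out = []
    for s in second:
        for f in first:
            # Key: field 1 of FILE1 against field 4 of FILE2
            if len(f) >= 2 and len(s) >= 4 and f[0] == s[3]:
                # Output format: fields 2.1 2.2 2.3 2.4, then 1.2
                out.append(' '.join(s[:4] + [f[1]]))
    return out


if __name__ == '__main__':
    first = ['24.136.152.171 US'] * 3
    second = ['2014-08-03 00:00:00 User 24.136.152.171'] * 3
    # Every matching pair is emitted: 3 x 3 lines
    print(len(emulate_join(first, second)))
    # With a single FirstFile line per IP: one line per SecondFile line
    print(len(emulate_join(first[:1], second)))
```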

Answer 2

You can use Python.

Save the following in a file called join.py in your home directory:

# Pair the two files line by line (assumes line N of each file refers to the same IP).
with open('FirstFile.txt') as f:
    ffile = f.read().splitlines()            # Lines of the first file: "IP country"
with open('SecondFile.txt') as f:
    sfile = f.read().splitlines()            # Lines of the second file: "date time user IP"

ofile = []                                   # Collected output lines
for first, second in zip(ffile, sfile):      # zip stops at the shorter file, so unequal lengths don't break it
    fields = first.split()
    if len(fields) < 2:                      # Skip blank or malformed lines
        continue
    ofile.append(second.rstrip() + ' ' + fields[1])  # Second-file line followed by the country code

with open('outputfile', 'w') as outfile:     # Create "outputfile" in the current directory
    outfile.write('\n'.join(ofile) + '\n')   # Write the merged lines

Then run

python join.py
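
The script above pairs lines by position, so it only works if both files list the same IPs in the same order. As an alternative sketch (not the answer author's method), the files can be matched on the IP itself by building a dictionary from the first file. The function name lookup_join is hypothetical, and the code assumes whitespace-separated fields with the IP in column 1 of the first file and column 4 of the second.

```python
def lookup_join(first_lines, second_lines):
    """Append the country from the first file to each matching second-file line."""
    country_by_ip = {}
    for line in first_lines:
        fields = line.split()
        if len(fields) >= 2:
            # Duplicate IPs simply overwrite each other with the same value
            country_by_ip[fields[0]] = fields[1]

    out = []
    for line in second_lines:
        fields = line.split()
        # Keep only lines whose 4th field (the IP) is known
        if len(fields) >= 4 and fields[3] in country_by_ip:
            out.append(' '.join(fields[:4] + [country_by_ip[fields[3]]]))
    return out


if __name__ == '__main__':
    with open('FirstFile.txt') as f:
        first = f.read().splitlines()
    with open('SecondFile.txt') as f:
        second = f.read().splitlines()
    print('\n'.join(lookup_join(first, second)))
```

Because each IP is stored once in the dictionary, this produces exactly one output line per line of SecondFile.txt, even when FirstFile.txt repeats an IP.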
