Comparing two files using awk

Answer 1

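The classic tool for this is join:

join -t: -1 1 -2 2 -o 2.1,1.1,1.2 <(sort -t: -k1,1 file1) <(sort -t: -k2,2 file2)

-t:               use a colon as the field separator.
-1 1              the join field of file1 is its first field.
-2 2              the join field of file2 is its second field.
-o 2.1,1.1,1.2    the output format: field 1 of file2, then fields 1 and 2 of file1.
<(...)            both files must be sorted on their join field (-k1,1 and -k2,2); -t: again sets the colon as the separator, this time for sort.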
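To make the sorting requirement concrete, here is a minimal sketch that runs the same kind of join on sample data. The file contents are an assumption reconstructed from the output shown elsewhere in this thread (file1 holds key:item, file2 holds user:key), and temporary files stand in for the process substitutions so that it also runs under a plain POSIX sh.

```shell
#!/bin/sh
# Hypothetical sample data, reconstructed from the output quoted in this thread.
tmp=$(mktemp -d)
printf '%s\n' '29482164591748:computer' '68468468468464:keyboard' > "$tmp/file1"
printf '%s\n' 'bart:29482164591748' 'smithers:68468468468464' 'lisa:68468468468464' > "$tmp/file2"

# join expects each input to be sorted on its own join field,
# using the same collation for sort and join (hence LC_ALL=C everywhere).
LC_ALL=C sort -t: -k1,1 "$tmp/file1" > "$tmp/file1.sorted"   # file1: key is field 1
LC_ALL=C sort -t: -k2,2 "$tmp/file2" > "$tmp/file2.sorted"   # file2: key is field 2

# Join file1 field 1 against file2 field 2 and print user:key:item.
joined=$(LC_ALL=C join -t: -1 1 -2 2 -o 2.1,1.1,1.2 "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

Because the inputs are sorted first, the rows come out in join-key order rather than in the original order of file2, which is one reason to prefer the awk approach when the input order matters.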

Answer 2

And with awk:

awk -F: 'NR==FNR{a[$1]=$2;next}a[$2]{print $1":"$2":"a[$2]}' file1 file2

Output:

bart:29482164591748:computer
smithers:68468468468464:keyboard
lisa:68468468468464:keyboard

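Explanation:

awk -F:                  run awk with a colon as the field separator.
NR==FNR{}                process only the first file.
a[$1]=$2;next            build an array a indexed by the first field and holding the second field, then skip to the next line.
a[$2]{}                  run the action only when the array holds a non-empty value at the index given by the current second field (this is reached only for file2, because of the next in the previous block).
print $1":"$2":"a[$2]    print everything in the required format.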
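The two-file idiom explained above can be sketched on sample data as follows. The file contents and the unmatched user homer are assumptions made for illustration, and the array is renamed item for readability. This variant tests membership with ($2 in item) instead of the truth value of a[$2]; that is a small swap, not the original answer's code, which avoids creating empty array entries for unmatched keys and still matches when the stored value is empty or 0.

```shell
#!/bin/sh
# Hypothetical sample data: file1 is key:item, file2 is user:key.
tmp=$(mktemp -d)
printf '%s\n' '29482164591748:computer' '68468468468464:keyboard' > "$tmp/file1"
printf '%s\n' 'bart:29482164591748' 'homer:00000000000000' \
              'smithers:68468468468464' 'lisa:68468468468464' > "$tmp/file2"

matched=$(awk -F: '
    # First file only (NR equals FNR): remember the item for each key.
    NR == FNR { item[$1] = $2; next }
    # Second file: print user:key:item when the key was seen in the first file.
    ($2 in item) { print $1 ":" $2 ":" item[$2] }
' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$matched"

rm -rf "$tmp"
```

The lookup file has to come first on the command line, since NR==FNR is only true while awk is reading its first input. Unlike join, the output keeps the line order of the second file.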

After the question was edited:

awk -F: 'NR==FNR{a[$1]=$2;next}a[$2]{print $1":"$2":"a[$2]}' file2 file1

Output:

bart:29482164591748:computer
 apu:29482164591748:computer
smithers:68468468468464:keyboard
lisa:68468468468464:keyboard

Answer 3

I wouldn't use awk for this; I'd use perl.

#!/usr/bin/env perl
use strict;
use warnings;

#open both files for reading
open( my $input1, '<', "file1.txt" ) or die $!;
open( my $input2, '<', "file2.txt" ) or die $!;

#read the key-values into a hash called lookup. 
my %lookup = do { local $/; <$input1> =~ m/(\d+):(\w+)/g; };

#iterate by line of second file
while ( <$input2> ) { 
    #trim trailing linefeeds
    chomp;
    #split current line on :
    my ( $user, $key ) = split /:/;
    #if exists in original lookup, display record 
    if ( $lookup{$key} ) {
        print join ( ":", $user, $key, $lookup{$key}),"\n";
    }
}

However, I get slightly different output, specifically:

bart:29482164591748:computer
smithers:68468468468464:keyboard
lisa:68468468468464:keyboard

I don't see why the second two lines shouldn't be printed, given that their key values match.

If you want a one-liner that does essentially the same thing (it keeps only the first user seen for each key):

perl -F: -lane 'print "$k{$F[0]}:$_" if $k{$F[0]}; $k{$F[1]} //= $F[0];' file2.txt file1.txt
