Merging 2 files with awk


I want to merge 2 files based on columns 1, 2, and 3. I tried the awk command below, but it does not work.

awk 'NR==FNR {h[$1FS$2FS$3]=$4; next}{k=$1FS$2FS$3; if (k in h) print $1,$2,$3,$4,h[k] ;else print $1,$2,$3,$4,"NA"}1' FS=\| OFS=\| file2.txt 

file1.txt:

Student1|Class 1A|27|20140804 08:16:54
Student2|Class 1B|15|20140804 10:10:10
Student3|Class 1C|17|20140804 15:02:14
Student4|Class 1D|20|20140804 18:02:14
Student5|Class 2D|10|20140804 20:02:14

file2.txt:

Student1|Class 1A|27|20140805 08:16:54
Student2|Class 1B|15|20140805 10:10:10
Student4|Class 1D|20|20140805 18:02:14
Student5|Class 2D|10|20140805 20:02:14

Expected result:

Student1|Class 1A|27|20140804 08:16:54|20140805 08:16:54
Student2|Class 1B|15|20140804 10:10:10|20140805 10:10:10
Student3|Class 1C|17|20140804 15:02:14|NA
Student4|Class 1D|20|20140804 18:02:14|20140805 18:02:14
Student5|Class 2D|10|20140804 20:02:14|20140805 20:02:14

Answer 1

Try this. It builds an array as in your example and prints the merged fields only in the END { ... } block.

$ awk -F\| '{ k=$1 FS $2 FS $3; h[k] = (k in h) ? h[k]=h[k] FS $4 : $0 } END { for(x in h){printf "%s%s\n",h[x],(length(h[x])>38) ? "" : "|NA"}}' file1.txt file2.txt|sort
Student1|Class 1A|27|20140804 08:16:54|20140805 08:16:54
Student2|Class 1B|15|20140804 10:10:10|20140805 10:10:10
Student3|Class 1C|17|20140804 15:02:14|NA
Student4|Class 1D|20|20140804 18:02:14|20140805 18:02:14
Student5|Class 2D|10|20140804 20:02:14|20140805 20:02:14
$
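
The length(h[x])>38 test works because every unmerged sample row is exactly 38 characters long, so it would misfire on rows of a different width. Below is a minimal sketch of the same END-based idea that counts how often each key was seen instead. It assumes file1.txt contains every key and that a key appears at most once per file. The seen array name is my own, and the input is a shortened copy of the sample data.

```shell
# Shortened sample input: one key present in both files, one only in file1.txt.
printf '%s\n' \
  'Student1|Class 1A|27|20140804 08:16:54' \
  'Student3|Class 1C|17|20140804 15:02:14' > file1.txt
printf '%s\n' \
  'Student1|Class 1A|27|20140805 08:16:54' > file2.txt

result=$(awk -F'|' '
  { k = $1 FS $2 FS $3 }                      # key on columns 1-3
  !(k in h) { h[k] = $0; seen[k] = 1; next }  # first sighting: keep the whole line
  { h[k] = h[k] FS $4; seen[k]++ }            # later sighting: append column 4
  END {
    # A key seen only once has no partner row, so pad it with NA.
    for (k in h) print h[k] (seen[k] > 1 ? "" : FS "NA")
  }
' file1.txt file2.txt | sort)
printf '%s\n' "$result"
```

As in the command above, the for (k in h) loop returns keys in no particular order, which is why the output is piped through sort.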

Answer 2

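If it is safe to assume that the first file contains the complete list of keys (i.e. the students), you can start by appending column 4 for every record. Then, in the END block, you sort the keys, print the appended values, and backfill missing values with "NA".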

$ awk -F\| '{k=$1 FS $2 FS $3;r[k]=r[k] FS $4;c[k]++}
    END{n=asorti(r,s);
        for(i=1;i<=n;i++){
            print s[i] substr(r[s[i]],1) (++c[s[i]] == ARGC ? "" : FS "NA")
        }
    }' file1.txt file2.txt
Student1|Class 1A|27|20140804 08:16:54|20140805 08:16:54
Student2|Class 1B|15|20140804 10:10:10|20140805 10:10:10
Student3|Class 1C|17|20140804 15:02:14|NA
Student4|Class 1D|20|20140804 18:02:14|20140805 18:02:14
Student5|Class 2D|10|20140804 20:02:14|20140805 20:02:14

I am using ++c[s[i]] == ARGC for the backfill comparison, where ARGC is the number of input files + 1 (the extra one being the awk command itself).
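
A quick way to check what ARGC holds is sketched below. It also shows one caveat: operands of the form var=value (such as FS=\| in the question's command) are counted in ARGC too, so the comparison above only holds when the separator is set with -F or -v. The variable names argc_plain and argc_assign are my own. Because the programs contain only a BEGIN block, awk never opens the files, so they do not need to exist.

```shell
# ARGV[0] is "awk", followed by the two file operands.
argc_plain=$(awk 'BEGIN { print ARGC }' file1.txt file2.txt)

# A var=value operand is stored in ARGV as well, so it raises ARGC by one.
argc_assign=$(awk 'BEGIN { print ARGC }' FS='|' file1.txt file2.txt)

printf '%s %s\n' "$argc_plain" "$argc_assign"   # prints "3 4"
```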

Answer 3

I tried this and it works:

/usr/xpg4/bin/awk 'NR==FNR {h[$1,$2,$3]=$4; next}{print $0,($1,$2,$3) in h?h[$1,$2,$3]:"NA"}' FS=\| OFS=\| file2.txt file1.txt
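
For anyone wanting to reproduce this, here is a self-contained sketch of the same lookup approach. It assumes a plain POSIX awk behaves like /usr/xpg4/bin/awk here, uses a shortened copy of the sample data, and adds parentheses around the in test. Compared with the command in the question, there are two differences. The lookup file (file2.txt) is read first and file1.txt second. There is also no trailing 1 pattern, which would print every line a second time.

```shell
cat > file1.txt <<'EOF'
Student1|Class 1A|27|20140804 08:16:54
Student2|Class 1B|15|20140804 10:10:10
Student3|Class 1C|17|20140804 15:02:14
EOF
cat > file2.txt <<'EOF'
Student1|Class 1A|27|20140805 08:16:54
Student2|Class 1B|15|20140805 10:10:10
EOF

result=$(awk -F'|' -v OFS='|' '
  # First operand (file2.txt): remember column 4 under the 3-column key.
  NR == FNR { h[$1,$2,$3] = $4; next }
  # Second operand (file1.txt): print the line plus the match, or NA.
  { print $0, ((($1,$2,$3) in h) ? h[$1,$2,$3] : "NA") }
' file2.txt file1.txt)
printf '%s\n' "$result"
```

The output keeps the row order of file1.txt, so no sort is needed, and Student3 ends with |NA because it has no entry in file2.txt.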
