Join two files, adding values in certain columns

Answer 1

Using awk:

awk 'NR==FNR{ seen[$1FS$2]=$3FS$4; next } { print $0, seen[$6FS$7] }' file2 file1

And to drop empty lines from the output:

awk 'NR==FNR{ seen[$1FS$2]=$3FS$4; next } NF{ print $0, seen[$6FS$7] }' file2 file1

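Alternatively, a little whitespace and sensible variable names go a long way toward readability. Also, use a comma in the array key (awk joins the subscripts with SUBSEP):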

awk '
    NR == FNR {value[$1,$2] = $3 OFS $4; next} 
    {print $0, value[$6,$7]}
' file2.txt file1.txt
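
For anyone who wants to try this without the original files, here is a minimal sketch on made-up data. The file contents and the "- -" placeholder for unmatched rows are my own assumptions, not something from the question. It also uses the `(i,j) in array` test, so a missing key is reported instead of silently printing an empty value.

```shell
# Hypothetical sample data; the real file1.txt/file2.txt are not shown in the thread.
dir=$(mktemp -d)
printf '%s\n' '1 1 1 1 1 5 9 1' '' '2 2 2 2 2 7 8 2' '3 3 3 3 3 0 0 3' > "$dir/file1.txt"
printf '%s\n' '5 9 A B' '7 8 C D' > "$dir/file2.txt"

joined=$(awk '
    # First file (file2.txt): store columns 3 and 4 under the key (col1, col2)
    NR == FNR { value[$1,$2] = $3 OFS $4; next }
    # Second file: skip blank lines; append the stored value, or "- -" if no key matches
    NF { print $0, ((($6,$7) in value) ? value[$6,$7] : "- -") }
' "$dir/file2.txt" "$dir/file1.txt")

printf '%s\n' "$joined"
rm -r "$dir"
```

Testing with `in` does not create an array entry, whereas merely referencing `value[$6,$7]` for a missing key would add an empty one.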

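NR is set to 1 when awk reads the first record and is incremented for every further record read, across one or several input files, until all input has been read.
FNR is set to 1 when awk reads the first record of a file, is incremented for every further record of the current file, and is reset to 1 at the start of the next input file when there are several.
So NR == FNR is true only while the first file is being read, and the block that follows it runs only for that first file (file2 here).
seen is an awk associative array whose key is the combination of column $1 and column $2, and whose value is column $3 and column $4.
The next statement skips the rest of the commands, so those effectively run only for the files after the first one.
NF holds the Number of Fields in the current record, as split by the Field Separator FS. FS is placed between the columns in the key so that it keeps a proper field separator, or you can use a comma inside the array subscript instead.
So NF{ print $0, seen[$6FS$7] } prints, whenever the record is not an empty line, the current record $0 of file1 followed by the value stored in the array for column $6 and column $7.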
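
To see the two counters side by side, here is a small sketch. The file names first.txt and second.txt and their contents are hypothetical, chosen only for the demo. Each output line shows the record, NR, FNR, and 1 or 0 for whether NR == FNR holds.

```shell
# Hypothetical two-file input, just to watch the counters move
dir=$(mktemp -d)
printf '%s\n' 'a' 'b' > "$dir/first.txt"
printf '%s\n' 'c' 'd' 'e' > "$dir/second.txt"

# Columns: record, NR (global count), FNR (per-file count), NR == FNR as 1/0
counters=$(awk '{ print $1, NR, FNR, (NR == FNR) }' "$dir/first.txt" "$dir/second.txt")

printf '%s\n' "$counters"
rm -r "$dir"
```

The last column is 1 for the two records of first.txt and 0 for everything in second.txt. One caveat: if the first file is empty, no record is read from it, so NR == FNR is then true for the second file instead.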

Answer 2

I know you didn't ask for a database solution, but if you happen to have a MySQL server around, here is how to do it:

create table file1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 int, c7 int, c8 int);
create table file2 (c1 int, c2 int, c3 char, c4 char);
load data infile 'file1' into table file1 fields terminated by ' ';
load data infile 'file2' into table file2 fields terminated by ' ';
select f1.*, f2.c3, f2.c4 from file1 as f1 
    join file2 as f2 
        on f1.c6 = f2.c1 and f1.c7 = f2.c2 
    order by f1.c1;

(I also had to remove the blank lines from the input files.)

Result:

+------+------+------+------+------+------+------+------+------+------+
| c1   | c2   | c3   | c4   | c5   | c6   | c7   | c8   | c3   | c4   |
+------+------+------+------+------+------+------+------+------+------+
|    1 |    1 |    1 |    1 |    1 |    5 |    9 |    1 | A    | B    |
|    2 |    2 |    2 |    2 |    2 |    7 |    8 |    2 | C    | D    |
|    3 |    3 |    3 |    3 |    3 |    7 |    7 |    3 | G    | H    |
|    4 |    4 |    4 |    4 |    4 |    8 |    6 |    4 | E    | F    |
+------+------+------+------+------+------+------+------+------+------+
4 rows in set (0,00 sec)

Answer 3

In response to @Jos's answer: sqlite

db=$(mktemp)
sqlite3 "$db" <<'END'
create table f1 (v1 text,v2 text,v3 text,v4 text,v5 text,v6 text,v7 text,v8 text);
create table f2 (v1 text,v2 text,v3 text,v4 text);
.separator " "
.import file1.txt f1
.import file2.txt f2
select f1.*, f2.v3, f2.v4 from f1,f2 where f1.v6=f2.v1 and f1.v7=f2.v2;
END
rm "$db"

Or as nearly a one-liner (with no database file given, sqlite3 uses a temporary in-memory database):

sqlite3 -separator " "  <<'END'
create table f1 (v1, v2, v3, v4, v5, v6, v7, v8 );
create table f2 (v1, v2, v3, v4);
.import file1.txt f1
.import file2.txt f2
select f1.*, f2.v3, f2.v4 from f1,f2 where f1.v6=f2.v1 and f1.v7=f2.v2;
END

Answer 4

While I'm pretty sure someone will come up with a better one-line awk solution, this approach works:

cp file1.txt output.txt &&
while read -r file2_line; do
    # Empty line --> continue
    [[ -z "$file2_line" ]] && continue
    # Find matching line (-x: whole line must match, -F: literal string, so "7 8" does not hit "17 8")
    file1_matching_line=$(grep -nxF -- "$(echo "$file2_line" | cut -d' ' -f 1,2)" <(cut -d' ' -f6,7 output.txt) | grep -Po "^[0-9]+");
    # no find? continue!
    [[ ! $? -eq 0 ]] && continue
    # Add the fields 3 and 4 of file2 to the end of the matching line of output.txt
    echo "$file1_matching_line" | while read -r ml; do
        sed -i "${ml}s/$/ $(echo "$file2_line" | cut -d' ' -f 3,4)/" output.txt
    done
done < file2.txt && cat output.txt

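The magic happens in this line:

file1_matching_line=[...]

It finds the line numbers (-n) of all occurrences of fields 1 and 2 of the current file2 line

$(echo "$file2_line" | cut -d' ' -f 1,2)

in fields 6 and 7 of output.txt, which is a copy of file1.txt:

<(cut -d' ' -f6,7 output.txt)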
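
One thing to watch with this lookup: a plain grep pattern is an unanchored regular expression, so a key can also match inside a longer line. The sketch below uses made-up field pairs (not data from the question) to compare a plain grep -n with grep -nxF, where -x requires the whole line to match and -F treats the pattern as a literal string.

```shell
# Hypothetical "field 6 and 7" column, as cut -d' ' -f6,7 output.txt might produce it
pairs=$(printf '%s\n' '7 8' '17 8' '7 80')

# Unanchored regex: every line that merely contains "7 8" is reported
loose=$(printf '%s\n' "$pairs" | grep -n '7 8' | cut -d: -f1 | paste -s -d ' ' -)

# Whole-line, literal match: only the line that is exactly "7 8"
exact=$(printf '%s\n' "$pairs" | grep -nxF '7 8' | cut -d: -f1 | paste -s -d ' ' -)

echo "loose: $loose"
echo "exact: $exact"
```

With the loose match, lines 2 and 3 would also get values appended by the sed step even though their keys differ.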

