I have two files.
File 1:
Dave 734.838.9800
Bob 313.123.4567
Carol 248.344.5576
Mary 313.449.1390
Ted 248.496.2204
Alice 616.556.4458
File 2:
Bob Tuesday
Carol Monday
Ted Sunday
Alice Wednesday
Dave Thursday
Mary Saturday
I merged the two files.
file3 should look like this:
Name On-Call Phone
Carol MONDAY 248.344.5576
Bob TUESDAY 313.123.4567
Alice WEDNESDAY 616.556.4458
Dave THURSDAY 734.838.9800
Nobody FRIDAY 634.296.3356
Mary SATURDAY 313.449.1390
Ted SUNDAY 248.496.2204
But I can't get the weekdays into the right order. How can I do that?
Answer 1
Here is another way (short version, no temporary files):
{ printf %s\\n "Name On-Call Phone";
join -a1 -j2 -o 1.1 2.1 1.2 2.3 -e "Nobody" \
<(printf %s\\n '5 Friday' '1 Monday' '6 Saturday' '7 Sunday' '4 Thursday' '2 Tuesday' '3 Wednesday') \
<(join <(sort file2) <(sort file1) | sort -k2) | sort -k2n | sort -k1n | \
cut -d' ' -f 2-; } | column -t
If you absolutely need the day names in uppercase, then:
{ printf %s\\n "Name On-Call Phone"; join -a1 -j2 -o 1.1 2.1 1.3 2.3 -e "Nobody" <(cat <<IN
5 Friday FRIDAY
1 Monday MONDAY
6 Saturday SATURDAY
7 Sunday SUNDAY
4 Thursday THURSDAY
2 Tuesday TUESDAY
3 Wednesday WEDNESDAY
IN
) <(join <(sort file2) <(sort file1) | sort -k2) | sort -k2n | sort -k1n | cut -d' ' -f 2-; } | column -t
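As an alternative to carrying a third, uppercase column in the lookup table, the day name could also be uppercased after the join. This is only a minimal sketch: the helper name upper_days is hypothetical, the sample lines are illustrative, and it assumes the day is always the second whitespace-separated field.

```shell
# Hypothetical helper: uppercase the second field of every input line.
# Assigning to $2 makes awk rebuild the line with single spaces between fields.
upper_days() {
  awk '{ $2 = toupper($2); print }'
}

# Illustrative input in the "Name Day Phone" layout used above.
printf '%s\n' 'Carol Monday 248.344.5576' 'Nobody Friday Nobody' | upper_days
```

It would have to run on the body lines only, before the header is added, otherwise "On-Call" in the header would be uppercased too.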
Long version:
Suppose we have two files, file1:
Dave 734.838.9800
Bob 313.123.4567
Carol 248.344.5576
Mary 313.449.1390
Ted 248.496.2204
Alice 616.556.4458
Jimmy 324.555.8867
Harry 422.858.2354
Lou 788.907.6859
and file2:
Bob Tuesday
Carol Monday
Jimmy Wednesday
Ted Sunday
Alice Wednesday
Dave Thursday
Harry Monday
Mary Saturday
Lou Sunday
We create file3 with the following content:
1 Monday MONDAY
2 Tuesday TUESDAY
3 Wednesday WEDNESDAY
4 Thursday THURSDAY
5 Friday FRIDAY
6 Saturday SATURDAY
7 Sunday SUNDAY
Then run:
{ printf %s\\n "Name On-Call Phone"; \
join <(sort file2) <(sort file1) | sort -k2 | \
join -a1 -j2 -o 1.1 2.1 1.3 2.3 -e "Nobody" <(sort -k2 file3) - \
| sort -k1n | cut -d' ' -f 2-; } | column -t
Or, as a one-liner:
{ printf %s\\n "Name On-Call Phone"; join <(sort file2) <(sort file1) | sort -k2 | join -a1 -j2 -o 1.1 2.1 1.3 2.3 -e "Nobody" <(sort -k2 file3) - | sort -k1n | cut -d' ' -f 2-; } | column -t
Output:
Name On-Call Phone
Carol MONDAY 248.344.5576
Harry MONDAY 422.858.2354
Bob TUESDAY 313.123.4567
Alice WEDNESDAY 616.556.4458
Jimmy WEDNESDAY 324.555.8867
Dave THURSDAY 734.838.9800
Nobody FRIDAY Nobody
Mary SATURDAY 313.449.1390
Lou SUNDAY 788.907.6859
Ted SUNDAY 248.496.2204
How it works:
join <(sort file2) <(sort file1) | sort -k2
joins file2 and file1 on their first field (the name), then sorts the output on the second column:
Carol Monday 248.344.5576
Harry Monday 422.858.2354
Mary Saturday 313.449.1390
Ted Sunday 248.496.2204
Lou Sunday 788.907.6859
Dave Thursday 734.838.9800
Bob Tuesday 313.123.4567
Jimmy Wednesday 324.555.8867
Alice Wednesday 616.556.4458
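As a small, self-contained illustration of this first step, the sketch below runs it on three sample people. It uses temporary files instead of process substitution so that it also runs in a plain POSIX sh, and it pins LC_ALL=C and uses -k2,2 to make the ordering predictable. These are assumptions of the sketch, not part of the command above.

```shell
# Illustrative data only; a temporary directory stands in for the real files.
tmp=$(mktemp -d)
printf '%s\n' 'Dave 734.838.9800' 'Bob 313.123.4567' 'Carol 248.344.5576' > "$tmp/file1"
printf '%s\n' 'Bob Tuesday' 'Carol Monday' 'Dave Thursday' > "$tmp/file2"

# join expects both inputs sorted on the join field (field 1, the name).
LC_ALL=C sort "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort "$tmp/file2" > "$tmp/file2.sorted"

# Join on the name, then order the result by day name (field 2).
step1=$(LC_ALL=C join "$tmp/file2.sorted" "$tmp/file1.sorted" | LC_ALL=C sort -k2,2)
printf '%s\n' "$step1"
rm -r "$tmp"
```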
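This output is then piped to
join -a1 -j2 -o 1.1 2.1 1.3 2.3 -e "Nobody" <(sort -k2 file3) -
which joins it with file3 on the second field; -a1 adds the unpairable lines from file3 to the output, and -e "Nobody" replaces missing output fields with "Nobody":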
5 Nobody FRIDAY Nobody
1 Carol MONDAY 248.344.5576
1 Harry MONDAY 422.858.2354
6 Mary SATURDAY 313.449.1390
7 Ted SUNDAY 248.496.2204
7 Lou SUNDAY 788.907.6859
4 Dave THURSDAY 734.838.9800
2 Bob TUESDAY 313.123.4567
3 Jimmy WEDNESDAY 324.555.8867
3 Alice WEDNESDAY 616.556.4458
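The effect of -a1 and -e can be seen in isolation with a three-day lookup table. In this sketch the join field is written as -1 2 -2 2 and the -o list is comma-separated. Both are assumed to behave the same as the -j2 and space-separated spellings above, and are used here only because they are the more portable forms. The data and the temporary file are illustrative.

```shell
tmp=$(mktemp -d)
# Lookup table (day number, day name, uppercase name), already sorted on field 2.
printf '%s\n' '5 Friday FRIDAY' '1 Monday MONDAY' '2 Tuesday TUESDAY' > "$tmp/days"

# Input on stdin is already joined and sorted on field 2; nobody is on call on Friday.
step2=$(printf '%s\n' 'Carol Monday 248.344.5576' 'Bob Tuesday 313.123.4567' |
  LC_ALL=C join -a1 -1 2 -2 2 -o 1.1,2.1,1.3,2.3 -e "Nobody" "$tmp/days" -)
printf '%s\n' "$step2"
rm -r "$tmp"
```

The Friday line has no partner on stdin, so both fields taken from the second input (2.1 and 2.3) are filled with "Nobody".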
The result is piped again to sort -k1n | cut -d' ' -f 2-
which sorts it numerically on the first field and then removes that field:
Carol MONDAY 248.344.5576
Harry MONDAY 422.858.2354
Bob TUESDAY 313.123.4567
Alice WEDNESDAY 616.556.4458
Jimmy WEDNESDAY 324.555.8867
Dave THURSDAY 734.838.9800
Nobody FRIDAY Nobody
Mary SATURDAY 313.449.1390
Lou SUNDAY 788.907.6859
Ted SUNDAY 248.496.2204
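The same sort-then-strip step can be tried on three sample lines. This is a minimal sketch with illustrative input; LC_ALL=C is an assumption added only to keep tie-breaking predictable.

```shell
# Sort numerically on the leading day number, then drop that helper column.
step3=$(printf '%s\n' \
    '5 Nobody FRIDAY Nobody' \
    '1 Carol MONDAY 248.344.5576' \
    '2 Bob TUESDAY 313.123.4567' |
  LC_ALL=C sort -k1n | cut -d' ' -f 2-)
printf '%s\n' "$step3"
```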
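All of this is grouped inside {...} together with printf %s\\n "Name On-Call Phone", which prints the header, so the entire output, header included, is then piped to column -t to prettify it.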
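The grouping can be sketched on its own: everything inside the braces writes to one stream, so a single formatter sees the header and the body together. The awk printf below is only a stand-in for column -t, so the sketch does not depend on column being installed; the fixed column widths and the sample rows are assumptions.

```shell
table=$(
  {
    # Header and body are emitted by the same command group ...
    printf '%s\n' 'Name On-Call Phone'
    printf '%s\n' 'Carol MONDAY 248.344.5576' 'Nobody FRIDAY Nobody'
  } | awk '{ printf "%-8s%-10s%s\n", $1, $2, $3 }'   # ... so one pipe formats both
)
printf '%s\n' "$table"
```

Without the braces, only the last command before the pipe would be formatted and the header would be printed unaligned.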
You can skip one sort if file3 is already sorted on the second column (e.g. this time using a simple two-column file3):
5 Friday
1 Monday
6 Saturday
7 Sunday
4 Thursday
2 Tuesday
3 Wednesday
and assign a phone number to "Nobody", e.g. with sed 's/Nobody/888.000.8888/2':
{ printf %s\\n "Name On-Call Phone"; join <(sort file2) <(sort file1) | \
sort -k2 | join -a1 -j2 -o 1.1 2.1 1.2 2.3 -e "Nobody" file3 - | sort -k1n | \
cut -d' ' -f 2-; } | sed 's/Nobody/888.000.8888/2' | column -t
Output:
Name On-Call Phone
Carol Monday 248.344.5576
Harry Monday 422.858.2354
Bob Tuesday 313.123.4567
Alice Wednesday 616.556.4458
Jimmy Wednesday 324.555.8867
Dave Thursday 734.838.9800
Nobody Friday 888.000.8888
Mary Saturday 313.449.1390
Lou Sunday 788.907.6859
Ted Sunday 248.496.2204
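The trailing 2 in the sed command is what keeps the name column intact: it replaces only the second match on each line. A minimal sketch, with a hypothetical helper name (fill_phone) and illustrative input:

```shell
# Replace only the second "Nobody" on a line (the phone column),
# leaving the first one (the name column) untouched.
fill_phone() {
  sed 's/Nobody/888.000.8888/2'
}

printf '%s\n' 'Nobody Friday Nobody' 'Carol Monday 248.344.5576' | fill_phone
```

Lines with fewer than two matches pass through unchanged, so the header and the regular rows are not affected.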
Answer 2
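awk '
  BEGIN {
    print "Name On-Call Phone"
    # days[1..7] holds the weekday names in calendar order
    split("MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY SATURDAY SUNDAY", days)
  }
  # first file (file2): remember each person and their on-call day
  NR == FNR { day[$1] = $2; next }
  # second file (file1): build the output line, keyed by uppercase day
  { lines[toupper(day[$1])] = $1 OFS toupper(day[$1]) OFS $2 }
  END {
    # walk the week in order; days without an entry get "Nobody"
    for (i = 1; i <= 7; i++) {
      if (lines[days[i]])
        print lines[days[i]]
      else
        print "Nobody", days[i]
    }
  }
' file2 file1 | column -t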
Output:
Name On-Call Phone
Carol MONDAY 248.344.5576
Bob TUESDAY 313.123.4567
Alice WEDNESDAY 616.556.4458
Dave THURSDAY 734.838.9800
Nobody FRIDAY
Mary SATURDAY 313.449.1390
Ted SUNDAY 248.496.2204
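The awk script above keeps one line per day, so if two people shared a day (as in the longer sample data of Answer 1) the later one would overwrite the earlier. A possible variation that appends instead is sketched below. The function name oncall_table is hypothetical, and the sketch assumes the same two-column layouts as file2 and file1.

```shell
# Hypothetical helper. $1: "name day" file, $2: "name phone" file.
oncall_table() {
  awk '
    BEGIN { n = split("MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY SATURDAY SUNDAY", days) }
    # first file: remember the uppercase day for each name
    NR == FNR { day[$1] = toupper($2); next }
    # second file: append a finished line to that day, keeping earlier entries
    ($1 in day) {
      d = day[$1]
      lines[d] = lines[d] $1 OFS d OFS $2 ORS
    }
    END {
      # walk the week in order; each stored entry already ends with a newline
      for (i = 1; i <= n; i++) {
        if (days[i] in lines) printf "%s", lines[days[i]]
        else print "Nobody", days[i]
      }
    }
  ' "$1" "$2"
}

# Possible usage, with the header added as in the answers above:
# { echo "Name On-Call Phone"; oncall_table file2 file1; } | column -t
```

People who share a day are listed in the order they appear in the phone file.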