I want to split the data into fields like this:
$1 $2 $3 $4 $5 $6 $7 $8 .........
---------------------------------------------------------------------------------------------------
root tty5 Wed Dec 18 13:42:28 2019 still logged in
~~~~~~~~~~~~~ ~~~
^ ^
root tty5 Wed Dec 18 11:23:20 2019 - Wed Dec 18 11:24:47 2019 (00:01)
john pts/2 xx.xxx.xx.xxx Tue Sep 3 10:11:31 2019 - Tue Sep 3 10:21:18 2019 (00:09)
john pts/3 xx.xxx.xx.xxx Mon Sep 2 14:42:29 2019 - Mon Sep 2 14:57:33 2019 (00:15)
john pts/2 xx.xxx.xx.xxx Mon Sep 2 14:40:03 2019 - Mon Sep 2 14:45:27 2019 (00:05)
john pts/2 xx.xxx.xx.xxx Mon Sep 2 13:52:09 2019 - Mon Sep 2 14:34:12 2019 (00:42)
john pts/3 xx.xxx.xx.xxx Mon Sep 2 13:14:39 2019 - Mon Sep 2 14:03:24 2019 (00:48)
john pts/2 xx.xxx.xx.xxx Mon Sep 2 13:08:11 2019 - Mon Sep 2 13:23:16 2019 (00:15)
john pts/2 xx.xxx.xx.xxx Mon Sep 2 10:22:27 2019 - Mon Sep 2 11:10:48 2019 (00:48)
john pts/2 xx.xxx.xx.xxx Fri Aug 30 17:25:19 2019 - Fri Aug 30 17:33:34 2019 (00:08)
john pts/2 xx.xxx.xx.xxx Wed Aug 28 10:43:56 2019 - Wed Aug 28 10:52:48 2019 (00:08)
john pts/2 xx.xxx.xx.xxx Tue Aug 27 16:59:30 2019 - Tue Aug 27 17:52:50 2019 (00:53)
john pts/2 xx.xxx.xx.xxx Tue Aug 6 11:06:46 2019 - Tue Aug 6 11:12:05 2019 (00:05)
john pts/2 xx.xxx.xx.xxx Tue Aug 6 10:48:39 2019 - Tue Aug 6 11:01:46 2019 (00:13)
john pts/2 xx.xxx.xx.xxx Tue Aug 6 10:38:18 2019 - Tue Aug 6 10:43:18 2019 (00:05)
john pts/2 xx.xxx.xx.xxx Tue Aug 6 10:28:02 2019 - Tue Aug 6 10:36:04 2019 (00:08)
john pts/2 xx.xxx.xx.xxx Fri Aug 2 14:24:00 2019 - Fri Aug 2 14:24:16 2019 (00:00)
root tty5 Fri Aug 2 14:21:30 2019 - Fri Nov 22 11:03:20 2019 (111+20:41)
root tty5 Fri Jul 26 11:02:17 2019 - Fri Jul 26 11:03:58 2019 (00:01)
john pts/3 xx.xxx.xx.xxx Thu Jul 25 16:24:44 2019 - Thu Jul 25 16:33:36 2019 (00:08)
john pts/2 xx.xxx.xx.xxx Thu Jul 25 16:08:41 2019 - Thu Jul 25 16:33:53 2019 (00:25)
However, when $3 is empty, I can't get the correct value for $3 or for the fields after it. For example:
$ last -F | grep -E 'tty|pty|pts' | awk '{print $3}'
Wed <- not correct
Wed <- not correct
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
xx.xxx.xx.xxx
Fri <- not correct
Fri <- not correct
xx.xxx.xx.xxx
xx.xxx.xx.xxx
How can I parse this correctly with awk or a similar command-line tool?
Answer 1
In this particular case, let's use this filter:
awk '{if ($4 !~ /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$/) $3="- "$3; print}'
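If everything is OK, $4 is Mon or Tue or Wed or … If $3 is empty, the fields shift left by one, so $4 contains Jan or Feb or Mar … instead. We detect this: if $4 turns out not to be what we expect, we inject an additional field into $3 to shift the remaining fields back into place. The altered output is no longer columnar; it looks like this (fragment):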
root tty5 - Wed Dec 18 13:42:28 2019 still logged in
root tty5 - Wed Dec 18 11:23:20 2019 - Wed Dec 18 11:24:47 2019 (00:01)
john pts/2 xx.xxx.xx.xxx Tue Sep 3 10:11:31 2019 - Tue Sep 3 10:21:18 2019 (00:09)
john pts/3 xx.xxx.xx.xxx Mon Sep 2 14:42:29 2019 - Mon Sep 2 14:57:33 2019 (00:15)
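To see what the weekday test is looking at, it may help to print $4 for a line with a host and a line without one. This is only an illustrative sketch: the two sample lines are copied from the question rather than read from last, and the labels "host present" / "host missing" are made up for this example.

```shell
# Two sample lines from the question: one without a host field, one with.
sample='root tty5 Wed Dec 18 13:42:28 2019 still logged in
john pts/2 xx.xxx.xx.xxx Tue Sep 3 10:11:31 2019 - Tue Sep 3 10:21:18 2019 (00:09)'

# Print $4 and whether the weekday test from the filter accepts it.
report=$(printf '%s\n' "$sample" | awk '{
    ok = ($4 ~ /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$/) ? "host present" : "host missing"
    print $4 ": " ok
}')
printf '%s\n' "$report"
```

On the first line $4 is a month name, which is the signal the filter uses to insert the placeholder.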
To verify that it works, we can pipe the result to column -t:
root tty5 - Wed Dec 18 13:42:28 2019 still logged in
root tty5 - Wed Dec 18 11:23:20 2019 - Wed Dec 18 11:24:47 2019 (00:01)
john pts/2 xx.xxx.xx.xxx Tue Sep 3 10:11:31 2019 - Tue Sep 3 10:21:18 2019 (00:09)
john pts/3 xx.xxx.xx.xxx Mon Sep 2 14:42:29 2019 - Mon Sep 2 14:57:33 2019 (00:15)
But you don't need column to parse the result reliably from here on.
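For instance, the host column asked about in the question can then be read from $3. The following is a minimal sketch under a few assumptions: normalize_last is a hypothetical helper name wrapping the filter above, and the sample lines are copied from the question instead of calling last. A second awk is used because assigning to $3 rebuilds the line but does not re-split the fields inside the same awk program.

```shell
# normalize_last: hypothetical helper name; the body is the filter from the answer.
normalize_last() {
    awk '{if ($4 !~ /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$/) $3="- "$3; print}'
}

# Sample lines from the question. In real use the input would come from
# something like: LC_ALL=C last -F | grep -E 'tty|pty|pts'
sample='root tty5 Wed Dec 18 13:42:28 2019 still logged in
john pts/2 xx.xxx.xx.xxx Tue Sep 3 10:11:31 2019 - Tue Sep 3 10:21:18 2019 (00:09)'

# The second awk splits the rewritten line afresh, so $3 is the host or "-".
hosts=$(printf '%s\n' "$sample" | normalize_last | awk '{print $3}')
printf '%s\n' "$hosts"
```

With the placeholder in place, lines without a host yield "-" instead of the weekday.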
Notes:
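Your last uses English abbreviations; so does mine, even though my Kubuntu is mostly localized. I don't know whether a localized version of last exists, but if it does, you can force it to use English by specifying the C locale: LC_ALL=C last -F | …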
There is a column whose possible values are still and -. This lets you easily detect still logged in.
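As a sketch of that last note: once the placeholder has been inserted, the column in question is $9 (counting from the fragment above), so open sessions can be selected by comparing it with still. As before, normalize_last is a hypothetical helper name for the filter from this answer, and the sample lines come from the question.

```shell
# normalize_last: hypothetical helper name; the body is the filter from the answer.
normalize_last() {
    awk '{if ($4 !~ /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$/) $3="- "$3; print}'
}

# Sample lines from the question: one open session, two finished ones.
sample='root tty5 Wed Dec 18 13:42:28 2019 still logged in
root tty5 Wed Dec 18 11:23:20 2019 - Wed Dec 18 11:24:47 2019 (00:01)
john pts/2 xx.xxx.xx.xxx Tue Sep 3 10:11:31 2019 - Tue Sep 3 10:21:18 2019 (00:09)'

# After normalization $9 is "still" for open sessions and "-" for finished ones.
open_sessions=$(printf '%s\n' "$sample" | normalize_last | awk '$9 == "still" {print $1, $2}')
printf '%s\n' "$open_sessions"
```

Only the first sample line is reported, as user root on tty5.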