I have a text file containing log information about a web server.
Date format: day-month-year
Sample content:
/tmp/archive/9-10-2020/error_04.log.gz
/tmp/archive/9-10-2020/error_05.log.gz
/tmp/archive/9-7-2020/access_01.log.gz
/tmp/archive/9-7-2020/access_02.log.gz
/tmp/archive/9-7-2020/access_03.log.gz
/tmp/archive/9-7-2020/error_03.log.gz
/tmp/archive/9-7-2020/error_04.log.gz
/tmp/archive/9-7-2020/error_05.log.gz
/tmp/archive/9-8-2020/error_01.log.gz
/tmp/archive/9-8-2020/error_02.log.gz
/tmp/archive/9-8-2020/error_03.log.gz
/tmp/archive/9-8-2020/error_04.log.gz
/tmp/archive/9-8-2020/error_05.log.gz
/tmp/archive/9-9-2020/access_01.log.gz
/tmp/archive/9-9-2020/access_02.log.gz
/tmp/archive/9-9-2020/access_03.log.gz
I want to list this content in date order (the third column). I tried the sort command, but it did not sort by date.
Expected output:
/tmp/archive/9-7-2020/access_01.log.gz
/tmp/archive/9-7-2020/access_02.log.gz
/tmp/archive/9-7-2020/access_03.log.gz
/tmp/archive/9-7-2020/error_03.log.gz
/tmp/archive/9-7-2020/error_04.log.gz
/tmp/archive/9-7-2020/error_05.log.gz
/tmp/archive/9-8-2020/error_01.log.gz
/tmp/archive/9-8-2020/error_02.log.gz
/tmp/archive/9-8-2020/error_03.log.gz
/tmp/archive/9-8-2020/error_04.log.gz
/tmp/archive/9-8-2020/error_05.log.gz
/tmp/archive/9-9-2020/access_01.log.gz
/tmp/archive/9-9-2020/access_02.log.gz
/tmp/archive/9-9-2020/access_03.log.gz
/tmp/archive/9-10-2020/error_04.log.gz
/tmp/archive/9-10-2020/error_05.log.gz
Update:
Sort syntax:
sort -k4.7,4.11 -k4,5
/tmp/backup/7-12-2020/access_04.log
/tmp/backup/7-12-2020/error_02.log
/tmp/backup/7-12-2020/error_03.log
/tmp/backup/7-12-2020/error_04.log
/tmp/backup/7-12-2020/error_05.log
/tmp/backup/8-11-2020/access_01.log
/tmp/backup/8-11-2020/access_02.log
/tmp/backup/8-12-2020/error_01.log
/tmp/backup/8-12-2020/error_02.log
/tmp/backup/8-12-2020/error_03.log
/tmp/backup/8-12-2020/error_04.log
/tmp/backup/8-12-2020/error_05.log
/tmp/backup/9-11-2020/access_01.log
/tmp/backup/9-11-2020/access_02.log
/tmp/backup/9-11-2020/access_03.log
/tmp/backup/9-11-2020/access_04.log
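Answer 1
For a specific pattern like this, you can split the path name into separate components on / and -, and put the ones needed for sorting at the start of the line,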
awk '{
split($0, f, "[/-]");
printf "%04d-%02d-%02d\t%s\t%s\n", f[6], f[5], f[4], f[7], $0
}'
then sort on the date (yyyy-mm-dd) and file name (e.g. access_NN.log.gz) accordingly,
sort
and finally strip the sort components:
cut -f3-
Assuming the sample data is in the file /tmp/logs, you can put it all together like this:
awk '{ split($0, f, "[/-]"); printf "%04d-%02d-%02d\t%s\t%s\n", f[6], f[5], f[4], f[7], $0 }' /tmp/logs |
sort |
cut -f3-
/tmp/archive/9-7-2020/access_01.log.gz
/tmp/archive/9-7-2020/access_02.log.gz
/tmp/archive/9-7-2020/access_03.log.gz
/tmp/archive/9-7-2020/error_03.log.gz
/tmp/archive/9-7-2020/error_04.log.gz
/tmp/archive/9-7-2020/error_05.log.gz
/tmp/archive/9-8-2020/error_01.log.gz
/tmp/archive/9-8-2020/error_02.log.gz
/tmp/archive/9-8-2020/error_03.log.gz
/tmp/archive/9-8-2020/error_04.log.gz
/tmp/archive/9-8-2020/error_05.log.gz
/tmp/archive/9-9-2020/access_01.log.gz
/tmp/archive/9-9-2020/access_02.log.gz
/tmp/archive/9-9-2020/access_03.log.gz
/tmp/archive/9-10-2020/error_04.log.gz
/tmp/archive/9-10-2020/error_05.log.gz
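The sample above has the same day on every line, so it may help to check the same decorate, sort, undecorate pipeline on lines where both day and month vary. Below is a minimal sketch using four lines taken from the question's update. The function name sort_by_date is hypothetical, and LC_ALL=C is an addition here (an assumption, to make sort compare bytes regardless of locale).

```shell
# Hypothetical helper: prefix each line with yyyy-mm-dd and the file name,
# sort on that prefix, then cut the prefix off again.
sort_by_date() {
  awk '{
    # f[4]=day, f[5]=month, f[6]=year, f[7]=file name
    split($0, f, "[/-]")
    printf "%04d-%02d-%02d\t%s\t%s\n", f[6], f[5], f[4], f[7], $0
  }' "$@" |
  LC_ALL=C sort |   # assumption: byte-wise comparison
  cut -f3-
}

# Four lines from the question's update, fed on standard input.
sorted=$(printf '%s\n' \
  /tmp/backup/7-12-2020/access_04.log \
  /tmp/backup/8-11-2020/access_01.log \
  /tmp/backup/8-12-2020/error_01.log \
  /tmp/backup/9-11-2020/access_01.log | sort_by_date)
printf '%s\n' "$sorted"
```

With day-month-year dates this should list the two November lines (8-11, 9-11) before the two December lines (7-12, 8-12). Since the awk script counts path components rather than characters, the same function works for /tmp/archive/ and /tmp/backup/ alike, as long as the date is the third path component.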
Answer 2
Here, assuming all the lines always start with /tmp/archive/ or a prefix of the same length, you could do:
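sort -t- -k3,3.4n -k2,2n -k1.14,1n -k3.5
That sorts on the year, then the month, then the day (the dates being day-month-year), then the file name. Here you can simplify it to:
sort -t- -k3n -k2n -k1.14n -k3.5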
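as - will not be a thousands separator in any locale (it is also the character for the negative sign), so with the n flag, key specifications such as -k1.14n (which on the first line would select 7-12-2020/access_04.log) or -k1.14,1n (which would select just 7) both yield the number 7.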
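To make the field layout concrete, here is a small sketch on three lines from the question's sample. With -t-, field 1 is /tmp/archive/9, field 2 is the month and field 3 starts with the year; /tmp/archive/ is 13 characters long, so the day starts at character 14 of field 1. The keys in this sketch are ordered year, month, day, to match the question's day-month-year format. The variable names are only for illustration.

```shell
# Three lines from the question's sample, deliberately out of order.
sample='/tmp/archive/9-10-2020/error_04.log.gz
/tmp/archive/9-7-2020/access_01.log.gz
/tmp/archive/9-9-2020/access_02.log.gz'

# Explicit key ends: year (field 3, chars 1-4), month (field 2),
# day (field 1 from char 14), then the rest of field 3 (the file name).
full=$(printf '%s\n' "$sample" | sort -t- -k3,3.4n -k2,2n -k1.14,1n -k3.5)

# Open-ended keys: numeric parsing stops at the first "-" or "/",
# so each key still yields the same number.
short=$(printf '%s\n' "$sample" | sort -t- -k3n -k2n -k1.14n -k3.5)

printf '%s\n' "$full"
```

Both variants should print the 9-7 line first, then 9-9, then 9-10. The character offset is tied to the prefix length: for the /tmp/backup/ lines in the question's update the prefix is 12 characters, so the day would start at character 13 instead.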