Convert formatted dates to seconds since the epoch

I have a file:

pablo tty8 Thu Nov 1 12:51:21 2012 still logged in 
(unknown tty8 Thu Nov 1 12:50:57 2012 - Thu Nov 1 12:51:21 2012 (00:00) 
pablo tty2 Thu Nov 1 12:50:39 2012 still logged in 
pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01) 
(unknown tty7 Thu Nov 1 12:34:32 2012 - Thu Nov 1 12:49:45 2012 (00:15)

I want to replace the dates in the file above with epoch timestamps. I want to print:

pablo tty8 1351770681 still logged in 
(unknown tty8 1351770657 - 1351770681 (00:00) 
pablo tty2 1351770639 still logged in 
pablo tty7 1351770585 - 1351770656 (00:01) 
(unknown tty7 1351769672 - 1351770585 (00:15)

I tried this command:

gawk --posix 'function my()
{"date -d \047"$0"\047 +%s" | getline b; 
gsub( /[A-Za-z]{3} [A-Za-z]{3} [0-9] ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/,b );print}
{ my() }' file

The command above does not work:

$ gawk --posix 'function my()
> {"date -d \047"$0"\047 +%s" | getline b; 
> gsub( /[A-Za-z]{3} [A-Za-z]{3} [0-9] ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/,b ); print}
> { my() }' ta
date: błędna data: `pablo tty8 Thu Nov 1 12:51:21 2012 still logged in '
pablo tty8  still logged in 
(unknown tty8 1351897200 - 1351897200 (00:00) 
date: błędna data: `pablo tty2 Thu Nov 1 12:50:39 2012 still logged in '
pablo tty2 1351897200 still logged in 
date: błędna data: `pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01) '
pablo tty7 1351897200 - 1351897200 (00:01) 
(unknown tty7 1351897200 - 1351897200 (00:15)

How can I improve the command above?

Answer 1

Here is another approach (using mktime):

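#!/bin/awk -f
# mktime() is a gawk extension; it expects "YYYY MM DD HH MM SS" with a numeric month.
BEGIN {
    # Map month abbreviations to numbers (Jan=1 ... Dec=12).
    n=split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",N," ")
    for (i=1;i<=n;i++) M[N[i]]=i
}
{
    split($6,A,":");
    S1=sprintf("%d %d %d %d %d %d",$7,M[$4],$5,A[1],A[2],A[3])
    T1=mktime(S1)
    if ($8=="-") {
        # Session has ended: convert the logout time as well.
        split($12,A,":");
        S2=sprintf("%d %d %d %d %d %d",$13,M[$10],$11,A[1],A[2],A[3])
        T2=mktime(S2)
        print $1,$2,T1,$8,T2,$14
    }
    else {
        print $1,$2,T1,$8,$9,$10
    }
}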

Answer 2

To do it your way, it would have to look like this:

POSIXLY_CORRECT=1 awk '
  {
    n = ""; r = $0
    while (match(r, /[[:alpha:]]{3} [[:alpha:]]{3} +[0-9]+ ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/)) {
      c = "date -d\"" substr(r,RSTART,RLENGTH) "\" +%s"
      c | getline b
      close(c)
      n = n substr(r,1,RSTART-1) b
      r =  substr(r,RSTART+RLENGTH)
    }
    print n r
  }'
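
The error output in the question suggests why the original attempt fails: date is handed the whole input line ($0) rather than only the date, which is why the command above isolates each match with substr before calling date. A minimal shell sketch of the difference, assuming GNU date (the helper name parses_as_date is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if GNU date can parse its argument.
parses_as_date() {
    date -d "$1" +%s >/dev/null 2>&1
}

line='pablo tty8 Thu Nov 1 12:51:21 2012 still logged in '

# The whole line is not a valid date string.
parses_as_date "$line" || echo "rejected: whole line"

# The date on its own parses without trouble.
parses_as_date 'Thu Nov 1 12:51:21 2012' && echo "accepted: date only"
```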

Answer 3

You can do this with GNU sed:

convert_date.sed

: a
s/(([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4})(.*)/\n\4\n\1/
h
s/.*\n//
s/^/date -d "/
s/$/" +%s/e
G
s/([^\n]+)\n([^\n]+)\n([^\n]+)\n.*/\2\1\3/
/([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/ta

Run it like this:

sed -rf convert_date.sed infile

Output:

pablo tty8 1351770681 still logged in 
(unknown tty8 1351770657 - 1351770681 (00:00) 
pablo tty2 1351770639 still logged in 
pablo tty7 1351770585 - 1351770656 (00:01) 
(unknown tty7 1351769672 - 1351770585 (00:15)

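Explanation

This may look a bit daunting at first, but the idea is not that complicated. The regular expression ([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4} appears in the first substitution and in the condition at the end. It matches the date format used in the input, and the first substitution uses it to capture and isolate one date. While date -d runs on the captured date, the surrounding bits are kept in hold space. Finally, all the bits are collected in pattern space and rearranged into the correct order.

If any dates remain in pattern space, the condition at the end repeats the process.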
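
The same isolate, convert, reassemble loop can be sketched in plain shell, which may be easier to follow than the sed version. This is only an illustration, not the answer's method: it assumes GNU date (for -d) and a grep that supports -o and -E, and the function name dates_to_epoch is invented here.

```shell
#!/bin/sh
# Sketch: replace each "Thu Nov 1 12:51:21 2012"-style date with its epoch value.
# Assumes GNU date and a grep with -o/-E; dates_to_epoch is a hypothetical name.
re='[A-Za-z]{3} [A-Za-z]{3} +[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}'

dates_to_epoch() {
    while IFS= read -r line || [ -n "$line" ]; do
        out=
        # Isolate the first remaining date, if any.
        while d=$(printf '%s\n' "$line" | grep -oE "$re" | head -n 1) && [ -n "$d" ]; do
            # Keep the text before the date, then append the converted value.
            out=$out${line%%"$d"*}$(date -d "$d" +%s)
            # Continue with whatever follows the date.
            line=${line#*"$d"}
        done
        printf '%s\n' "$out$line"
    done
}

# Example: one line from the question, converted in the local time zone.
printf '%s\n' 'pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01)' |
    dates_to_epoch
```

To process a whole file, redirect it into the function, for example dates_to_epoch < infile. The resulting numbers depend on the time zone in effect, since date interprets the timestamps as local time.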

Answer 4

The Perl solution provided by Stephane requires a non-core Perl module. A core module (since 5.10), Time::Piece, can be used in a similar way:

#!/usr/bin/env perl
use strict;
use warnings;
use Time::Piece;
my $t = Time::Piece->new;
while (<>) {
    s{\w{3}\s(\w{3}\s\d{1,2}\s\d\d:\d\d:\d\d\s\d{4})}
        {$t=Time::Piece->strptime($1,"%b %d %H:%M:%S %Y");
        sprintf "%s",$t->epoch}ge;
    print;
}
