Calculating time differences between lines of a log file

I have a very large log file containing the start and end times of calls:

Bigfile.txt:

2021-02-24 14:21:34,630;START
2021-02-24 14:21:35,529;END  
2021-02-24 14:57:05,600;START
2021-02-24 14:57:06,928;END  
2021-02-24 15:46:45,894;START
2021-02-24 15:46:46,762;END  
2021-02-24 17:49:20,925;START
2021-02-24 17:49:26,243;END  
2021-02-24 18:32:18,166;START
2021-02-24 18:32:18,969;END  

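I need to create another file in this format, made up of 3 columns: START (line 1 of the big file); END (line 2 of the big file); duration (the difference, reported in seconds):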

Outputfile.txt:

2021-02-24 14:21:34,630;2021-02-24 14:21:35,529;0,899
2021-02-24 14:57:05,600;2021-02-24 14:57:06,928;1,328

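And so on for the whole file. Can anyone help me? How can I set this up with a bash script? It would be great if someone could also explain it to me :D

Thanks in advance for any support.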

Answer 1

I suggest the following, using GNU awk:

awk -F'[,;]' -v OFS=';' '
  # odd lines (START)
  NR%2 == 1 {
    # a: full START timestamp (date and milliseconds)
    a = $1","$2
    # d1: date with - and : replaced by spaces, the format mktime() expects
    d1 = gensub(/[-:]/," ","g",$1)
    # m1: milliseconds
    m1 = $2
  }
  # even lines (END)
  NR%2 == 0 {
    # same as above, for the END timestamp
    b = $1","$2
    d2 = gensub(/[-:]/," ","g",$1)
    m2 = $2
    # c and d: END and START as seconds.milliseconds
    c = mktime(d2)"."m2
    d = mktime(d1)"."m1
    # print START;END;duration
    print a, b, c-d
  }' file

Output:

2021-02-24 14:21:34,630;2021-02-24 14:21:35,529;0.899
2021-02-24 14:57:05,600;2021-02-24 14:57:06,928;1.328
2021-02-24 15:46:45,894;2021-02-24 15:46:46,762;0.868
2021-02-24 17:49:20,925;2021-02-24 17:49:26,243;5.318
2021-02-24 18:32:18,166;2021-02-24 18:32:18,969;0.803
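
The question asks for a comma as the decimal separator (0,899), while the command above prints a dot. As a rough sketch of one way to get the requested format, the variant below keeps everything in integer milliseconds and sticks to POSIX awk features rather than the GNU-only gensub and mktime. The names pair_durations and ms_of_day are made up for illustration, and the sketch assumes each END follows its START by less than 24 hours.

```shell
#!/bin/bash
# Sketch: pair START/END lines and print the duration with a comma
# as the decimal separator. Uses POSIX awk only (no gensub/mktime).
# Assumption: each END follows its START by less than 24 hours.
pair_durations() {
    awk -F';' '
    # Convert "YYYY-MM-DD HH:MM:SS,mmm" to milliseconds since midnight
    function ms_of_day(ts,    p) {
        split(ts, p, /[ :,]/)      # p[2]=HH p[3]=MM p[4]=SS p[5]=mmm
        return ((p[2] * 60 + p[3]) * 60 + p[4]) * 1000 + p[5]
    }
    # Remember the START timestamp and wait for the matching END
    $2 ~ /^START/ { start = $1; next }
    $2 ~ /^END/ {
        diff = ms_of_day($1) - ms_of_day(start)
        if (diff < 0) diff += 86400000   # END falls on the next day
        printf "%s;%s;%d,%03d\n", start, $1, int(diff / 1000), diff % 1000
    }' "$1"
}
```

It could be called as pair_durations Bigfile.txt > Outputfile.txt.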

Answer 2

A bash approach. You need bc installed for it to work.

#!/bin/bash

# For each line of the file
while read -r line; do
    if [ "${line#*;}" = "START" ]; then
        # START record: save the whole line minus the START token
        whole_start="${line%;*}"

        # Extract the seconds part of the start time (sec,ms)
        start_time="${whole_start##*:}"

        # Swap , for . so bc can parse it
        start_time="${start_time/,/.}"

        # ... and move on to the next line
        continue

    else
        # END record: do the same
        whole_end="${line%;*}"
        end_time="${whole_end##*:}"
        end_time="${end_time/,/.}"
    fi

    # Compute the difference and add the leading 0 if it is missing
    # Note: only the seconds field is subtracted, so START and END
    # must fall within the same minute
    ms_diff=$(echo "scale=3; $end_time - $start_time" | bc | sed '/^\./ s/.*$/0&/')

    echo "$whole_start;$whole_end;$ms_diff"
done < ./your_file.csv

Redirect the output to a file and you will get what you want:

2021-02-24 14:21:34,630;2021-02-24 14:21:35,529;0.899
2021-02-24 14:57:05,600;2021-02-24 14:57:06,928;1.328
2021-02-24 15:46:45,894;2021-02-24 15:46:46,762;0.868
2021-02-24 17:49:20,925;2021-02-24 17:49:26,243;5.318
2021-02-24 18:32:18,166;2021-02-24 18:32:18,969;0.803
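
For pairs that cross a minute boundary, one possible workaround is to convert each full timestamp to epoch milliseconds before subtracting. The following is only a sketch under some assumptions: it relies on GNU date (the -d option and the %N format are GNU extensions), it treats the timestamps as UTC via -u so daylight saving changes are ignored, and the function names to_epoch_ms and pair_durations_date are hypothetical.

```shell
#!/bin/bash
# Sketch assuming GNU date (-d and %N are GNU extensions).

# Convert "YYYY-MM-DD HH:MM:SS,mmm" to milliseconds since the epoch.
# Assumption: timestamps are interpreted as UTC (-u).
to_epoch_ms() {
    date -u -d "${1/,/.}" +%s%3N
}

pair_durations_date() {
    local line stamp start_stamp diff
    while read -r line; do
        # Timestamp is everything before the last ;
        stamp="${line%;*}"
        if [ "${line#*;}" = "START" ]; then
            start_stamp="$stamp"
            continue
        fi
        # Integer difference in milliseconds, so bc is not needed
        diff=$(( $(to_epoch_ms "$stamp") - $(to_epoch_ms "$start_stamp") ))
        printf '%s;%s;%d.%03d\n' "$start_stamp" "$stamp" $((diff / 1000)) $((diff % 1000))
    done < "$1"
}
```

It could be called as pair_durations_date ./your_file.csv. Note that it starts one date process per line, which may be slow on a very large file.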
