Min/max/avg transaction duration calculation is missing the ID of the shortest transaction in the output

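I have a log file that I must parse using Unix commands.

I need to compute the time difference between lines, and at the end I need to display the MIN, MAX and AVG time between transactions, along with the ID number of the MIN.

My script does everything I wrote it to do, except for the ID number of the MIN, and I don't understand why.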

  • Sample log file:
    03/22 08:51:01.050 INFO :1000 :.main: *************** RSVP Agent started ***************
    03/22 08:51:01.532 INFO :1001 :...locate_configFile: Specified configuration file: /u/user10/rsvpd1.conf WARNING
    03/22 08:51:01.405 INFO :1002 :.main: Using log level 511
    03/22 08:51:01.970 INFO :1003 :..settcpimage: Get TCP images rc - EDC8112I Operation not supported on socket.
    03/22 08:51:01.837 INFO :1004 :..settcpimage: Associate with TCP/IP image name = TCPCS
    03/22 08:51:02.100 INFO :1005 :..reg_process: registering WARNING process with the system
    03/22 08:51:02.524 INFO :1006 :..reg_process: attempt OS/390 registration
    03/22 08:51:02.748 INFO :1007 :..reg_process: return from registration rc=0
    03/22 08:51:06.624 TRACE :1008 :.....starting_transaction: calling API: status: START
    03/22 08:51:06.123 INFO :1009 :...read_physical_netif: index #0, interface VLINK1 has address 129.1.1.1, ifidx 0
    03/22 08:51:06.524 INFO :1010 :...read_physical_netif: index #1, interface TR1 has address 9.37.65.139, ifidx 1
    03/22 08:51:06.367 INFO :1011 :...read_physical_netif: index #2, interface LINK11 has address 9.67.100.1, ifidx 2
    03/22 08:51:06.748 INFO :1012 :...read_physical_netif: index #3, interface LINK12 has address 9.67.101.1, ifidx 3
    03/22 08:51:06.965 INFO :1013 :...read_physical_netif: index #4, interface CTCD0 has address 9.67.116.98, ifidx 4
    03/22 08:51:06.010 INFO :1014 :...read_physical_netif: index #5, interface CTCD2 has address 9.67.117.98, ifidx 5
    03/22 08:51:06.050 INFO :1015 :...read_physical_netif: index #6, interface LOOPBACK has address 127.0.0.1, ifidx 0
    03/22 08:51:06.100 INFO :1016 :....mailslot_create: creating mailslot for timer
    03/22 08:51:06.724 INFO :1017 :.....ending_transaction: calling API: status: END
    03/22 08:51:06.970 INFO :1018 :.....mailslot_create: creating mailslot for RSVP
    03/22 08:51:06.160 INFO :1019 :....mailbox_register: mailbox allocated for rsvp
    
  • My script:
    for i in log-file.txt
    do
      cat log-file.txt | grep -E "starting_transaction|ending_transaction" >> transactions.txt | awk '{print $2}' <transactions.txt >global-time.txt
    
      awk -F: '{ print ($1 * 3600) + ($2 * 60) + $3 }' <global-time.txt >seconds-time.txt
    
      awk 'NR > 1 { print $0 - prev } { prev = $0 }' <seconds-time.txt >difference-time.txt
    
      awk '{print $4}' <transactions.txt >trans-id.txt | paste difference-time.txt trans-id.txt > diff-transid.txt
    
      awk '{if(min==""){min=max=$1 $2}; if($1>max) {max=$1 $2}; if($1<min) {min=$1 $2}; total+=$1; count+=1} END {print "avg " total/count," | max " max," | min " min " | minID " $2}' <diff-transid.txt >final-answer.txt
    
    done
    
  • The result I get:
    avg 11.1467  | max 99.1  | min 0.1 | minID
    
  • The result I need:
    avg 11.1467  | max 99.1  | min 0.1 | minID 1017
    

Answer 1

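What you want to achieve can be done entirely within an awk script, which is far more efficient than using a shell loop for text processing. I would recommend the following program (let's call it analyze_timing.awk):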

#!/usr/bin/awk -f

function timediff(start,end,    stfld,endfld,diff) {
    split(start,stfld, /:/)
    split(end,  endfld,/:/)

    if (endfld[1]<stfld[1]) {
        diff=(3600*(endfld[1]+24) + 60*endfld[2] + endfld[3])
    }
    else {
        diff=(3600*endfld[1] + 60*endfld[2] + endfld[3])
    }

    diff -= (3600*stfld[1] + 60*stfld[2] + stfld[3])
    return diff
}


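# Remember the timestamp of the most recent transaction start
$5 ~ /^:\.+starting_transaction/ {laststart=$2;next}

# On a transaction end, compute the duration and update the statistics
$5 ~ /^:\.+ending_transaction/ {
    n_transact++
    duration=timediff(laststart, $2)
    avg+=duration    # running total; divided by n_transact in END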
    
    if (n_transact==1) {
        shortest=duration
        longest=duration
        min_id=substr($4,2)
    }
    else {
        if (duration<shortest) {
            shortest=duration
            min_id=substr($4,2)
        } else if (duration>longest) {
            longest=duration
        }
    }
}

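END {
    # Only print a summary if at least one transaction ended (avoids division by zero)
    if (n_transact > 0)
        printf("avg: %f | max: %f | min: %f | minID: %d\n", avg/n_transact, longest, shortest, min_id)
}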

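This first defines a function timediff() that computes the time elapsed between two timestamps in the format shown in your example. For simplicity, it assumes that a transaction takes less than 24 hours.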
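As a quick check of timediff() outside the full script, the sketch below wraps the same logic in a small shell helper and feeds it two pairs of timestamps: the pair from the sample log, and a made-up pair that crosses midnight. The helper name elapsed and the %.3f output format are assumptions for illustration, not part of the program above.

```shell
# Hypothetical helper: prints the seconds elapsed between two HH:MM:SS.mmm stamps.
elapsed() {
    awk -v start="$1" -v end="$2" '
        # Same logic as timediff() in analyze_timing.awk
        function timediff(start, end,    stfld, endfld, diff) {
            split(start, stfld,  /:/)
            split(end,   endfld, /:/)
            # End hour smaller than start hour: assume the clock wrapped past midnight
            if (endfld[1] < stfld[1]) endfld[1] += 24
            diff  = 3600*endfld[1] + 60*endfld[2] + endfld[3]
            diff -= 3600*stfld[1]  + 60*stfld[2]  + stfld[3]
            return diff
        }
        BEGIN { printf "%.3f\n", timediff(start, end) }'
}

elapsed 08:51:06.624 08:51:06.724   # IDs 1008 -> 1017 in the sample log, prints 0.100
elapsed 23:59:59.500 00:00:00.250   # made-up stamps crossing midnight, prints 0.750
```

The second call shows why the 24-hour assumption matters: the function can only tell that midnight was crossed once, because the log carries no usable date in the time field.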

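Then it checks whether the 5th field of a line starts with a : followed by any number of . and then starting_transaction, and if so records the time in the variable laststart. If the 5th field similarly starts with ending_transaction, it computes the difference to laststart and updates the variables used to calculate min/max/avg. If this is the shortest transaction so far, its ID is recorded in min_id.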
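To see what those field tests operate on, the sketch below pipes the ending_transaction line from the sample log through the same pattern. With awk's default whitespace splitting, $2 is the timestamp, $4 is the ID with a leading colon, and $5 is the function name with its leading colon and dots. The names line and end_fields are hypothetical and only used for illustration.

```shell
line='03/22 08:51:06.724 INFO :1017 :.....ending_transaction: calling API: status: END'

# Hypothetical helper: for matching lines, print the timestamp and the ID
# with its leading colon stripped by substr($4, 2)
end_fields() {
    awk '$5 ~ /^:\.+ending_transaction/ { print $2, substr($4, 2) }'
}

printf '%s\n' "$line" | end_fields
# prints: 08:51:06.724 1017
```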

Finally, the program prints the summary as required.

You would call it as

awk -f analyze_timing.awk log-file.txt
