Find the difference between the latest and earliest time for the same variable?

I have an fb.csv file that looks like this.

"Source","Time"  
"192.168.137.174","12:26:25"
"10.0.138.163","12:26:25"
"157.240.10.13","12:26:36"
"157.240.10.13","12:26:36"
"157.240.10.23","12:26:41"
"157.240.10.23","12:26:41"
"10.0.138.163","12:26:52"
"192.168.137.174","12:26:52"
"157.240.10.18","12:26:52"
"157.240.10.18","12:26:52"
"157.240.10.23","12:26:53"
"157.240.10.23","12:26:53"
"192.168.137.174","12:27:02"
"10.0.138.163","12:27:02"
"192.168.137.174","12:27:07"

I want to find, for each "Source", the difference between its latest and earliest time.

Expected output:

"Source","Duration Time"  
"192.168.137.174","00:01:22"
"10.0.138.163","00:01:17"
"157.240.10.13","00:00:00"
"157.240.10.23","00:00:00"
"157.240.10.18","00:00:00"

Is there a way to do this? Thanks.

Answer 1

It's me again, the guy with the very long awk one-liners... and this one is even longer:

awk -F, 'BEGIN{print"\"Source\",\"Duration Time\""}NR>1{gsub(/"/,"",$2);split($2,hms,":");s=hms[1]*3600+hms[2]*60+hms[3];if(!(($1,"MAX")in a)||a[$1,"MAX"]<s)a[$1,"MAX"]=s;if(!(($1,"MIN")in a)||a[$1,"MIN"]>s)a[$1,"MIN"]=s}END{for(idx in a){split(idx,ipm,SUBSEP);if(ipm[2]=="MAX"){d=a[idx]-a[ipm[1],"MIN"];h=int(d/3600);m=int((d-h*3600)/60);s=d%60;printf("%s,\"%02d:%02d:%02d\"\n",ipm[1],h,m,s)}}}' fb.csv 

With the input file fb.csv given in the question, the output looks like this:

"Source","Duration Time"
"157.240.10.23","00:00:12"
"157.240.10.18","00:00:00"
"157.240.10.13","00:00:00"
"10.0.138.163","00:00:37"
"192.168.137.174","00:00:42"

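The rows above appear in a different order from the input because awk's for (idx in a) loop visits array keys in an unspecified order. If first-appearance order matters, one possible variant is sketched below. It is not the original one-liner: the function name duration_by_source and the arrays min, max and order are illustrative names chosen here, and it assumes the same two-column quoted CSV layout as fb.csv.

```shell
# Sketch: per-source duration, printed in order of first appearance.
# Reads the files given as arguments, or standard input if none are given.
# Usage: duration_by_source fb.csv
duration_by_source() {
    awk -F, '
    NR > 1 {
        gsub(/"/, "", $2)                        # strip quotes from the time
        split($2, hms, ":")
        s = hms[1]*3600 + hms[2]*60 + hms[3]     # seconds since midnight
        if (!($1 in max)) {                      # first time we see this source
            order[++n] = $1                      # remember appearance order
            min[$1] = max[$1] = s
        }
        if (s > max[$1]) max[$1] = s
        if (s < min[$1]) min[$1] = s
    }
    END {
        print "\"Source\",\"Duration Time\""
        for (i = 1; i <= n; i++) {
            d = max[order[i]] - min[order[i]]
            printf "%s,\"%02d:%02d:%02d\"\n", order[i], int(d/3600), int(d%3600/60), d%60
        }
    }' "$@"
}
```

Using two plain arrays plus an index array avoids the SUBSEP key splitting in the END block, at the cost of a little extra bookkeeping per line.
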
Explanation of the command:

Here we run awk like this, setting the field separator that delimits the columns to a comma and using the file fb.csv as input:

awk -F, '<COMMAND>' fb.csv

Properly formatted, the awk command (the <COMMAND> placeholder above) looks like this:

BEGIN {
    print "\"Source\",\"Duration Time\""
}
NR>1 {
    gsub(/"/, "", $2)
    split($2, hms, ":")
    s = hms[1]*3600 + hms[2]*60 + hms[3]
    if ( !(($1,"MAX") in a) || a[$1,"MAX"] < s )
        a[$1,"MAX"] = s
    if ( !(($1,"MIN") in a) || a[$1,"MIN"] > s )
        a[$1,"MIN"] = s
}
END {
    for (idx in a) {
        split(idx, ipm, SUBSEP)
        if (ipm[2]=="MAX") {
            d = a[idx] - a[ipm[1],"MIN"]
            h = int(d / 3600)
            m = int((d - h * 3600) / 60)
            s = d%60
            printf("%s,\"%02d:%02d:%02d\"\n", ipm[1] ,h ,m ,s)
        }
    }
}
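
  • The BEGIN block simply prints the new CSV header.

  • The NR>1 block runs once for each line of the input file except the first, which contains the header. Each line is split into an IP column ($1) and a time column ($2).

    We process the time column by stripping its quotes with gsub and splitting it at the colons into the array hms, which then holds the hours, minutes and seconds. These are used to convert the timestamp into seconds since midnight, stored in s within this block.

    Next, we check whether the associative array a has no entry yet for the current line's IP, or whether the stored MAX is smaller or the stored MIN is larger than s. In either case the entry is updated accordingly.

  • Finally, the END block evaluates the array built this way: for each IP in it, the difference between the MAX and MIN timestamps is computed and saved as d, which is then converted back to hours, minutes and seconds and printed in the correct format.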
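
The two conversions described above (timestamp to seconds since midnight, and a difference in seconds back to HH:MM:SS) can be tried in isolation. The following is only a minimal sketch: the helper names hms_to_seconds and seconds_to_hms are made up for this illustration and do not appear in the command itself.

```shell
# Convert HH:MM:SS to seconds since midnight (same arithmetic as in the NR>1 block).
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1*3600 + $2*60 + $3 }'
}

# Convert a number of seconds back to HH:MM:SS (same idea as in the END block).
seconds_to_hms() {
    awk -v d="$1" 'BEGIN { printf "%02d:%02d:%02d\n", int(d/3600), int(d%3600/60), d%60 }'
}

# Difference between the last and first timestamp of 192.168.137.174.
first=$(hms_to_seconds "12:26:25")
last=$(hms_to_seconds "12:27:07")
seconds_to_hms $((last - first))    # prints 00:00:42
```

The result matches the row for "192.168.137.174" in the output shown earlier.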
