Extracting trace information with gawk

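I am using gawk to extract trace information from an mrt file so that I can analyze it further. I have already managed to extract the trace information from the pcap file format, but I cannot work out how to do it for the mrt format. Let me first explain what I want to extract by showing an example in pcap format.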

My pcap input file is:

No.     Time        Source                Destination           Protocol Length User Datagram Protocol Info
  1 0.000000    2001:4958:10:2::2     2001:4958:10:2::3     BGP      143                           UPDATE Message
Frame 1: 143 bytes on wire (1144 bits), 143 bytes captured (1144 bits)
Ethernet II, Src: JuniperN_36:98:52 (5c:5e:ab:36:98:52), Dst: JuniperN_3e:bf:49 (78:19:f7:3e:bf:49)
Internet Protocol Version 6, Src: 2001:4958:10:2::2 (2001:4958:10:2::2), Dst: 2001:4958:10:2::3 (2001:4958:10:2::3)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 56797 (56797), Seq: 1, Ack: 1, Len: 37
Border Gateway Protocol
No.     Time        Source                Destination           Protocol Length User Datagram Protocol Info
  2 0.326625    2001:4958:10:2::2     2001:4958:10:2::3     BGP      184                           UPDATE Message
Frame 2: 184 bytes on wire (1472 bits), 184 bytes captured (1472 bits)
Ethernet II, Src: JuniperN_36:98:52 (5c:5e:ab:36:98:52), Dst: JuniperN_3e:bf:49 (78:19:f7:3e:bf:49)
Internet Protocol Version 6, Src: 2001:4958:10:2::2 (2001:4958:10:2::2), Dst: 2001:4958:10:2::3 (2001:4958:10:2::3)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 56797 (56797), Seq: 38, Ack: 1, Len: 78
Border Gateway Protocol
No.     Time        Source                Destination           Protocol Length User Datagram Protocol Info
  3 1.178114    2001:4958:10:2::2     2001:4958:10:2::3     TCP      106                           bgp > 56797 [ACK] Seq=116 Ack=20 Win=16384 Len=0 TSval=3269200636 TSecr=371929488
Frame 3: 106 bytes on wire (848 bits), 106 bytes captured (848 bits)
Ethernet II, Src: JuniperN_36:98:52 (5c:5e:ab:36:98:52), Dst: JuniperN_3e:bf:49 (78:19:f7:3e:bf:49)
Internet Protocol Version 6, Src: 2001:4958:10:2::2 (2001:4958:10:2::2), Dst: 2001:4958:10:2::3 (2001:4958:10:2::3)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 56797 (56797), Seq: 116, Ack: 20, Len: 0
No.     Time        Source                Destination           Protocol Length User Datagram Protocol Info
  4 2.410144    64.251.87.209         64.251.87.210         BGP      228                           UPDATE Message, UPDATE Message
Frame 4: 228 bytes on wire (1824 bits), 228 bytes captured (1824 bits)
Ethernet II, Src: Cisco_e7:a1:c0 (00:1b:0d:e7:a1:c0), Dst: JuniperN_3e:ba:bd (78:19:f7:3e:ba:bd)
Internet Protocol Version 4, Src: 64.251.87.209 (64.251.87.209), Dst: 64.251.87.210 (64.251.87.210)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 65502 (65502), Seq: 1, Ack: 1, Len: 154
Border Gateway Protocol
Border Gateway Protocol
No.     Time        Source                Destination           Protocol Length User Datagram Protocol Info
  5 3.467853    206.47.102.206        206.47.102.201        BGP      105                           KEEPALIVE Message
Frame 5: 105 bytes on wire (840 bits), 105 bytes captured (840 bits)
Ethernet II, Src: JuniperN_36:98:52 (5c:5e:ab:36:98:52), Dst: JuniperN_3e:bf:49 (78:19:f7:3e:bf:49)
Internet Protocol Version 4, Src: 206.47.102.206 (206.47.102.206), Dst: 206.47.102.201 (206.47.102.201)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 55700 (55700), Seq: 1, Ack: 1, Len: 19
Border Gateway Protocol

I want to extract the following fields from the file:

  • Time
  • Source
  • Destination
  • Protocol
  • Length

I created a simple script:

{

if($1 ~ /[0-9]/)
    {
        print $2,$3,$4,$5,$6
    }
}

and ran it on the input trace (gawk -f script.txt input-pcap.txt >> pcap-out.txt), which gave the following:

0.000000 2001:4958:10:2::2 2001:4958:10:2::3 BGP 143
0.326625 2001:4958:10:2::2 2001:4958:10:2::3 BGP 184
1.178114 2001:4958:10:2::2 2001:4958:10:2::3 TCP 106
2.410144 64.251.87.209 64.251.87.210 BGP 228
3.467853 206.47.102.206 206.47.102.201 BGP 105

Now I want to do the same for the mrt input format. The input file looks like this:

TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 0
PREFIX: 0.0.0.0/0
FROM:96.4.0.55 AS11686
ORIGINATED: 10/24/07 06:26:23
ORIGIN: IGP
ASPATH: 11686 3561
NEXT_HOP: 96.4.0.55
STATUS: 0x1

TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 1
PREFIX: 0.0.0.0/0
FROM:213.140.32.148 AS12956
ORIGINATED: 10/24/07 06:26:16
ORIGIN: IGP
ASPATH: 12956
NEXT_HOP: 213.140.32.148
STATUS: 0x1

TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 2
PREFIX: 3.0.0.0/8
FROM:207.45.223.244 AS6453
ORIGINATED: 10/31/07 07:37:39
ORIGIN: IGP
ASPATH: 6453 701 703 80
NEXT_HOP: 207.45.223.244
STATUS: 0x1

TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 3
PREFIX: 3.0.0.0/8
FROM:195.219.96.239 AS6453
ORIGINATED: 10/31/07 07:49:07
ORIGIN: IGP
ASPATH: 6453 701 703 80
NEXT_HOP: 195.219.96.239
STATUS: 0x1

TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 4
PREFIX: 3.0.0.0/8
FROM:129.250.0.11 AS2914
ORIGINATED: 10/31/07 06:09:07
ORIGIN: IGP
ASPATH: 2914 701 703 80
NEXT_HOP: 129.250.0.11
MULTI_EXIT_DISC: 6
COMMUNITY: 2914:420 2914:2000 2914:3000 65504:701
STATUS: 0x1   

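I would like to extract the following information in the same way as for the pcap format, so that the analysis software can read it:

  • Time (difference from the first packet's time field, in seconds)
  • Sender (6th field, IP address only)
  • Next_hop (10th field)
  • Protocol/origin (8th field)

so that the output looks like: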

0.0000 96.4.0.55 96.4.0.55 IGP
0.0000 213.140.32.148 213.140.32.148 IGP
0.0000 207.45.223.244 207.45.223.244 IGP
0.0000 195.219.96.239 195.219.96.239 IGP

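Answer 1

The gawk script below does this. A few points to consider:

  • Because your input dates do not show the century, you need to consider how that format relates to date's default format.

  • The sample output you provided shows fractional times, but your description says seconds. The data has one-second resolution, so this script outputs integer seconds.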


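gawk '
# Convert a date string to epoch seconds by calling the external date command.
function d2s(date,    cmd, secs) {
    cmd = "date -d \"" date "\" +%s"
    cmd | getline secs
    close(cmd)
    return secs + 0                      # numeric conversion drops the trailing newline
}
BEGIN { RS = "\n *\n"; FS = "\n" }       # one blank-line-separated block per record, one line per field
{
    delete o
    for (i = 1; i <= NF; i++) {
        p = index($i, ":")               # split "KEY: value" at the first colon
        if (!p) continue
        key = substr($i, 1, p - 1)
        val = substr($i, p + 1)
        sub(/^ +/, "", val)              # the space after the colon is optional (e.g. "FROM:96.4.0.55 AS11686")
        split(val, v, " ")
        if (key == "TIME")     o[1] = v[1] " " v[2]
        if (key == "FROM")     o[2] = v[1]
        if (key == "NEXT_HOP") o[3] = v[1]
        if (key == "ORIGIN")   o[4] = v[1]
    }
    if (o[1]) {
        t = d2s(o[1])
        if (!seen) { first = t; seen = 1 }   # remember the time of the first record
        printf("%s %s %s %s\n", t - first, o[2], o[3], o[4])
    }
}' "$f"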

Output (the input times were changed from those posted, which were all identical):

0 96.4.0.55 96.4.0.55 IGP
1 213.140.32.148 213.140.32.148 IGP
3 207.45.223.244 207.45.223.244 IGP
7 195.219.96.239 195.219.96.239 IGP
15 129.250.0.11 129.250.0.11 IGP
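
Spawning date once per record can be slow on large dumps. As a rough sketch of an alternative, the timestamp can be converted with the built-in mktime() function instead. The wrapper name mrt_trace and the AWK override variable are hypothetical, introduced only for this illustration. The century rule is an assumption borrowed from the convention GNU date uses for two-digit years (00-68 become 20YY, 69-99 become 19YY).

```shell
# mrt_trace is a hypothetical wrapper; AWK optionally names another awk that provides mktime().
mrt_trace() {
    "${AWK:-gawk}" '
    # Convert "MM/DD/YY HH:MM:SS" to epoch seconds with the built-in mktime().
    # Assumption: two-digit years 00-68 mean 20YY and 69-99 mean 19YY.
    function d2s(stamp,    p, year) {
        split(stamp, p, "[/ :]")             # p = month, day, year, hour, minute, second
        year = (p[3] + 0 < 69 ? 2000 : 1900) + p[3]
        return mktime(sprintf("%d %d %d %d %d %d", year, p[1], p[2], p[4], p[5], p[6]))
    }
    BEGIN { RS = "\n *\n"; FS = "\n" }       # one block per record, one line per field
    {
        time = from = hop = origin = ""
        for (i = 1; i <= NF; i++) {
            c = index($i, ":")               # split "KEY: value" at the first colon
            if (!c) continue
            key = substr($i, 1, c - 1)
            val = substr($i, c + 1)
            sub(/^ +/, "", val)              # the space after the colon is optional
            split(val, v, " ")
            if (key == "TIME")     time   = v[1] " " v[2]
            if (key == "FROM")     from   = v[1]
            if (key == "NEXT_HOP") hop    = v[1]
            if (key == "ORIGIN")   origin = v[1]
        }
        if (time == "") next                 # skip blocks without a TIME line
        t = d2s(time)
        if (!seen++) first = t               # remember the time of the first record
        printf "%d %s %s %s\n", t - first, from, hop, origin
    }' "$@"
}
```

It would be called as mrt_trace input-mrt.txt. Because mktime() interprets the timestamp in the local time zone, differences between records stay correct as long as the dump does not span a daylight-saving change.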
