Multicast (PIM-SM) causes high CPU utilization on Cisco WS-4900M

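Good evening. My network has several Cisco routers on the backbone (WS-4900M, plus a 7606 as the main core). Various L3 switches connect to them as client access routers and receive the IGMP messages for IPTV multicast traffic. It is a simple design: OSPF + PIM-SM, with the 7606 as the rendezvous point (directly connected to the multicast source via MSDP + MBGP). [Network diagram][1]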

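All clients receive the multicast traffic, but on one router (Core2, see the diagram) CPU utilization is very high (80-90% in the evening), almost all of it in the Cat4k Mgmt LoPri process. On the other routers it is below 10%. When I run debug platform packet all buffer, I get messages like the following: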

 Index 2:
54 days 15:56:12:890689 - RxVlan: 1013, RxPort: Te1/8
Priority: High, Tag: No Tag, Event: L3 Forward, Flags: 0x40, Size: 1362
Eth: Src 68:EF:BD:B5:F1:BF Dst 01:00:5E:7F:05:6A Type/Len 0x0800
Ip: ver:IpVersion4 len:20 tos:192 totLen:1344 id:0 fragOffset:0 ttl:19 proto:udp
    src: 172.16.255.2 dst: 239.255.5.106 firstFragment lastFragment
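
As a sanity check on the capture above, the destination MAC should be the standard Ethernet mapping of the group address (fixed prefix 01:00:5E plus the low 23 bits of the IPv4 group). A minimal sketch in Python; the helper name `multicast_mac` is made up for illustration and is not part of any Cisco tooling:

```python
import ipaddress

def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet MAC (01:00:5E + low 23 bits)."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF      # keep only the low 23 bits of the group
    mac = 0x01005E000000 | low23      # prepend the fixed multicast MAC prefix
    # Emit the six octets from most to least significant.
    return ":".join(f"{(mac >> shift) & 0xFF:02X}" for shift in range(40, -8, -8))

print(multicast_mac("239.255.5.106"))
```

For 239.255.5.106 this gives 01:00:5E:7F:05:6A, which matches the Dst field in the dump, so the punted frame really is the IPTV stream for that group and not some other traffic.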

show ip mroute on Core2:

(*, 239.255.5.106), 01:51:13/00:03:27, RP 172.23.176.252, flags: S
  Incoming interface: TenGigabitEthernet1/8, RPF nbr 172.23.176.69
  Outgoing interface list:
    TenGigabitEthernet1/2, Forward/Sparse, 01:51:13/00:03:27

show ip mroute on Core5:

(*, 239.255.5.106), 01:51:57/00:03:25, RP 172.23.176.252, flags: S
  Incoming interface: Vlan20, RPF nbr 172.23.180.85
  Outgoing interface list:
    TenGigabitEthernet1/8, Forward/Sparse, 01:51:57/00:02:40, H

show ip mroute on Core1:

(*, 239.255.5.106), 06:11:51/00:03:08, RP 172.23.176.252, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Vlan20, Forward/Sparse, 01:52:44/00:03:08

(172.16.255.2, 239.255.5.106), 06:11:51/00:01:36, flags: MT
  Incoming interface: Vlan1532, RPF nbr 10.217.0.130
  Outgoing interface list:
    Vlan20, Forward/Sparse, 01:52:44/00:03:08

So, as you can see, for some reason a lot of multicast ends up in the CPU buffer on Core2, yet Core2 has no (S,G) entry for that stream.

Then I ran debug ip pim and saw two cases. With an (S,G) entry:

    PIM(0): Received v2 Join/Prune on TenGigabitEthernet1/4 from 172.23.177.114, to us
    PIM(0): Join-list: (*, 239.255.2.19), RPT-bit set, WC-bit set, S-bit set
    PIM(0): Update TenGigabitEthernet1/4/172.23.177.114 to (*, 239.255.2.19), Forward state, by PIM *G Join
    PIM(0): Update TenGigabitEthernet1/4/172.23.177.114 to (172.16.2.250, 239.255.2.19), Forward state, by PIM *G Join
    PIM(0): Received v2 Join/Prune on TenGigabitEthernet1/4 from 172.23.177.114, to us
    PIM(0): Join-list: (*, 239.255.2.19), RPT-bit set, WC-bit set, S-bit set
    PIM(0): Update TenGigabitEthernet1/4/172.23.177.114 to (*, 239.255.2.19), Forward state, by PIM *G Join
    PIM(0): Update TenGigabitEthernet1/4/172.23.177.114 to (172.16.2.250, 239.255.2.19), Forward state, by PIM *G Join
    PIM(0): Received v2 Join/Prune on TenGigabitEthernet1/4 from 172.23.177.114, to us
    PIM(0): Join-list: (172.16.2.250/32, 239.255.2.19), S-bit set
    PIM(0): Update TenGigabitEthernet1/4/172.23.177.114 to (172.16.2.250, 239.255.2.19), Forward state, by PIM SG Join
    PIM(0): Received v2 Join/Prune on TenGigabitEthernet1/4 from 172.23.177.114, to us
    PIM(0): Join-list: (172.16.2.250/32, 239.255.2.19), S-bit set
    PIM(0): Update TenGigabitEthernet1/4/172.23.177.114 to (172.16.2.250, 239.255.2.19), Forward state, by PIM SG Join

Without an (S,G) entry:

PIM(0): Received v2 Join/Prune on TenGigabitEthernet1/2 from 172.23.178.130, to us
PIM(0): Join-list: (*, 239.255.40.232), RPT-bit set, WC-bit set, S-bit set
PIM(0): Check RP 172.23.176.252 into the (*, 239.255.40.232) entry
PIM(0): Building Triggered (*,G) Join / (S,G,RP-bit) Prune message for 239.255.40.232
PIM(0): Add TenGigabitEthernet1/2/172.23.178.130 to (*, 239.255.40.232), Forward state, by PIM *G Join
PIM(0): Building Triggered (*,G) Join / (S,G,RP-bit) Prune message for 239.255.40.232
PIM(0): Insert (*,239.255.40.232) join in nbr 172.23.176.69's queue
PIM(0): Received v2 Join/Prune on TenGigabitEthernet1/2 from 172.23.178.130, to us
PIM(0): Join-list: (*, 239.255.40.232), RPT-bit set, WC-bit set, S-bit set
PIM(0): Update TenGigabitEthernet1/2/172.23.178.130 to (*, 239.255.40.232), Forward state, by PIM *G Join
PIM(0): Building Join/Prune packet for nbr 172.23.176.69
PIM(0):  Adding v2 (172.23.176.252/32, 239.255.40.232), WC-bit, RPT-bit, S-bit Join
PIM(0): Send v2 join/prune to 172.23.176.69 (TenGigabitEthernet1/8)

Do you have any ideas about this?

show proc cpu sorted:

CPU utilization for five seconds: 44%/4%; one minute: 43%; five minutes: 44%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process 
  85  1833127889  1262153216       1452 32.63% 32.41% 32.89%   0 Cat4k Mgmt LoPri 
 149   237386465  3486312836          0  3.91%  3.91%  3.97%   0 IP Input         
  84    64265071   309949699        207  2.39%  2.30%  2.35%   0 Cat4k Mgmt HiPri 
  94     1932693      236735       8163  0.55%  0.06%  0.00%   0 Per-minute Jobs  
 316      527359     1146186        460  0.15%  0.02%  0.00%   0 SNMP ENGINE      
 327     2415007    33093035         72  0.15%  0.09%  0.07%   0 OSPF-1 Router    
 160          44     4724582          0  0.07%  0.00%  0.00%   0 Socket Timers    
 158         184       45746          4  0.07%  0.00%  0.00%   0 TCP Timer        
 302     1247740   120242829         10  0.07%  0.05%  0.07%   0 PIM Process      
 303        4553    47307586          0  0.07%  0.01%  0.00%   0 Mwheel Process   
 179          48      944992          0  0.07%  0.00%  0.00%   0 Track            
 280        2044       86069         23  0.07%  0.00%  0.00%   0 Syslog           
 107       51538   602283450          0  0.07%  0.08%  0.08%   0 Ethernet Msec Ti 
 320           0         176          0  0.07%  0.00%  0.00%   0 RADIUS           
 282       63499    51998773          1  0.07%  0.01%  0.00%   0 IGMP Input       
 314      158722     3068574         51  0.07%  0.00%  0.00%   0 IP SNMP          
 156           4          47         85  0.07%  0.00%  0.00%   1 Virtual Exec     
  18       21954      139747        157  0.00%  0.00%  0.00%   0 ARP Input        
  17           0           1          0  0.00%  0.00%  0.00%   0 ifIndex Receive  
  20           0           1          0  0.00%  0.00%  0.00%   0 CEF MIB API      
  16           8          12        666  0.00%  0.00%  0.00%   0 RF Slave Main Th 

show platform health:

K5CpuMan Review       30.00  58.50     30     14  100  500   69  73   76  60363:49



Packets Dropped In Processing by CPU event

Event             Total                5 sec avg 1 min avg 5 min avg 1 hour avg
----------------- -------------------- --------- --------- --------- ----------
Sa Miss                              9         0         0         0          0
L2 Router                            5         0         0         0          0
Input ACl Copy                     245         0         0         0          0

Packets Dropped In Processing by Priority

Priority          Total                5 sec avg 1 min avg 5 min avg 1 hour avg
----------------- -------------------- --------- --------- --------- ----------
Normal                               5         0         0         0          0
Medium                               9         0         0         0          0
High                               245         0         0         0          0

Packets Dropped In Processing by Reason

Reason             Total                5 sec avg 1 min avg 5 min avg 1 hour avg
------------------ -------------------- --------- --------- --------- ----------
STPDrop                               5         0         0         0          0
Tx Mode Drop                        254         0         0         0          0

Total packet queues 64

Packets Received by Packet Queue

Queue                  Total           5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Input ACL fwd(snooping)         1388990         0         0         0          0
Host Learning                        9         0         0         0          0
L2 Control                     1117695         0         0         0          0
Input ACL log, unreach         7630372         1         0         1          0
Ttl Expired                    2383808         0         0         0          0
InputIf Fail                      1956         0         0         0          0
Adj SameIf Fail              184833357        47        27        27         23
Bfd                             187937         0         0         0          0
L2 router to CPU, 7           24908694         4         1         2          3
L3 Glean, 7                      31639         0         0         0          0
L3 Fwd, 7                  11851731684      5051      5683      4686       4430
L3 Receive, 7                  2081299         4         0         0          0

Packets Dropped by Packet Queue

Queue                  Total           5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Ttl Expired                        118         0         0         0          0
Adj SameIf Fail                  50496         0         0         0          0
L3 Glean, 7                       1365         0         0         0          0
L3 Fwd, 7                      1688513         0         0         0          0
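
To make the imbalance in the received-queue table explicit, here is a minimal Python sketch that ranks the queues by their 5-second average. It is only an illustration: the parsing assumes the exact column layout pasted above, and the names `SAMPLE` and `parse_queue_rows` are made up for this example.

```python
# Rows copied from the "Packets Received by Packet Queue" table above
# (only the queues with a non-zero 5 sec avg are included).
SAMPLE = """\
Queue                  Total           5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Input ACL log, unreach         7630372         1         0         1          0
Adj SameIf Fail              184833357        47        27        27         23
L2 router to CPU, 7           24908694         4         1         2          3
L3 Fwd, 7                  11851731684      5051      5683      4686       4430
L3 Receive, 7                  2081299         4         0         0          0
"""

def parse_queue_rows(text):
    """Return (queue, total, avg_5s, avg_1m, avg_5m, avg_1h) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.rsplit(None, 5)  # the last five fields are the counters
        # Header and separator lines fail the all-digits check and are skipped.
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            rows.append((parts[0].strip(), *(int(p) for p in parts[1:])))
    return rows

rows = parse_queue_rows(SAMPLE)
busiest = max(rows, key=lambda r: r[2])          # rank by the 5 sec avg column
share = busiest[2] / sum(r[2] for r in rows)     # fraction of the listed traffic
print(f"{busiest[0]}: 5 sec avg {busiest[2]}, {share:.0%} of the listed queues")
```

On these numbers the "L3 Fwd, 7" queue accounts for nearly all packets reaching the CPU, which is consistent with the "Event: L3 Forward" field in the buffered packet shown earlier.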

UPD. I just noticed that, for some reason, Core2 has an active PIM Tunnel0 interface, while the other routers do not. I don't know how to disable it.

[1]: https://i.stack.imgur.com/sXJlq.png
