Cisco ISR4431 router randomly reboots

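I have a Cisco ISR4431 Internet edge router that randomly reboots roughly every 5 days. After each reboot it takes 10 to 60 minutes before it recovers and traffic flows normally again. It runs BGP and routes a /19 and a /20 network, so the load should be relatively light for this class of device.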

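The only suspicious thing I can see is that 94% of memory is consumed, so I suspect it is holding more BGP routes than it should, even though the same configuration ran for years on the old router without ever becoming unstable. I'm not sure how to diagnose the problem further, or whether this is a hardware problem or a configuration problem.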

Unfortunately, the router is on the other side of the country, and I can't get to it in person until the quarantine ends.

sh ver:
Cisco IOS XE Software, Version 03.16.04b.S - Extended Support Release
Cisco IOS Software, ISR Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 15.5(3)S4b, RELEASE SOFTWARE (fc1)

sh logging
*Apr 28 14:47:09.074: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/2, changed state to up
*Apr 28 14:47:10.074: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0/2, changed state to up
*Apr 28 14:50:12.834: %PLATFORM-4-ELEMENT_WARNING:smand:  RP/0: Committed Memory value 94% exceeds warning level 90%
*Apr 28 14:52:00.253: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'fman stats bipc' took 685 msec (runtime: 256 msec) to process a 'tdl_qfpmib_throughput_data' message
*Apr 28 15:00:14.511: %PLATFORM-4-ELEMENT_WARNING:smand:  RP/0: Committed Memory value 94% exceeds warning level 90%

sh processes cpu sorted
CPU utilization for five seconds: 13%/0%; one minute: 3%; five minutes: 3%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process 
 193      230311        5004      46025 12.39%  1.63%  1.22%   0 BGP Scanner      
 117       22772      228335         99  0.15%  0.10%  0.10%   0 IOSXE-RP Punt Se 
 240       31843     1902016         16  0.07%  0.14%  0.15%   0 Inline Power     
 414        2694       20294        132  0.07%  0.00%  0.00%   0 NTP              
 284       18520      605984         30  0.07%  0.09%  0.08%   0 HTTP CORE        

The BGP portion of the configuration looks like this:

router bgp 7835
 no bgp log-neighbor-changes
 neighbor ZZ.ZZ.6.113 remote-as XXX
 neighbor ZZ.ZZ.6.113 password XXXXXX
 !
 address-family ipv4
  network XX.XX.160.0 mask 255.255.240.0
  network YY.YY.64.0 mask 255.255.224.0
  network YY.YY.79.0
  neighbor ZZ.ZZ.6.113 activate
  neighbor ZZ.ZZ.6.113 soft-reconfiguration inbound
  neighbor ZZ.ZZ.6.113 filter-list 1 out
 exit-address-family
!

Some further diagnostics:

sh platform resources
**State Acronym: H - Healthy, W - Warning, C - Critical                                             
Resource                 Usage                 Max             Warning         Critical        State
----------------------------------------------------------------------------------------------------
RP0 (ok, active)                                                                               C    
 Control Processor       32.12%                100%            90%             95%             H    
  DRAM                   3849MB(99%)           3872MB          90%             95%             C    
ESP0(ok, active)                                                                               H    
 QFP                                                                                           H    
  DRAM                   1663176KB(79%)        2097152KB       80%             90%             H    
  IRAM                   0KB(0%)               0KB             80%             90%             H    

Memory:

show processes memory sorted
Processor Pool Total: 1688347248 Used: 1417980160 Free:  270367088
 lsmpi_io Pool Total:    6295128 Used:    6294296 Free:        832

 PID TTY  Allocated      Freed    Holding    Getbufs    Retbufs Process
 510   0  904032136   54730248  901424352          0          0 BGP Router      
 271   0  257116280    1297600  256693920          0          0 IP RIB Update   
   0   0  352326368  108678280  227122576          0          0 *Init*          
  79   0    8209072      12176    7592984          0          0 IOSD ipc task   
 389   0    3889024       5160    3925856     799092          0 EEM ED Syslog   
 409   0    1439256      26792    1442328          0          0 EEM Server      
 155   0    3223184      91024    1057808          0          0 CWAN OIR Handler
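As a rough cross-check of the numbers above, here is a small Python sketch that works out what share of the IOSd processor pool each of the top three processes is holding. The byte counts are copied by hand from the output above; the helper name `share` is only illustrative and nothing here runs on the router itself.

```python
# Figures copied from the 'show processes memory sorted' output above (bytes).
POOL_TOTAL = 1688347248
POOL_USED = 1417980160

# "Holding" column for the three largest consumers.
holding = {
    "BGP Router": 901424352,
    "IP RIB Update": 256693920,
    "*Init*": 227122576,
}


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    for name, held in holding.items():
        mib = held / 2**20  # bytes to MiB
        print(f"{name:<15} {mib:8.1f} MiB  {share(held, POOL_TOTAL):5.1f}% of pool")
    print(f"Pool used overall: {share(POOL_USED, POOL_TOTAL)}%")
```

By this arithmetic the BGP Router process alone holds a little over half of the processor pool, which is what makes me suspect the BGP table rather than the hardware.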
