I have a Cisco ISR4431 acting as an internet edge router that has been randomly rebooting every 5 days or so. When it reboots, it takes anywhere from 10 to 60 minutes before it is back up and network traffic is flowing normally. It is running BGP and routing for a /19 and a /20 network, so it should be a relatively small load for this class of box.

The only suspicious thing I see is that 94% of the memory is consumed, so I suspect it is holding more BGP routes than it should, though this same config has been working in an older router for years without becoming unstable. I'm not really sure how to diagnose the issue further, and I don't know if this is a hardware or config problem.

Unfortunately the router is on the other side of the country and I have no way of physically getting to it until the quarantine is over.

sh ver:
Cisco IOS XE Software, Version 03.16.04b.S - Extended Support Release
Cisco IOS Software, ISR Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 15.5(3)S4b, RELEASE SOFTWARE (fc1)

sh logging
*Apr 28 14:47:09.074: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/2, changed state to up
*Apr 28 14:47:10.074: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0/2, changed state to up
*Apr 28 14:50:12.834: %PLATFORM-4-ELEMENT_WARNING:smand:  RP/0: Committed Memory value 94% exceeds warning level 90%
*Apr 28 14:52:00.253: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'fman stats bipc' took 685 msec (runtime: 256 msec) to process a 'tdl_qfpmib_throughput_data' message
*Apr 28 15:00:14.511: %PLATFORM-4-ELEMENT_WARNING:smand:  RP/0: Committed Memory value 94% exceeds warning level 90%

sh processes cpu sorted
CPU utilization for five seconds: 13%/0%; one minute: 3%; five minutes: 3%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process 
 193      230311        5004      46025 12.39%  1.63%  1.22%   0 BGP Scanner      
 117       22772      228335         99  0.15%  0.10%  0.10%   0 IOSXE-RP Punt Se 
 240       31843     1902016         16  0.07%  0.14%  0.15%   0 Inline Power     
 414        2694       20294        132  0.07%  0.00%  0.00%   0 NTP              
 284       18520      605984         30  0.07%  0.09%  0.08%   0 HTTP CORE        

The BGP section of the config looks like this:

router bgp 7835
 no bgp log-neighbor-changes
 neighbor ZZ.ZZ.6.113 remote-as XXX
 neighbor ZZ.ZZ.6.113 password XXXXXX
 !
 address-family ipv4
  network XX.XX.160.0 mask 255.255.240.0
  network YY.YY.64.0 mask 255.255.224.0
  network YY.YY.79.0
  neighbor ZZ.ZZ.6.113 activate
  neighbor ZZ.ZZ.6.113 soft-reconfiguration inbound
  neighbor ZZ.ZZ.6.113 filter-list 1 out
 exit-address-family
!

Some further diagnostics:

sh platform resources
**State Acronym: H - Healthy, W - Warning, C - Critical                                             
Resource                 Usage                 Max             Warning         Critical        State
----------------------------------------------------------------------------------------------------
RP0 (ok, active)                                                                               C    
 Control Processor       32.12%                100%            90%             95%             H    
  DRAM                   3849MB(99%)           3872MB          90%             95%             C    
ESP0(ok, active)                                                                               H    
 QFP                                                                                           H    
  DRAM                   1663176KB(79%)        2097152KB       80%             90%             H    
  IRAM                   0KB(0%)               0KB             80%             90%             H    

Memory

show processes memory sorted
Processor Pool Total: 1688347248 Used: 1417980160 Free:  270367088
 lsmpi_io Pool Total:    6295128 Used:    6294296 Free:        832

 PID TTY  Allocated      Freed    Holding    Getbufs    Retbufs Process
 510   0  904032136   54730248  901424352          0          0 BGP Router      
 271   0  257116280    1297600  256693920          0          0 IP RIB Update   
   0   0  352326368  108678280  227122576          0          0 *Init*          
  79   0    8209072      12176    7592984          0          0 IOSD ipc task   
 389   0    3889024       5160    3925856     799092          0 EEM ED Syslog   
 409   0    1439256      26792    1442328          0          0 EEM Server      
 155   0    3223184      91024    1057808          0          0 CWAN OIR Handler
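
To put the numbers above in proportion, here is a rough Python sketch that works out what fraction of the used Processor Pool each process is holding. It is only an illustration run against the pasted text, not something executed on the router; the helper name `holding_share` and the regular expressions are my own assumptions about the column layout shown above (the fifth numeric column being "Holding").

```python
import re

# Lines copied from the "show processes memory sorted" output above.
SAMPLE = """\
Processor Pool Total: 1688347248 Used: 1417980160 Free:  270367088
 PID TTY  Allocated      Freed    Holding    Getbufs    Retbufs Process
 510   0  904032136   54730248  901424352          0          0 BGP Router
 271   0  257116280    1297600  256693920          0          0 IP RIB Update
   0   0  352326368  108678280  227122576          0          0 *Init*
"""

# Assumed layout: pool summary line, then rows of seven integers and a name.
POOL_RE = re.compile(r"Processor Pool Total:\s+(\d+)\s+Used:\s+(\d+)\s+Free:\s+(\d+)")
ROW_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def holding_share(text):
    """Return {process name: Holding / Processor Pool Used} (hypothetical helper)."""
    pool = POOL_RE.search(text)
    if pool is None:
        raise ValueError("no Processor Pool line found")
    used = int(pool.group(2))
    shares = {}
    for line in text.splitlines():
        row = ROW_RE.match(line)
        if row:
            # Group 5 is the "Holding" column, group 8 the process name.
            shares[row.group(8)] = int(row.group(5)) / used
    return shares


shares = holding_share(SAMPLE)
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} {share:6.1%}")
```

On the figures pasted above this puts BGP Router at roughly 64% of the used Processor Pool and IP RIB Update at roughly 18%, which is why I suspect the BGP table size.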
John P
  • It does look like you're running out of memory. Are you receiving the full BGP table? Please edit your question to include the output of "show ip bgp summary" – Ron Trunk Apr 29 '20 at 13:01
  • I have the same issue, but my 4331 freezes right after getting routes from the neighbor. Did you solve it somehow? – kab00m Feb 02 '21 at 23:46

0 Answers