Summit X450-48t suddenly reboots with Crit:Kern.Alert

  • Problem
  • Updated 3 years ago
  • Solved

Over the last few days I had a problem with a sudden reboot of one X450-48t switch.

Is this a firmware problem? The switch had been running for about 900 days without problems. We are using ExtremeXOS version 15.3.1.4 v1531b4-patch1-19.

Is an upgrade required?

Thanks for your help.

The Log

11/27/2015 13:17:37.61 <Info:pm.config.loaded> Loaded Policy: snmp number of entries 1
11/27/2015 13:17:37.61 <Info:pm.config.openingFile> Loading policy snmp from file /config/snmp.pol
11/27/2015 13:17:36.30 <Noti:EPM.system_stable> System is stable. Change to warm reset mode
11/27/2015 13:17:26.60 <Info:EPM.wdg_enable> Watchdog enabled
11/27/2015 13:17:12.05 <Info:SNMP.Master.InitDone> snmpMaster initialization complete
11/27/2015 13:17:11.95 <Info:DOSProt.Init> DOS protect application started successfully
11/27/2015 13:17:11.13 <Info:tftpd.info> **** tftpd started *****
11/27/2015 13:17:11.12 <Info:telnetd.info> **** telnetd started *****
11/27/2015 13:17:03.84 <Noti:DM.Notice> Node State[3] = OPERATIONAL
11/27/2015 13:17:00.34 <Noti:DM.Notice> Node State[2] = STANDBY
11/27/2015 13:17:00.34 <Info:DM.Info> Node INIT DONE ....
11/27/2015 13:16:59.05 <Noti:DM.Notice> Node State[1] = INIT
11/27/2015 13:16:58.66 <Info:HAL.Sys.Info> Hal initialization done.
11/27/2015 13:16:56.94 <Info:nl.init> Network Login framework has been initialized
11/27/2015 13:16:56.67 <Info:SNMP.Subagent.InitDone> snmpSubagent initialization complete
11/27/2015 13:16:49.75 <Info:HAL.Sys.Info> Starting hal initialization ....
11/27/2015 13:16:45.25 <Info:telnetd.info> telnetd listening on port 23

11/27/2015 13:16:37.75 <Noti:DM.Notice> DM started
11/27/2015 13:16:37.62 <Noti:NM.StrtProc> The Node Manager (NM) has started processing.
11/27/2015 13:16:36.90 <Noti:EPM.start> EPM Started
11/27/2015 13:16:35.51 <Noti:EPM.wd_warm_reset> Changing to watchdog warm reset mode
11/27/2015 13:15:10.23 <Warn:DM.Warning> Switch FAILED (1) Process Failure
11/27/2015 13:15:10.19 <Warn:EPM.all_shutdown> Shutting down all processes
11/27/2015 13:15:09.56 <Erro:DM.Error> Process snmpMaster Failed
11/27/2015 13:15:09.56 <Erro:DM.Error> Process snmpMaster Failed
11/27/2015 13:15:09.55 <Erro:DM.Error> Node State[4] = FAIL (Process Failure)
11/27/2015 13:15:09.55 <Warn:EPM.reboot> Rebooting with reason
11/27/2015 13:15:09.55 <Erro:EPM.crash_rate> Process snmpMaster exceeded pre-configured or default crash rate
11/27/2015 13:15:09.46 <Warn:cm.database> Configuration database locked
11/27/2015 13:15:09.45 <Erro:EPM.proc_conn_lost> Connection lost with process snmpMaster
11/27/2015 13:15:06.63 <Crit:Kern.Alert> 2ab79170  8f998490 lw     t9,-31600(gp)
11/27/2015 13:15:06.63 <Crit:Kern.Alert> 2ab7916c  8e04004c lw     a0,76(s0)
11/27/2015 13:15:06.63 <Crit:Kern.Alert> 2ab79168  8fbc0018 lw     gp,24(sp)
11/27/2015 13:15:06.63 <Crit:Kern.Alert> 2ab79164  8c440000 lw     a0,0(v0)
11/27/2015 13:15:06.63 <Crit:Kern.Alert> 2ab79160  0320f809 jalr   t9
11/27/2015 13:15:06.63 <Crit:Kern.Alert> 2ab7915c <8c450004>lw     a1,4(v0)
11/27/2015 13:15:06.63 <Crit:Kern.Alert> 2ab79158  8c620048 lw     v0,72(v1)
11/27/2015 13:15:06.63 <Crit:Kern.Alert> 2ab79154  8f99823c lw     t9,-32196(gp)
11/27/2015 13:15:06.63 <Crit:Kern.Alert> 2ab79150  8fa300e0 lw     v1,224(sp)
11/27/2015 13:15:06.63 <Crit:Kern.Alert> Code:
11/27/2015 13:15:06.63 <Crit:Kern.Alert>
11/27/2015 13:15:06.63 <Crit:Kern.Alert> Process snmpMaster pid 9557 died with signal 11
11/27/2015 13:14:57.51 <Erro:cm.sys.actionErr> Error while loading "snmpTargetAddrEntryCLI": Source IP address does not belong to the switch
11/27/2015 13:14:57.51 <Erro:cm.sys.actionErr> Error while loading "snmpTargetAddrEntryCLI": Source IP address does not belong to the switch
11/27/2015 13:14:55.40 <Warn:cm.database> Configuration database unlocked
11/27/2015 13:14:54.93 <Warn:cm.database> Configuration database locked
11/27/2015 13:14:54.92 <Erro:EPM.proc_conn_lost> Connection lost with process snmpMaster
11/27/2015 13:14:51.73 <Crit:Kern.Alert> 2ab79170  8f998490 lw     t9,-31600(gp)
11/27/2015 13:14:51.73 <Crit:Kern.Alert> 2ab7916c  8e04004c lw     a0,76(s0)
11/27/2015 13:14:51.73 <Crit:Kern.Alert> 2ab79168  8fbc0018 lw     gp,24(sp)
11/27/2015 13:14:51.73 <Crit:Kern.Alert> 2ab79164  8c440000 lw     a0,0(v0)
11/27/2015 13:14:51.73 <Crit:Kern.Alert> 2ab79160  0320f809 jalr   t9
11/27/2015 13:14:51.73 <Crit:Kern.Alert> 2ab7915c <8c450004>lw     a1,4(v0)
11/27/2015 13:14:51.73 <Crit:Kern.Alert> 2ab79158  8c620048 lw     v0,72(v1)
11/27/2015 13:14:51.73 <Crit:Kern.Alert> 2ab79154  8f99823c lw     t9,-32196(gp)
11/27/2015 13:14:51.73 <Crit:Kern.Alert> 2ab79150  8fa300e0 lw     v1,224(sp)
11/27/2015 13:14:51.73 <Crit:Kern.Alert> Code:
11/27/2015 13:14:51.73 <Crit:Kern.Alert>
11/27/2015 13:14:51.71 <Crit:Kern.Alert> Process snmpMaster pid 1655 died with signal 11


HarrySo

Posted 3 years ago


Sumit Tokle, Alum

The log clearly indicates that the snmpMaster process died. You may need to look for a known bug in that software version.
Open a TAC case; they will help you narrow down the issue.
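
To make that reading of the log easier to check, here is a minimal Python sketch that pulls the "died with signal" lines out of a pasted log and shows how far apart the crashes were. The function name `find_crashes` and the regular expression are my own illustration, not an Extreme tool, and the pattern only assumes the line format seen in the log above. Signal 11 is SIGSEGV (a segmentation fault) on Linux, which ExtremeXOS is based on.

```python
import re
from datetime import datetime

# Matches lines such as:
# 11/27/2015 13:15:06.63 <Crit:Kern.Alert> Process snmpMaster pid 9557 died with signal 11
# (pattern is an assumption based only on the log lines in this thread)
CRASH_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{2}) <\w+:[\w.]+> "
    r"Process (\S+) pid (\d+) died with signal (\d+)"
)

def find_crashes(log_text):
    """Return (timestamp, process, pid, signal) tuples, oldest first."""
    crashes = []
    for line in log_text.splitlines():
        m = CRASH_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S.%f")
            crashes.append((ts, m.group(2), int(m.group(3)), int(m.group(4))))
    return sorted(crashes)

# Three lines copied from the log in the original post (newest first, as logged).
sample = """\
11/27/2015 13:15:06.63 <Crit:Kern.Alert> Process snmpMaster pid 9557 died with signal 11
11/27/2015 13:14:57.51 <Erro:cm.sys.actionErr> Error while loading "snmpTargetAddrEntryCLI": Source IP address does not belong to the switch
11/27/2015 13:14:51.71 <Crit:Kern.Alert> Process snmpMaster pid 1655 died with signal 11
"""

crashes = find_crashes(sample)
for ts, proc, pid, sig in crashes:
    print(ts, proc, pid, sig)

# Seconds between the first and second snmpMaster crash.
gap = (crashes[1][0] - crashes[0][0]).total_seconds()
print(f"gap between crashes: {gap:.2f} s")
```

On the log above this finds two snmpMaster crashes about 15 seconds apart, which matches the later "exceeded pre-configured or default crash rate" message and the reboot that follows it.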

Rahmathullah, Syed Nishath, Employee

Hi HarrySo,
Yes, as Sumit said, please open a case with GTAC and attach the "show tech" output and the core dump, which should be available in internal memory with the extension ".gz".

Thanks, Syed

HarrySo

Thanks, guys! I opened a case with our distributor.

Patrick Voss, Alum

What version of code are you running? Process crashes are almost always software related. If you are running unsupported code, the recommendation is going to be to upgrade to a supported version.

HarrySo

The switch was running 15.3.1.4 patch 1-7 for 900 days; it now runs 15.3.1.4 patch 1-19.