XOS 12.6.1.3 stack crashes after 1555 days uptime

  • Problem
  • Updated 3 years ago
  • Solved
We have two Extreme stacks that have shown the same behavior at 1555 days uptime:

The slave switch breaks out of the stack and becomes a second master.
All ports on the slave switch go dark (they can be re-enabled by bouncing them directly from switch 2).

Has anyone else experienced this?  I just love uptime bugs.....

Matthew Tedder

Posted 3 years ago


Dorian Perry, Employee

Hi Matthew,

What type of switches are in the stack and what EXOS version are they running?

Matthew Tedder

X460-48t in stack 1 and X670V-48x in stack 2.  Both are on XOS 12.6.1.3.

Drew C., Community Manager

Edit - should've refreshed before posting.  Thanks Matthew
---
That's an odd one, Matthew.  I don't think we've seen it before.
12.6 is unsupported now but we can test and see if it's something that has been fixed in a later version.  Can you give me some more information about your gear?

The output of "show switch" and "show slot" should be enough for now.

Matthew Tedder

Sure, here you go - I'll paste both switches' outputs for the X670s so you can see the difference now:

Switch 1

Slot-1 stor-cwc-sw1.iad.1 # sh switch
SysName:          stor-cwc-sw1.iad
SysLocation:
SysContact:       support@extremenetworks.com, +1 888 257 3000
System MAC:       02:04:96:52:9B:13
System Type:      X670V-48x (Stack)

SysHealth check:  Enabled (Normal)
Recovery Mode:    All
System Watchdog:  Enabled

Current Time:     Fri Dec  4 12:58:37 2015
Timezone:         [Auto DST Enabled] GMT Offset: -360 minutes, name is CST.
                  DST of 60 minutes is currently not in effect, name is EDT.
                  DST begins every second Sunday March at 2:00
                  DST ends every first Sunday November at 2:00

Boot Time:        Thu Sep  1 05:18:03 2011
Boot Count:       6
Next Reboot:      None scheduled
System UpTime:    1555 days 8 hours 40 minutes 34 seconds

Slot:             Slot-1 *                     No Backup
                  ------------------------     ------------------------
Current State:    MASTER

Image Selected:   secondary
Image Booted:     secondary
Primary ver:      12.6.0.31
Secondary ver:    12.6.1.3

Config Selected:  primary.cfg
Config Booted:    primary.cfg

primary.cfg       Created by ExtremeXOS version 12.6.1.3
                  401039 bytes saved on Mon Aug 10 09:28:54 2015
Slot-1 stor-cwc-sw1.iad.2 # sh slot
Slots    Type                 Configured           State       Ports
--------------------------------------------------------------------
Slot-1   X670V-48x            X670V-48x            Operational   64
Slot-2   X670V-48x            X670V-48x            Failed        64
Slot-3                                             Empty          0
Slot-4                                             Empty          0
Slot-5                                             Empty          0
Slot-6                                             Empty          0
Slot-7                                             Empty          0
Slot-8                                             Empty          0

Slot-1 stor-cwc-sw1.iad.3 #


Switch 2:

* Slot-2 stor-cwc-sw1.iad.1 # sh switch
SysName:          stor-cwc-sw1.iad
SysLocation:
SysContact:       support@extremenetworks.com, +1 888 257 3000
System MAC:       02:04:96:52:9B:13
System Type:      X670V-48x (Stack)

SysHealth check:  Enabled (Normal)
Recovery Mode:    All
System Watchdog:  Enabled

Current Time:     Fri Dec  4 12:59:34 2015
Timezone:         [Auto DST Enabled] GMT Offset: -360 minutes, name is CST.
                  DST of 60 minutes is currently not in effect, name is EDT.
                  DST begins every second Sunday March at 2:00
                  DST ends every first Sunday November at 2:00

Boot Time:        Thu Sep  1 18:29:15 2011
Boot Count:       6
Next Reboot:      None scheduled
System UpTime:    1555 days 8 hours 40 minutes 51 seconds

Slot:             Slot-2 *                     No Backup
                  ------------------------     ------------------------
Current State:    MASTER

Image Selected:   secondary
Image Booted:     secondary
Primary ver:      12.6.0.31
Secondary ver:    12.6.1.3

Config Selected:  primary.cfg
Config Booted:    primary.cfg

primary.cfg       Created by ExtremeXOS version 12.6.1.3
                  401039 bytes saved on Mon Aug 10 09:28:54 2015
* Slot-2 stor-cwc-sw1.iad.2 # sh slot
Slots    Type                 Configured           State       Ports
--------------------------------------------------------------------
Slot-1   X670V-48x            X670V-48x            Failed        64
Slot-2   X670V-48x            X670V-48x            Operational   64
Slot-3                                             Empty          0
Slot-4                                             Empty          0
Slot-5                                             Empty          0
Slot-6                                             Empty          0
Slot-7                                             Empty          0
Slot-8                                             Empty          0
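
As a sanity check on the "show switch" output above, the reported uptime can be compared against the Boot Time and Current Time lines from Switch 1. This is only a rough Python sketch: the helper name parse_uptime is made up for illustration, the timestamps are copied from the paste, and the parsing assumes an English locale and the exact wording shown here.

```python
import re
from datetime import datetime, timedelta

# Timestamp layout used by the Boot Time / Current Time lines in the paste.
FMT = "%a %b %d %H:%M:%S %Y"

def parse_uptime(line):
    """Turn a 'System UpTime:' line into a timedelta (hypothetical helper)."""
    m = re.search(r"(\d+) days? (\d+) hours? (\d+) minutes? (\d+) seconds?", line)
    if m is None:
        raise ValueError("no uptime found in: %r" % line)
    days, hours, minutes, seconds = map(int, m.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)

# Values copied from the Switch 1 (X670) output.
boot = datetime.strptime("Thu Sep  1 05:18:03 2011", FMT)
now = datetime.strptime("Fri Dec  4 12:58:37 2015", FMT)
uptime = parse_uptime("System UpTime:    1555 days 8 hours 40 minutes 34 seconds")

wall_clock = now - boot          # naive difference, ignores DST
dst_shift = uptime - wall_clock  # what is left after the naive difference

print(wall_clock)  # 1555 days, 7:40:34
print(dst_shift)   # 1:00:00
```

The one-hour remainder is consistent with the Timezone lines: the boot in September falls inside the DST period, while the current time in December does not. So the uptime counter on Switch 1 agrees with its own boot timestamp; the sketch says nothing about what actually triggers the split.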

Matthew Tedder

Here are the X460s:

Switch 1:

Slot-1 aggr-cwc-sw1.1 # sh sw
SysName:          aggr-cwc-sw1
SysLocation:
SysContact:       support@extremenetworks.com, +1 888 257 3000
System MAC:       02:04:96:51:F9:C6
System Type:      X460-48t (Stack)

SysHealth check:  Enabled (Normal)
Recovery Mode:    All
System Watchdog:  Enabled

Current Time:     Sun Nov 29 15:58:05 2015
Timezone:         [Auto DST Enabled] GMT Offset: -360 minutes, name is CST.
                  DST of 60 minutes is currently not in effect, name is EDT.
                  DST begins every second Sunday March at 2:00
                  DST ends every first Sunday November at 2:00

Boot Time:        Mon Aug 22 21:52:09 2011
Boot Count:       5
Next Reboot:      None scheduled
System UpTime:    1559 days 19 hours 5 minutes 55 seconds

Slot:             Slot-1 *                     No Backup
                  ------------------------     ------------------------
Current State:    MASTER

Image Selected:   secondary
Image Booted:     secondary
Primary ver:      12.5.0.14
Secondary ver:    12.6.1.3

Config Selected:  primary.cfg
Config Booted:    primary.cfg

primary.cfg       Created by ExtremeXOS version 12.6.1.3
                  582582 bytes saved on Sun Nov 29 14:27:51 2015
Slot-1 aggr-cwc-sw1.2 # sh sl
Slots    Type                 Configured           State       Ports
--------------------------------------------------------------------
Slot-1   X460-48t             X460-48t             Operational   54
Slot-2   X460-48t             X460-48t             Failed        54
Slot-3                                             Empty          0
Slot-4                                             Empty          0
Slot-5                                             Empty          0
Slot-6                                             Empty          0
Slot-7                                             Empty          0
Slot-8                                             Empty          0

Switch 2:

* Slot-2 aggr-cwc-sw1.1 # sh sw
SysName:          aggr-cwc-sw1
SysLocation:
SysContact:       support@extremenetworks.com, +1 888 257 3000
System MAC:       02:04:96:51:F9:C6
System Type:      X460-48t (Stack)

SysHealth check:  Enabled (Normal)
Recovery Mode:    All
System Watchdog:  Enabled

Current Time:     Sun Nov 29 15:54:34 2015
Timezone:         [Auto DST Enabled] GMT Offset: -360 minutes, name is CST.
                  DST of 60 minutes is currently not in effect, name is EDT.
                  DST begins every second Sunday March at 2:00
                  DST ends every first Sunday November at 2:00

Boot Time:        Tue Aug 23 16:05:18 2011
Boot Count:       5
Next Reboot:      None scheduled
System UpTime:    1559 days 19 hours 2 minutes 18 seconds

Slot:             Slot-2 *                     No Backup
                  ------------------------     ------------------------
Current State:    MASTER

Image Selected:   secondary
Image Booted:     secondary
Primary ver:      12.5.0.14
Secondary ver:    12.6.1.3

Config Selected:  primary.cfg
Config Booted:    primary.cfg

primary.cfg       Created by ExtremeXOS version 12.6.1.3
                  580097 bytes saved on Fri Nov 20 03:14:46 2015
* Slot-2 aggr-cwc-sw1.2 # sh sl
Slots    Type                 Configured           State       Ports
--------------------------------------------------------------------
Slot-1   X460-48t             X460-48t             Failed        54
Slot-2   X460-48t             X460-48t             Operational   54
Slot-3                                             Empty          0
Slot-4                                             Empty          0
Slot-5                                             Empty          0
Slot-6                                             Empty          0
Slot-7                                             Empty          0
Slot-8                                             Empty          0
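
The symptom in these outputs (each switch reports itself Operational and its partner Failed) can be checked mechanically when the "show slot" output from both nodes has been saved. The Python below is a minimal sketch under that assumption: the function names are hypothetical, the sample text is trimmed from the X460 paste, and only the states seen in this thread (Operational, Failed, Empty) have been considered.

```python
def parse_show_slot(text):
    """Map slot name -> state for each table row of saved `show slot` output."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        # Table rows start with "Slot-N" and end with a port count; this
        # skips the CLI prompt, the column header, and the dashed rule.
        if len(fields) >= 3 and fields[0].startswith("Slot-") and fields[-1].isdigit():
            states[fields[0]] = fields[-2]
    return states

def slots_in_state(states, wanted):
    """Return the set of slot names whose state equals `wanted`."""
    return {slot for slot, state in states.items() if state == wanted}

def looks_like_split_stack(view_a, view_b):
    """True when each node sees only itself as Operational and the other as Failed."""
    a, b = parse_show_slot(view_a), parse_show_slot(view_b)
    up_a = slots_in_state(a, "Operational")
    up_b = slots_in_state(b, "Operational")
    return (bool(up_a) and bool(up_b)
            and up_a == slots_in_state(b, "Failed")
            and up_b == slots_in_state(a, "Failed"))

# Trimmed copies of the X460 output as seen from each switch.
VIEW_FROM_SLOT1 = """\
Slot-1 aggr-cwc-sw1.2 # sh sl
Slots    Type                 Configured           State       Ports
--------------------------------------------------------------------
Slot-1   X460-48t             X460-48t             Operational   54
Slot-2   X460-48t             X460-48t             Failed        54
Slot-3                                             Empty          0
"""

VIEW_FROM_SLOT2 = """\
* Slot-2 aggr-cwc-sw1.2 # sh sl
Slots    Type                 Configured           State       Ports
--------------------------------------------------------------------
Slot-1   X460-48t             X460-48t             Failed        54
Slot-2   X460-48t             X460-48t             Operational   54
Slot-3                                             Empty          0
"""

print(looks_like_split_stack(VIEW_FROM_SLOT1, VIEW_FROM_SLOT2))  # True
```

Splitting on whitespace rather than fixed columns keeps the parser tolerant of the blank Type and Configured fields on empty slots, at the cost of not handling any state name that contains a space.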

Drew C., Community Manager

Thanks - now I need to see how I can fake uptime in EXOS so we don't have to wait 4 years :)

Meanwhile, I would suggest a reboot of the stack to bring things back in proper order.  Go ahead and grab the output of "show tech all" from both stacks to have on hand.  I would also recommend opening a case with GTAC.  Just know that one of the first things they're going to ask you to do is upgrade your stacks to a supported version.

Once a case is opened, GTAC can help track this to resolution.

Matthew Tedder

Thanks Drew - I already have a TAC case open but nothing conclusive has come of it.  I turned over the sh tech output to them yesterday and they've sorted through it.

I'm not able to reboot the stack, but will work on a window to reboot each of the slave switches and let you know what happens.  That was also support's suggestion.

Drew C., Community Manager

Excellent. I found your case in the system and will see if I can or need to work with the owner to help replicate this.

Matthew Tedder

Hey, I appreciate it!  This is my first time posting anything to the hub and only my 2nd ticket in years with Extreme, so dunno if I went about it backwards, but I do appreciate the assistance.

Drew C., Community Manager

No worries.  It's okay that you've gone through two of our support channels.  Since there is a case opened, most of your updates are going to come from the case owner.
If this is, in fact, a software issue (and it seems that it is) it's a rare one.  Most systems get at least one reboot for some reason or other in 4 years - so I'm impressed by that!

Matthew Tedder

Ha, yeah, understood - these switches are absolutely critical to us but even I was surprised by that.  We'll take this as an opportunity to update them once we review what the newest rock-solid version is.