Do I have a loop in this stack?


I am the brand-new owner of a pair of X670V-48t switches and a complete newb to Extreme switches and XOS, but I have over 15 years of experience with Cisco Catalyst switches and am well versed in those.

My X670V-48t switches are each equipped with a VIM4-40G4X module and are connected together as a stack with a pair of QSFP+ cables.

I had Extreme's tech support assist me with configuring the stack and was advised to connect the VIM4 ports on the back of each unit as Port 3 on Switch A to port 4 of Switch B and Port 4 of Switch A to Port 3 of Switch B.

Tech Support remoted into my PC and configured the switch stacking configs for me and the stack seems to be functioning, but part of the behavior of the switches troubles me.

All the blue LEDs on the connected ports of the VIM4 modules themselves and the blue "ST" LEDs on the front panels, and also all the link LEDs of any Ethernet port connected to a computer all blink rapid-fire, machine-gun-like *in unison* at a very high rate of speed even when there is no real network traffic. In the Cisco world, where I'm much more familiar, this unison-LED-blinking generally indicates very bad juju... that a loop exists or that some kind of broadcast or multicast packet storm is happening. Even with no computers connected to the switches, or any connections to the rest of my network (just the two switches connected to each other via the QSFP+ cables and isolated from the rest of the world) the blue LEDs on the back and front of the units still blink rapid-fire in unison.

Is this normal behavior for a stacked pair of switches in the Extreme switch world?

Here are my stacking configs:

Slot-1 SANBONE-10G.1 # show stacking
Stack Topology is a Ring
Active Topology is a Ring
Node MAC Address    Slot  Stack State  Role     Flags
------------------  ----  -----------  -------  ---
*00:04:96:97:de:07  1     Active       Master   CA-
 00:04:96:97:de:0b  2     Active       Backup   CA-
* - Indicates this node
Flags: (C) Candidate for this active topology, (A) Active Node
(O) node may be in Other active topology
Slot-1 SANBONE-10G.2 # show stacking configuration
Stack MAC in use: 02:04:96:97:de:07
Node               Slot         Alternate          Alternate
MAC Address        Cfg Cur Prio Mgmt IP / Mask     Gateway         Flags     Lic
------------------ --- --- ---- ------------------ --------------- --------- ---
*00:04:96:97:de:07 1   1   100                                     CcEeMm-Nn --
 00:04:96:97:de:0b 2   2   90                                      CcEeMm-Nn --
* - Indicates this node
Flags: (C) master-Capable in use, (c) master-capable is configured,
(E) Stacking is currently Enabled, (e) Stacking is configured Enabled,
(M) Stack MAC in use, (m) Stack MACs configured and in use are the same,
(i) Stack MACs configured and in use are not the same or unknown,
(N) Enhanced protocol is in use, (n) Enhanced protocol is configured,
(-) Not in use or not configured
License level restrictions: (C) Core, (A) Advanced edge, or (E) Edge in use,
(c) Core, (a) Advanced edge, or (e) Edge configured,
(-) Not in use or not configured
Slot-1 SANBONE-10G.3 # show stacking-support

Stack    Available Ports
Port     Native  Alternate   Configured   Current
-----    -----------------   ----------   ----------
1        Yes     47          Native       N/A
2        Yes     48          Native       N/A
stacking-support:            Enabled      N/A

Flags: * - Current stack port selection

Slot-1 SANBONE-10G.4 # show stacking stack-ports
Stack Topology is a Ring
Slot Port Select Node MAC Address  Port State  Flags Speed
---- ---- ------ ----------------- ----------- ----- -----
*1   1    Native 00:04:96:97:de:07 Operational CB    40G
*1   2    Native 00:04:96:97:de:07 Operational C-    40G
 2   1    Native 00:04:96:97:de:0b Operational C-    40G
 2   2    Native 00:04:96:97:de:0b Operational CB    40G
* - Indicates this node
Flags: (C) Control path is active, (B) Port is Blocked
Slot-1 SANBONE-10G.5 #


Hello Neal

From what you posted, everything looks correct. Both nodes are seen in the stack; one is master and the other is backup, as I would expect. It also shows a ring topology, which means all of the stacking ports are in use.

As for the LEDs, it is possible that there is a traffic storm of some sort, perhaps multicast. If you run "show port util" and then hit the space bar a few times, it should show the utilization as a percentage of bandwidth. It should also show the max value.

What are those values?

P
All ports shown under the "show port util" command show 0 pkts/sec on ports in the "Ready" link state, and single digits on ports with "Active" state.

Is there some way of looking at the utilization of the physical ports of the VIM4 modules which make up the actual stack linkages?
Hi,

Can you check whether "show ports stack-ports statistics" helps? I'm not sure it will work on non-native SummitStack ports.

If so, I'm not aware of a MIB for that, so monitoring needs to go through a script.

Regards,
Stephane
Here is the output of both "show ports stack-ports 1:1,1:2,2:1,2:2 utilization" and "... statistics":

Link Utilization Averages                         Tue Oct 14 09:18:12 2014
Port   Link    Rx         Peak Rx    Tx         Peak Tx
       State   pkts/sec   pkts/sec   pkts/sec   pkts/sec
================================================================================
1:1    A       0          1          25         39
1:2    A       17         36         1          2
2:1    A       1          2          16         33
2:2    A       24         35         0          1

Port Statistics                                   Tue Oct 14 09:20:48 2014
Port   Link    Tx Pkt      Tx Byte     Rx Pkt      Rx Byte     Rx Pkt   Rx Pkt
       State   Count       Count       Count       Count       Bcast    Mcast
================================================================================
1:1    A       10328976    1269304279  506419      47705262    11       8277
1:2    A       1063383     108022632   9551832     1901909882  284346   16553
2:1    A       9551876     1901915496  1063383     108022650   301135   264301
2:2    A       506419      47705262    10329021    1269309889  13       24844
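
One sanity check that can be run on these counters, sketched here in Python: if a stack port's Tx packet count closely matches the Rx packet count of a port on the other node, the two are probably opposite ends of the same cable. The values are copied from the statistics output above; the 0.1% tolerance and the helper name likely_peer are assumptions made for illustration, since the counters are not all sampled at exactly the same instant.

```python
# (tx_pkts, rx_pkts) per stack port, copied from the statistics output above.
counters = {
    "1:1": (10328976, 506419),
    "1:2": (1063383, 9551832),
    "2:1": (9551876, 1063383),
    "2:2": (506419, 10329021),
}

def likely_peer(port, tolerance=0.001):
    """Return the port on another node whose Rx count matches this port's Tx count.

    'tolerance' is an assumed relative margin that absorbs packets counted
    between the moments the individual counters were read.
    """
    tx, _ = counters[port]
    for other, (_, rx) in counters.items():
        if other.split(":")[0] == port.split(":")[0]:
            continue  # skip ports on the same node
        if abs(tx - rx) <= tolerance * max(tx, rx):
            return other
    return None

peers = {port: likely_peer(port) for port in counters}
for port, peer in peers.items():
    print(f"{port} Tx matches {peer} Rx")
```

With these numbers, 1:1 pairs with 2:2 and 1:2 pairs with 2:1, and the counts in each direction agree to within a few dozen packets.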
Hey Neal

It doesn't appear that there is a loop. I would expect the pps to be much higher with a loop on a 10G or 40G link. If you are still concerned, I would recommend reopening the TAC case and having them look deeper into the stats.
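
To put the numbers in perspective, here is a rough Python sketch of the arithmetic. It assumes minimum-size 64-byte Ethernet frames plus the standard 20 bytes of preamble and inter-frame gap on a 40G link; the observed peak of 39 pkts/sec is the highest value in the utilization output above. The function name is illustrative, and the stacking links may use their own framing, so treat this as an order-of-magnitude comparison only.

```python
MIN_FRAME_BYTES = 64   # minimum Ethernet frame, including FCS
OVERHEAD_BYTES = 20    # 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap

def line_rate_pps(link_bps: float) -> float:
    """Maximum frames per second on a link carrying only minimum-size frames."""
    bits_per_frame = (MIN_FRAME_BYTES + OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame

observed_peak_pps = 39                 # port 1:1 Peak Tx from the output above
max_pps_40g = line_rate_pps(40e9)      # assumed 40 Gbit/s stacking link
fraction = observed_peak_pps / max_pps_40g

print(f"40G line rate with 64-byte frames: {max_pps_40g:,.0f} pps")
print(f"Observed peak as a fraction of line rate: {fraction:.2e}")
```

A real storm need not reach line rate, but it would normally sit many orders of magnitude above a few dozen packets per second.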

Hope that helps
P
Is it normal for these switches, when in a stack configuration, to have the link LEDs of all ports with something plugged in blinking rapidly (an estimated 20+ times per second), all together in unison, with little to zero traffic flowing thru the switch, even with all cables except the stack cables unplugged?

In every other brand of switch I've dealt with, this behavior generally denotes a serious problem condition exists.
Hello Neal

No, the lights will blink based on activity.

I do agree that something looks incorrect; however, it is hard to diagnose in this manner. I recommend reopening a case with GTAC. Based on the Summit HW installation guide, in order to use all four of the 40G ports you must create sharing ports. See below:

"For SummitStack V-320 Stacking using the VIM3-40G4X and VIM4-40G4X modules, connections between the stacking ports must be done using paired bundles of physical ports. V320 stacking will not function unless the physical ports on the modules are paired to form stacking ports. The following table lists the port pairings."

Module        Paired physical ports   Stacking port
VIM3-40G4X    S1 and S3               S1
VIM3-40G4X    S2 and S4               S2
VIM4-40G4X    S1 and S3               S1
VIM4-40G4X    S2 and S4               S2

"The following table lists the recommended order for connecting the stacking ports in the example stack shown in Example SummitStack-V320 Configuration."

Can you run "show sharing" to see if ports S1 and S3 are configured as a share port?

Thanks
P
They are not configured as share ports. This is not a V320 stack; there are only two stacking cables, not four.

The cables are connected as follows:

Switch1 (Master)                  Switch2 (Backup)
================                  ================
S3           connected to         S4
S4           connected to         S3

Furthermore, when I log into the ScreenPlay web GUI, the stack details for all 4 stack ports across both nodes say the same thing:

"Operational, is Blocked, and its Ctrl path is active"

How can the stack work if all 4 stacking ports connecting the two switches are in a state of "is Blocked"?
Hello Neal

Thanks for clarifying; I had thought this was a V320 stack.

Based on the information you provided earlier, only one path is blocked:

Slot-1 SANBONE-10G.4 # show stacking stack-ports
Stack Topology is a Ring
Slot Port Select Node MAC Address  Port State  Flags Speed
---- ---- ------ ----------------- ----------- ----- -----
*1   1    Native 00:04:96:97:de:07 Operational CB    40G
*1   2    Native 00:04:96:97:de:07 Operational C-    40G
 2   1    Native 00:04:96:97:de:0b Operational C-    40G
 2   2    Native 00:04:96:97:de:0b Operational CB    40G
* - Indicates this node
Flags: (C) Control path is active, (B) Port is Blocked
Slot-1 SANBONE-10G.5 #

That matches your S3-to-S4 connection. When a stacking port is blocked, it is blocked only for multicast and broadcast traffic.
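
The Flags column can also be read mechanically. The following is a minimal Python sketch, assuming the output format pasted above (seven whitespace-separated fields per data row, with an optional leading asterisk on the slot). The function name parse_stack_ports is hypothetical; this is not an Extreme-provided tool.

```python
SAMPLE = """\
Slot Port Select Node MAC Address Port State Flags Speed
---- ---- ------ ----------------- ----------- ----- -----
*1 1 Native 00:04:96:97:de:07 Operational CB 40G
*1 2 Native 00:04:96:97:de:07 Operational C- 40G
2 1 Native 00:04:96:97:de:0b Operational C- 40G
2 2 Native 00:04:96:97:de:0b Operational CB 40G
"""

def parse_stack_ports(text):
    """Parse the data rows of 'show stacking stack-ports' output into dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 7 fields and start with a slot number,
        # optionally prefixed by '*' for the local node.
        if len(fields) != 7 or not fields[0].lstrip("*").isdigit():
            continue
        slot, port, _select, mac, state, flags, speed = fields
        rows.append({
            "slot": int(slot.lstrip("*")),
            "port": int(port),
            "mac": mac,
            "state": state,
            "flags": flags,
            "speed": speed,
        })
    return rows

rows = parse_stack_ports(SAMPLE)
blocked = [(r["slot"], r["port"]) for r in rows if "B" in r["flags"]]
print("Blocked stack ports:", blocked)
```

For the output above, this reports stack ports 1:1 and 2:2 as blocked and leaves 1:2 and 2:1 unblocked.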

From everything I see, this is working correctly: the stack is up, the two switches are seen as separate nodes, and the topology is a ring.

GTAC should be able to determine whether there is a loop based on other aspects of the switch, such as CPU and process activity, as well as by looking at the "show tech" output.

Thanks
P
Could not get a satisfactory solution or explanation from Extreme support for these switches giving a solid continuous visual indication of suffering a loop or packet storm condition on all connected ports when configured as a 2-switch stack thru the VIM4 modules. Even though they appeared to be passing data normally, it was completely unacceptable to have all the LEDs blinking this "universal warning sign" constantly all the time, and being completely unable to discern any actual traffic patterns... or real problem conditions... from the switches' onboard visual instrumentation.

After running out of time tinkering and experimenting with them, I disconnected the QSFP+ cables, returned the switches to factory stock configs and then reconfigured them as two standalone switches and now have normal LED visual indications that follow along with the relative traffic on each port as one should expect.

It would've been nice to manage these switches as a single stack, but with only two of them, it's not that big of a deal to manage them independently. I do not really need high bandwidth intercommunication between the two, as they are to form a redundant, twin SAN backbone where everything connected to each switch only communicates to other devices plugged into that very same switch. If I ever need some high-bandwidth connectivity between these two switches, I could always create a lacp LAG between them with the VIM4 ports.
Hey Neal, I know that GTAC has not finished this testing, and I have asked them to call you and help verify either the config or the operation.

Sorry this has been so frustrating. With that said, if this is an area where you need redundancy, there is another option which provides more protection. The feature is called MLAG, and it allows two independent devices to form a load-sharing group to another device. Typical uses are servers with two connections, one into each switch, or other switches with redundant connections into a redundant core. MLAG provides not only link and switch redundancy; it also lets you upgrade the switches without taking both of them down.

P
Thanks Paul,

I was looking at the MLAG feature in the Concepts Guide earlier and it's definitely interesting. Maybe someday when I've got more time to play with these switches I can look further into it. I'm just under a time crunch right now to get the 10GbE L2 iSCSI network ready for our EMC engineer, who's coming onsite probably sometime this week to stand up our new VNXe3200 system, so I can start moving LUNs off our old Celerra SAN, which is filled to capacity.

I've just set up a pair of LACP LAGs to our existing Cisco 3750 1GbE SAN backbone stack (EtherChannel LACP on that end), one to each Summit switch, and that was effortless and is working perfectly at 2Gbps full duplex per LAG. I've also already moved a couple of my lesser iSCSI clients off the Ciscos and over to the Summits, and they're working as expected as well. I have to wait until Friday evening after hours to move the rest of my current iSCSI clients and the Celerra server itself off the Ciscos and onto the Summits; then I can disconnect the Ciscos and re-purpose them to another network where we need them.
Thanks Neal. Let us know if we can help. We can try and look at some loaner gear for a few weeks if that will help you to get familiar with MLAG.

P
Hi Neal,
Since you already have a case open, you likely already know that the rapid LED blinking issue is expected to be resolved under CR xos0059128. I'm not sure of the timeline, but it looks like it will be included in a patch soon, as well as future releases.

-Drew
