Strange behaviour while adding port to sharing group

  • Problem
  • Updated 3 years ago
  • In Progress
Hello.
I have a ring of 4 X670-48x switches running EXOS "15.5.3.4 patch1-6". Each switch is connected to its neighbour with 4x10gig sharing group links, and also to 4 Cisco Cat6509 with similar 4x10gig sharing group links, like this: http://i.imgur.com/wC0vLYd.jpg
So last night I tried to add an empty, enabled port to one sharing group between Cisco and Extreme.
This is the existing sharing group:
enable sharing 17 grouping 17-20 algorithm address-based L3_L4 lacp
And this was the command
conf sharing 17 add ports 21
and the whole ring crashed. I lost access to the entire network and was saved only by rebooting that switch.
What did I do wrong?

Valur Hjartarson

Posted 3 years ago

Husam Abu Hasan

I had a similar experience with a sharing group between Extreme and Juniper, but in my case the ring crashed after I plugged in the new port. That was a Juniper mode-negotiation problem, not related to Extreme. I think your case is related to the XOS image. Try upgrading it to a recent one.
Prashanth KG, Employee

Hi Valur,

Before we jump to any conclusions about what could have happened, please clarify the following:
- On the ports connecting to the CISCO, are you using STP to prevent the loop? If so, are those ports untagged or tagged in the STP VLANs?
- When the problem happened, you had mentioned that the ring crashed. Can you elaborate on that? Did you experience a situation where the secondary port was open and the loop occurred within the ring?

In my opinion, EAPS should not break for any changes that are made outside the ring ports. 
Awaiting your response! 
Brandon Clay, Escalation Support Engineer

Hi Valur,

One more question in addition to the ones Prashanth asked. Was the switch where the LAG changes were made the EAPS master? Or was it just a transit node?

-Brandon
Valur Hjartarson

- On the ports connecting to the CISCO, are you using STP to prevent the loop? If so, are those ports untagged or tagged in the STP VLANs?
Yes, we are using rapid-pvst and all ports are tagged.
- When the problem happened, you had mentioned that the ring crashed. Can you elaborate on that? Did you experience a situation where the secondary port was open and the loop occurred within the ring?
1. We had a problem with one physical link in the LAG between the top-left switch and the Cisco that serves as our main border router.
2. I rerouted traffic to our backup router (top-right) and shut down the LAG between the top-left and bottom-left switches.
3. When I entered "conf sharing 17 add ports 21" on the top-left switch, all LAGs on the top-right switch removed and then re-added all of their ports:
09/02/2015 02:59:04.46 <Info:LACP.AddPortToAggr> Add port 7 to aggregator
09/02/2015 02:59:04.46 <Info:LACP.AddPortToAggr> Add port 5 to aggregator
09/02/2015 02:59:04.46 <Info:LACP.AddPortToAggr> Add port 3 to aggregator
09/02/2015 02:59:04.46 <Info:LACP.AddPortToAggr> Add port 1 to aggregator
09/02/2015 02:59:02.14 <Info:vlan.msgs.portLinkStateUp> Port 7 link UP at speed 10 Gbps and full-duplex
09/02/2015 02:59:02.12 <Info:vlan.msgs.portLinkStateUp> Port 5 link UP at speed 10 Gbps and full-duplex
09/02/2015 02:59:02.11 <Info:vlan.msgs.portLinkStateUp> Port 3 link UP at speed 10 Gbps and full-duplex
09/02/2015 02:59:02.10 <Info:vlan.msgs.portLinkStateUp> Port 1 link UP at speed 10 Gbps and full-duplex
09/02/2015 02:56:53.27 <Info:LACP.AddPortToAggr> Add port 6 to aggregator
09/02/2015 02:56:53.27 <Info:LACP.AddPortToAggr> Add port 2 to aggregator
09/02/2015 02:56:52.69 <Info:LACP.AddPortToAggr> Add port 8 to aggregator
09/02/2015 02:56:42.67 <Info:vlan.msgs.portLinkStateDown> Port 1 link down - Local fault
09/02/2015 02:56:42.17 <Info:vlan.msgs.portLinkStateDown> Port 7 link down - Local fault
09/02/2015 02:56:41.62 <Info:LACP.RemPortFromAggr> Remove port 5 from aggregator
09/02/2015 02:56:41.62 <Info:vlan.dbg.info> Port 5 is Down, remove from aggregator 1
09/02/2015 02:56:41.62 <Info:vlan.msgs.portLinkStateDown> Port 5 link down - Local fault
09/02/2015 02:56:41.30 <Info:vlan.msgs.portLinkStateDown> Port 3 link down - Local fault
09/02/2015 02:56:31.74 <Info:LACP.AddPortToAggr> Add port 18 to aggregator
09/02/2015 02:56:31.25 <Info:LACP.AddPortToAggr> Add port 19 to aggregator
09/02/2015 02:56:24.14 <Info:LACP.AddPortToAggr> Add port 17 to aggregator
09/02/2015 02:56:22.40 <Info:LACP.RemPortFromAggr> Remove port 18 from aggregator
09/02/2015 02:56:21.72 <Info:LACP.RemPortFromAggr> Remove port 19 from aggregator
09/02/2015 02:56:15.95 <Info:LACP.RemPortFromAggr> Remove port 17 from aggregator
09/02/2015 02:55:41.12 <Info:LACP.AddPortToAggr> Add port 20 to aggregator
09/02/2015 02:55:15.54 <Info:LACP.RemPortFromAggr> Remove port 20 from aggregator
09/02/2015 02:55:03.77 <Info:LACP.AddPortToAggr> Add port 5 to aggregator
09/02/2015 02:55:00.65 <Info:LACP.RemPortFromAggr> Remove port 5 from aggregator
09/02/2015 02:54:52.16 <Info:LACP.RemPortFromAggr> Remove port 8 from aggregator
09/02/2015 02:54:51.96 <Info:LACP.RemPortFromAggr> Remove port 6 from aggregator
09/02/2015 02:54:51.89 <Info:LACP.RemPortFromAggr> Remove port 2 from aggregator
09/02/2015 02:54:51.17 <Info:LACP.RemPortFromAggr> Remove port 3 from aggregator
09/02/2015 02:53:58.45 <Info:LACP.AddPortToAggr> Add port 19 to aggregator
09/02/2015 02:53:52.97 <Info:LACP.RemPortFromAggr> Remove port 19 from aggregator
09/02/2015 02:53:27.71 <Info:LACP.AddPortToAggr> Add port 5 to aggregator
09/02/2015 02:53:25.99 <Info:LACP.RemPortFromAggr> Remove port 5 from aggregator
09/02/2015 02:53:25.67 <Info:LACP.RemPortFromAggr> Remove port 7 from aggregator
09/02/2015 02:53:19.04 <Info:LACP.AddPortToAggr> Add port 6 to aggregator
09/02/2015 02:53:18.37 <Info:LACP.AddPortToAggr> Add port 18 to aggregator
09/02/2015 02:53:18.35 <Info:LACP.AddPortToAggr> Add port 2 to aggregator
09/02/2015 02:53:16.06 <Info:LACP.RemPortFromAggr> Remove port 18 from aggregator
09/02/2015 02:52:13.93 <Info:LACP.AddPortToAggr> Add port 17 to aggregator
09/02/2015 02:51:59.94 <Info:LACP.AddPortToAggr> Add port 19 to aggregator
09/02/2015 02:51:52.95 <Info:LACP.RemPortFromAggr> Remove port 19 from aggregator
09/02/2015 02:51:51.83 <Info:LACP.RemPortFromAggr> Remove port 17 from aggregator
09/02/2015 02:51:51.76 <Info:LACP.RemPortFromAggr> Remove port 1 from aggregator
09/02/2015 02:51:51.76 <Info:LACP.RemPortFromAggr> Remove port 6 from aggregator
09/02/2015 02:51:50.37 <Info:LACP.RemPortFromAggr> Remove port 2 from aggregator

and that totally broke our internet connectivity.
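
For anyone trying to read a log like the one above, a small script can count how often each port was removed from and added back to its aggregator. This is only a rough sketch: the sample entries are copied from the log in this post, while the function name `summarize_lacp_events` and the regular expression are illustrative assumptions, not part of any Extreme tooling. It flattens whitespace first, so entries that wrapped onto a second line are still matched.

```python
import re
from collections import Counter

# A few entries copied from the log in this post (newest first, as posted).
# Some entries are wrapped across two lines, as they were in the paste.
SAMPLE_LOG = """\
09/02/2015 02:56:31.74 <Info:LACP.AddPortToAggr> Add port 18 to aggregator
09/02/2015 02:56:31.25 <Info:LACP.AddPortToAggr> Add port 19 to aggregator
09/02/2015 02:56:24.14 <Info:LACP.AddPortToAggr> Add port 17 to aggregator
09/02/2015 02:56:22.40 <Info:LACP.RemPortFromAggr> Remove port 18 from
aggregator
09/02/2015 02:56:21.72 <Info:LACP.RemPortFromAggr> Remove port 19 from
aggregator
09/02/2015 02:56:15.95 <Info:LACP.RemPortFromAggr> Remove port 17 from
aggregator
"""

# Hypothetical pattern for the two LACP event types seen in the log.
EVENT_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{2}) <Info:LACP\.(AddPortToAggr|RemPortFromAggr)> "
    r"(?:Add|Remove) port (\d+)"
)


def summarize_lacp_events(log_text):
    """Return (added, removed) Counters keyed by port number."""
    # Collapse all whitespace so wrapped entries become a single line.
    flat = " ".join(log_text.split())
    added, removed = Counter(), Counter()
    for _time, event, port in EVENT_RE.findall(flat):
        target = added if event == "AddPortToAggr" else removed
        target[int(port)] += 1
    return added, removed


added, removed = summarize_lacp_events(SAMPLE_LOG)
for port in sorted(set(added) | set(removed)):
    print(f"port {port}: removed {removed[port]}x, added {added[port]}x")
```

Running the same function over the full log would show which LAG members flapped more than once, which may help narrow down where the disturbance started.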

One more question in addition to the ones Prashanth asked. Was the switch where the LAG changes were made the EAPS master? Or was it just a transit node?
We have a pretty basic configuration: just LAGs and STP. So EAPS is disabled on all switches.